Improving performance: boxing in .NET that can be avoided

We are developing a C# server in our project. The server must withstand very high loads, so we try to write the code as well as we can. C# is rarely associated with high performance, but with a careful approach to development you can reach a very good level.

One of the expensive operations in terms of performance is boxing and unboxing. As a reminder, boxing wraps a value-type value in a heap-allocated object, and unboxing extracts it back. Recently I decided to look through all the IL code of our projects and search for box and unbox instructions. There were many places where boxing can be avoided with a flick of the wrist. All the cases that lead to unnecessary boxing are obvious; they slip in through inattention, when you are concentrating on functionality rather than on optimization. I decided to write down the most common cases so as not to forget them, and later to automate fixing them. This article lists those cases.

A remark up front: most often performance problems sit at a higher level, and before you start removing every extra boxing, you need to bring the code to a state where doing so makes sense. It is worth thinking seriously about things like boxing only if you really want to squeeze the most out of C#.

May the Russian language forgive me, but from here on I will use the word "boxing", foreign to it, so that the eye does not snag on an unfamiliar term while scanning for a line of code.
Let's get started

1. Passing value-type variables to String.Format, String.Concat, etc.

String operations take first place by the number of boxing instructions. Fortunately, in our code this mostly happened when formatting exception messages. The main rule for avoiding boxing here is to call ToString() on a value-type variable before passing it to String.Format or concatenating it with a string.

The same in code. Instead of:

    var id = Guid.NewGuid();
    var str1 = String.Format("Id {0}", id);
    var str2 = "Id " + id;

    IL_0000: call valuetype [mscorlib]System.Guid [mscorlib]System.Guid::NewGuid()
    IL_0005: stloc.0
    IL_0006: ldstr "Id {0}"
    IL_000b: ldloc.0
    IL_000c: box [mscorlib]System.Guid
    IL_0011: call string [mscorlib]System.String::Format(string, object)
    IL_0016: pop
    IL_0017: ldstr "Id "
    IL_001c: ldloc.0
    IL_001d: box [mscorlib]System.Guid
    IL_0022: call string [mscorlib]System.String::Concat(object, object)


you should write:

    var id = Guid.NewGuid();
    var str1 = String.Format("Id {0}", id.ToString());
    var str2 = "Id " + id.ToString();

    IL_0000: call valuetype [mscorlib]System.Guid [mscorlib]System.Guid::NewGuid()
    IL_0005: stloc.0
    IL_0006: ldstr "Id {0}"
    IL_000b: ldloca.s id
    IL_000d: constrained. [mscorlib]System.Guid
    IL_0013: callvirt instance string [mscorlib]System.Object::ToString()
    IL_0018: call string [mscorlib]System.String::Format(string, object)
    IL_001d: pop
    IL_001e: ldstr "Id "
    IL_0023: ldloca.s id
    IL_0025: constrained. [mscorlib]System.Guid
    IL_002b: callvirt instance string [mscorlib]System.Object::ToString()
    IL_0030: call string [mscorlib]System.String::Concat(string, string)


As we can see, the constrained. prefix appears instead of box. It means that the callvirt that follows is made directly on the variable, provided that thisType is a value type and that this type implements the method itself. If the type does not implement the method, boxing still happens.
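The difference can be seen on two small structs. This is a minimal sketch: the types WithOverride and WithoutOverride and the ConstrainedDemo class are hypothetical and not taken from our code. The first struct overrides ToString(), so the constrained call goes straight to its method; the second does not, so the runtime has to box the value in order to call the inherited ValueType.ToString().

```csharp
using System;

// Hypothetical struct that overrides ToString(): the constrained call
// is dispatched directly to this method, without boxing.
public struct WithOverride
{
    public int Value;

    public override string ToString() { return Value.ToString(); }
}

// Hypothetical struct without an override: the call falls back to
// ValueType.ToString(), so the runtime boxes the value first.
public struct WithoutOverride
{
    public int Value;
}

public static class ConstrainedDemo
{
    public static string Direct(WithOverride value)
    {
        return "Id " + value.ToString();
    }

    public static string Boxed(WithoutOverride value)
    {
        return "Id " + value.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Direct(new WithOverride { Value = 42 })); // prints "Id 42"
        Console.WriteLine(Boxed(new WithoutOverride { Value = 42 }));
    }
}
```

In other words, the explicit ToString() call pays off only for value types that provide their own implementation, which is the case for Guid, Int32 and the other built-in types.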

The unpleasant part is that almost everyone has ReSharper installed, and it suggests that the call to ToString() is redundant.

One more thing about strings, or rather about concatenating them. Sometimes I came across code like this:

 var str2 = str1 + '\t'; 


It may seem that a char can be appended to a string without any trouble, but char is a value type, so boxing happens here too. In this case it is better to write:

 var str2 = str1 + "\t"; 


2. Calling methods on generic variables

Generic methods take second place by the number of boxing instructions. The point is that operations on a generic variable, such as a comparison with ==, produce a box instruction even when the class constraint is set.

Example:

    public static Boolean Equals<T>(T x, T y) where T : class
    {
        return x == y;
    }

This turns into:

    IL_0000: ldarg.0
    IL_0001: box !!T
    IL_0006: ldarg.1
    IL_0007: box !!T
    IL_000c: ceq


In fact, things are not so bad here, because the JIT optimizes this IL (for a reference type, box does nothing), but the case is interesting.

On the positive side, the already familiar constrained. prefix is used to call methods on generic variables, which allows methods to be called on value types without boxing. If a method works with both value types and reference types, then a comparison with null, for example, is better written as:

    if (!typeof(T).IsValueType && value == null)
    {
        // Do something
    }


There is also a problem with the as operator. A typical practice is to cast with the as operator right away instead of checking the type and then casting to it. But if the variable may hold a value type, it is better to check the type first and cast afterwards, because the as operator works only with reference types: boxing happens first, and only then is isinst called.

3. Calling methods on enumerations

Enumerations in C# are a sad story. The problem is that any method call on an enumeration causes boxing:

    [Flags]
    public enum Flags
    {
        First = 1 << 0,
        Second = 1 << 1,
        Third = 1 << 2
    }

    public Boolean Foo(Flags flags)
    {
        return flags.HasFlag(Flags.Second);
    }

    IL_0000: ldarg.1
    IL_0001: box HabraTests.Flags
    IL_0006: ldc.i4.2
    IL_0007: box HabraTests.Flags
    IL_000c: call instance bool [mscorlib]System.Enum::HasFlag(class [mscorlib]System.Enum)


Moreover, even GetHashCode() causes boxing. So if you ever need the hash code of an enumeration value, first cast it to its underlying type. Likewise, if you use an enumeration as a key in a Dictionary, write your own IEqualityComparer; otherwise every call to GetHashCode() will box.
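These workarounds for enumerations can be sketched as follows. This is only an illustration under assumptions: the FlagsComparer class and the EnumHelpers methods are hypothetical names, and the bitwise test in HasSecond is a common replacement for HasFlag rather than something taken from our code.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum Flags { First = 1 << 0, Second = 1 << 1, Third = 1 << 2 }

// Hypothetical comparer: equality and hashing go through the underlying
// Int32, so the dictionary never calls the boxing Enum.GetHashCode().
public sealed class FlagsComparer : IEqualityComparer<Flags>
{
    public static readonly FlagsComparer Instance = new FlagsComparer();

    public bool Equals(Flags x, Flags y) { return x == y; }

    public int GetHashCode(Flags value) { return (int)value; }
}

public static class EnumHelpers
{
    // Bitwise test instead of flags.HasFlag(Flags.Second): no method call
    // on the enumeration, so no box instruction in IL.
    public static bool HasSecond(Flags flags)
    {
        return (flags & Flags.Second) == Flags.Second;
    }

    // Hash code taken after a cast to the underlying type.
    public static int Hash(Flags flags)
    {
        return ((int)flags).GetHashCode();
    }

    public static void Main()
    {
        var names = new Dictionary<Flags, string>(FlagsComparer.Instance);
        names[Flags.Second] = "second";

        Console.WriteLine(HasSecond(Flags.First | Flags.Second)); // prints "True"
        Console.WriteLine(names[Flags.Second]);                   // prints "second"
    }
}
```

The comparer is passed to the Dictionary constructor once; after that, lookups by enumeration key use it instead of the default comparer.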

4. Enumerations in generic methods

The logical continuation of points 2 and 3 is to look at how an enumeration behaves in a generic method. On the one hand, when the value type implements the method, generic methods can call interface methods on structures without boxing. On the other hand, all the method implementations live in the base class Enum, not in the enumerations we declare. Let's write a small test to understand what happens inside.

Test code:

    using System;
    using System.Globalization;
    using System.Runtime;

    // Flags is the enumeration from point 3.
    public static void Main()
    {
        Double intAverageGrow, enumAverageGrow;
        Int64 intMinGrow, intMaxGrow, enumMinGrow, enumMaxGrow;

        var result1 = Test<Int32>(() => GetUlong(10), out intAverageGrow, out intMinGrow, out intMaxGrow);
        var result2 = Test<Flags>(() => GetUlong(Flags.Second), out enumAverageGrow, out enumMinGrow, out enumMaxGrow);

        Console.WriteLine("Int32 memory change. Avg: {0}, Min: {1}, Max: {2}", intAverageGrow, intMinGrow, intMaxGrow);
        Console.WriteLine("Enum memory change. Avg: {0}, Min: {1}, Max: {2}", enumAverageGrow, enumMinGrow, enumMaxGrow);
        Console.WriteLine(result1 + result2);
        Console.ReadKey(true);
    }

    // Calls an interface method on a generic value-type variable.
    public static UInt64 GetUlong<T>(T value) where T : struct, IConvertible
    {
        return value.ToUInt64(CultureInfo.InvariantCulture);
    }

    // Runs testedMethod 100000 times and tracks how the managed heap size changes between calls.
    public static UInt64 Test<T>(Func<UInt64> testedMethod, out Double averageGrow, out Int64 minGrow, out Int64 maxGrow)
    {
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        var previousTotalMemory = GC.GetTotalMemory(false);
        Int64 growSum = 0;
        minGrow = 0;
        maxGrow = 0;
        UInt64 sum = 0;
        for (var i = 0; i < 100000; i++)
        {
            sum += testedMethod();
            var currentTotalMemory = GC.GetTotalMemory(false);
            var grow = currentTotalMemory - previousTotalMemory;
            growSum += grow;
            if (minGrow > grow)
                minGrow = grow;
            if (maxGrow < grow)
                maxGrow = grow;
            previousTotalMemory = currentTotalMemory;
        }
        averageGrow = growSum / 100000.0;
        return sum;
    }



Result:

    Int32 memory change. Avg: 0, Min: 0, Max: 0
    Enum memory change. Avg: 3,16756, Min: -2079476, Max: 8192


As we can see, things are not good with enumerations here either: boxing happens on every call to ToUInt64(). On the other hand, it is clearly visible that calling an interface method on Int32 causes no boxing.
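One possible way around this, which is not part of the test above, is to stop relying on IConvertible and pass the conversion in as a delegate. The sketch below is an illustration under assumptions: the EnumConversion class and the two-argument GetUlong overload are hypothetical, and the approach relies on the explicit cast from an enumeration to UInt64 being a plain numeric conversion with no box instruction.

```csharp
using System;

[Flags]
public enum Flags { First = 1 << 0, Second = 1 << 1, Third = 1 << 2 }

public static class EnumConversion
{
    // Hypothetical overload: the caller supplies the conversion, so no
    // method of the base Enum class is called on the value.
    public static UInt64 GetUlong<T>(T value, Func<T, UInt64> convert) where T : struct
    {
        return convert(value);
    }

    public static void Main()
    {
        // The lambda captures nothing, so the compiler can cache the delegate
        // instead of allocating a new one on every call.
        UInt64 result = GetUlong(Flags.Second, f => (UInt64)f);
        Console.WriteLine(result); // prints "2"
    }
}
```

The price is a delegate invocation per call and a slightly noisier call site, so whether this is a win over boxing is something to measure rather than assume.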

Finally, and partly as a conclusion: value types are a great help in raising performance, but you need to watch carefully how they are used, otherwise boxing will cancel out their main advantage.
In the next article I would like to talk about places where global synchronization points are not obvious, and how to get around them. Stay tuned.

Source: https://habr.com/ru/post/229741/

