
How to capture a variable in C# and not shoot yourself in the foot

Back in 2005, the release of C# 2.0 made it possible to pass a variable into the body of an anonymous delegate by capturing it (closing over it) from the current context. In 2008, C# 3.0 arrived, bringing lambdas, anonymous types, LINQ queries, and much more. It is now January 2017, and most C# developers are looking forward to the release of C# 7.0, which should bring many useful new features. But nobody is in a hurry to fix the old "features", so there are still plenty of ways to accidentally shoot yourself in the foot. Today we will talk about one of them: the not entirely obvious mechanism for capturing variables in the body of anonymous functions in C#.



Introduction


As I wrote above, this article discusses how variable capture works in the body of anonymous methods in C#. I should warn you right away that it contains a lot of technical detail, but I hope to explain it in a way that is easy and interesting for both experienced and novice developers.

Now, to the point. I will write a simple code sample, and you will need to say exactly what it prints to the console.

So, let's get started:

    void Foo()
    {
        var actions = new List<Action>();
        for (int i = 0; i < 10; i++)
        {
            actions.Add(() => Console.WriteLine(i));
        }

        foreach (var a in actions)
        {
            a();
        }
    }

And now, the answer:

Answer
The console will display the number ten, ten times:
 10 10 10 10 10 10 10 10 10 10 

This article is for those who thought otherwise. Let's look at the reasons for this behavior.

Why does this happen?


When you declare an anonymous function (an anonymous delegate or a lambda) inside your class, the compiler generates an additional container class. It contains a field for every captured variable and a method that holds the body of the anonymous function. For the code above, the disassembled structure of the compiled program looks like this:

[Figure: disassembled structure of the compiled program, with the generated container class nested inside the Program class]

In this case, the Foo method from the code shown at the beginning of the section is declared inside the Program class. For the lambda () => Console.WriteLine(i), the compiler generated the container class <>c__DisplayClass1_0, and inside it the field i, which holds the captured variable, and the method <Foo>b__0, which contains the lambda body.

Let's look at the disassembled IL code of the <Foo>b__0 method (the lambda body), with my comments:

Some IL code
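    .method assembly hidebysig instance void '<Foo>b__0'() cil managed
    {
      .maxstack 8

      // Push the first argument of the method ('this') onto the stack:
      // a reference to the container class instance.
      IL_0000: ldarg.0

      // Load the value of the container's 'i' field onto the stack.
      IL_0001: ldfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::i

      // Call Console.WriteLine, passing it the value
      // from the top of the stack.
      IL_0006: call void [mscorlib]System.Console::WriteLine(int32)

      // Return from the method.
      IL_000b: ret
    }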

That's right, this is exactly what we wrote inside the lambda, no magic. Let's move on.

As you know, int (full name Int32) is a structure, which means that when it is passed somewhere, its value is copied rather than a reference to it.

Logically, the value of the variable i should be copied when the container class instance is created. If you answered my question at the beginning of the article incorrectly, you most likely expected the container to be created right before the lambda is declared in the code.

In fact, after compilation the variable i is not created inside the Foo method at all. Instead, an instance of the container class <>c__DisplayClass1_0 is created, and its field i is initialized with 0 in place of the local variable i. Moreover, everywhere the local variable i was used, the container class field is now used.

The important point is that the container class instance is created before the loop, because its field i serves as the loop counter.

As a result, we get one container class instance for all iterations of the for loop. Each lambda added to the actions list is bound to that same instance. So when the foreach loop walks through the elements of actions, they all refer to the same container. And since the for loop increments the counter after every iteration (including the last one), the field i in the container equals ten once the loop exits.
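To make this rewrite more tangible, here is a hand-written approximation of what the compiler does to Foo. It is only a sketch: the names DisplayClass, Body, and LoweredFoo are made up for illustration (the real generated names, such as <>c__DisplayClass1_0, are not valid C# identifiers), and Build returns the delegates instead of invoking them so that the shared state is easy to inspect.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the compiler-generated container class.
sealed class DisplayClass
{
    // The captured variable lives here instead of on the stack of Foo.
    public int i;

    // Plays the role of the generated lambda-body method.
    public void Body()
    {
        Console.WriteLine(i);
    }
}

static class LoweredFoo
{
    public static List<Action> Build()
    {
        var actions = new List<Action>();

        // One container instance, created before the loop.
        var locals = new DisplayClass();

        // The loop counter is the container field, not a local variable.
        for (locals.i = 0; locals.i < 10; locals.i++)
        {
            // Every delegate is bound to the same container instance.
            actions.Add(locals.Body);
        }

        return actions;
    }

    static void Main()
    {
        foreach (var a in Build())
        {
            a(); // prints 10 ten times
        }
    }
}
```

Every delegate in the list has the same Target, which is why they all observe the final value of the field.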

You can verify all of the above by looking at the disassembled IL code of the Foo method (with my comments, of course):

Caution, IL code
    .method private hidebysig instance void Foo() cil managed
    {
      .maxstack 3

      // -========== LOCAL VARIABLES ==========-
      .locals init(
        // The 'actions' list.
        [0] class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action> actions,
        // The container class instance for the lambda.
        [1] class TestSolution.Program/'<>c__DisplayClass1_0' 'CS$<>8__locals0',
        // Temporary variable V_2, used when incrementing the loop counter.
        [2] int32 V_2,
        // Variable V_3 holds the enumerator of the 'actions' list
        // used by the 'foreach' loop.
        [3] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class [mscorlib]System.Action> V_3)

      // -================= INITIALIZATION =================-
      // Create the list of Actions and store it
      // in the 'actions' variable.
      IL_0000: newobj instance void class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::.ctor()
      IL_0005: stloc.0

      // Create the container class instance
      // and store it in its local variable.
      IL_0006: newobj instance void TestSolution.Program/'<>c__DisplayClass1_0'::.ctor()
      IL_000b: stloc.1

      // Load the reference to the container onto the stack.
      IL_000c: ldloc.1

      // Load the constant 0 onto the stack.
      IL_000d: ldc.i4.0

      // Store 0 in the 'i' field of the object
      // below it on the stack (the container).
      IL_000e: stfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::i

      // -================= FOR LOOP =================-
      // Jump to the loop condition at IL_0037.
      IL_0013: br.s IL_0037

      // Load the 'actions' list and the container
      // instance onto the stack.
      IL_0015: ldloc.0
      IL_0016: ldloc.1

      // Load a pointer to the lambda-body method '<Foo>b__0'
      // of the container class.
      IL_0017: ldftn instance void TestSolution.Program/'<>c__DisplayClass1_0'::'<Foo>b__0'()

      // Create an 'Action' delegate from the container reference
      // and the method pointer.
      IL_001d: newobj instance void [mscorlib]System.Action::.ctor(object, native int)

      // Call 'Add' on the 'actions' list,
      // passing it the new 'Action'.
      IL_0022: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::Add(!0)

      // Load the value of the container's 'i' field
      // onto the stack.
      IL_0027: ldloc.1
      IL_0028: ldfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::i

      // Store the value of 'i' in 'V_2'.
      IL_002d: stloc.2

      // Load the container reference and the value
      // of 'V_2' onto the stack.
      IL_002e: ldloc.1
      IL_002f: ldloc.2

      // Load the constant 1 onto the stack.
      IL_0030: ldc.i4.1

      // Add the two values on top of the stack.
      IL_0031: add

      // Store the result in the container's 'i' field
      // (this is the increment).
      IL_0032: stfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::i

      // Load the value of the container's 'i' field
      // onto the stack.
      IL_0037: ldloc.1
      IL_0038: ldfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::i

      // Load the constant 10 onto the stack.
      IL_003d: ldc.i4.s 10

      // If 'i' is less than 10,
      // jump back to IL_0015.
      IL_003f: blt.s IL_0015

      // -================= FOREACH LOOP =================-
      // Load the 'actions' list onto the stack.
      IL_0041: ldloc.0

      // Store in V_3 the result of calling
      // 'GetEnumerator' on the 'actions' list.
      IL_0042: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::GetEnumerator()
      IL_0047: stloc.3

      // Start of the try block (foreach is compiled
      // into a try-finally construct).
      .try
      {
        // Jump to IL_0056.
        IL_0048: br.s IL_0056

        // Call get_Current on V_3.
        // The result is placed on the stack
        // (a reference to the current Action).
        IL_004a: ldloca.s V_3
        IL_004c: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class [mscorlib]System.Action>::get_Current()

        // Call Invoke on that Action.
        IL_0051: callvirt instance void [mscorlib]System.Action::Invoke()

        // Call MoveNext on V_3.
        // The result is placed on the stack.
        IL_0056: ldloca.s V_3
        IL_0058: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class [mscorlib]System.Action>::MoveNext()

        // If MoveNext returned true,
        // jump back to IL_004a.
        IL_005d: brtrue.s IL_004a

        // Leave the try block; the finally block runs.
        IL_005f: leave.s IL_006f
      } // end .try
      finally
      {
        // Call Dispose on V_3.
        IL_0061: ldloca.s V_3
        IL_0063: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<class [mscorlib]System.Action>
        IL_0069: callvirt instance void [mscorlib]System.IDisposable::Dispose()

        // End of the finally block.
        IL_006e: endfinally
      }

      // Return from the method.
      IL_006f: ret
    }

Conclusion


Microsoft maintains that this is a feature, not a bug, and that this behavior was implemented deliberately to improve program performance. More information is available at the link. In practice, it leads to bugs and confuses novice developers.

Interestingly, the foreach loop behaved the same way before C# 5.0. Microsoft was literally bombarded with bug tracker reports about this unintuitive behavior, and with the release of C# 5.0 it was changed: the iteration variable is now declared inside each iteration of the loop rather than once before it at compile time. For all other loop constructs, the behavior remained unchanged. Read more about this at the link, in the Breaking Changes section.
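The difference between the two loop kinds is easy to observe side by side. The following is a minimal sketch, assuming a C# 5.0 or later compiler; LoopCaptureDemo, ViaForeach, and ViaFor are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

static class LoopCaptureDemo
{
    // foreach: starting with C# 5.0, each iteration gets a fresh 'n'.
    public static List<int> ViaForeach()
    {
        var funcs = new List<Func<int>>();
        foreach (var n in new[] { 0, 1, 2 })
        {
            funcs.Add(() => n);
        }
        // Invoke every delegate only after the loop has finished.
        return funcs.ConvertAll(f => f());
    }

    // for: a single 'i' is shared by all iterations.
    public static List<int> ViaFor()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.ConvertAll(f => f());
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", ViaForeach())); // 0 1 2
        Console.WriteLine(string.Join(" ", ViaFor()));     // 3 3 3
    }
}
```

The lambdas are identical in both methods; only the scope of the captured variable differs.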

How do you avoid this error? The answer is simple: keep track of which variables you capture and where. Remember that the container class instance is created where the captured variable is declared. If the capture happens in the loop body but the variable is declared outside it, copy the variable into a new local variable inside the loop body and capture that instead. A correct version of the example from the beginning could look like this:

    void Foo()
    {
        var actions = new List<Action>();
        for (int i = 0; i < 10; i++)
        {
            var index = i; // <=
            actions.Add(() => Console.WriteLine(index));
        }

        foreach (var a in actions)
        {
            a();
        }
    }

If you execute this code, the numbers from 0 to 9 will be displayed in the console as expected:

Console output
 0 1 2 3 4 5 6 7 8 9 

Looking at the IL code of the for loop from this example, we see that a new container class instance is created on every iteration. As a result, the actions list contains references to different instances, each holding the correct counter value.

Some more IL code
    // -================= FOR LOOP =================-
    // Jump to the loop condition at IL_002d.
    IL_0008: br.s IL_002d

    // Create a new container class instance on every iteration.
    IL_000a: newobj instance void TestSolution.Program/'<>c__DisplayClass1_0'::.ctor()
    IL_000f: stloc.2
    IL_0010: ldloc.2

    // Store the value of 'i' in the 'index' field
    // of the container.
    IL_0011: ldloc.1
    IL_0012: stfld int32 TestSolution.Program/'<>c__DisplayClass1_0'::index

    // Create an 'Action' delegate bound to the new container
    // instance and add it to the 'actions' list.
    IL_0017: ldloc.0
    IL_0018: ldloc.2
    IL_0019: ldftn instance void TestSolution.Program/'<>c__DisplayClass1_0'::'<Foo>b__0'()
    IL_001f: newobj instance void [mscorlib]System.Action::.ctor(object, native int)
    IL_0024: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<class [mscorlib]System.Action>::Add(!0)

    // Increment 'i'.
    IL_0029: ldloc.1
    IL_002a: ldc.i4.1
    IL_002b: add
    IL_002c: stloc.1

    // Load the value of 'i' onto the stack.
    // This time it is an ordinary local variable, not a container field.
    IL_002d: ldloc.1

    // Compare the value of 'i' with 10.
    // If 'i < 10', jump back to IL_000a.
    IL_002e: ldc.i4.s 10
    IL_0030: blt.s IL_000a
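For comparison, here is a hand-written approximation of the corrected method, in the same spirit as the IL above. As before, this is only a sketch under assumptions: IndexHolder, Body, and LoweredFixedFoo are hypothetical names, not what the compiler actually emits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the container generated for 'index'.
sealed class IndexHolder
{
    public int index;

    public void Body()
    {
        Console.WriteLine(index);
    }
}

static class LoweredFixedFoo
{
    public static List<Action> Build()
    {
        var actions = new List<Action>();

        // 'i' stays an ordinary local variable because it is no longer captured.
        for (int i = 0; i < 10; i++)
        {
            // A new container instance is created on every iteration...
            var locals = new IndexHolder();
            // ...and receives its own copy of the current counter value.
            locals.index = i;
            actions.Add(locals.Body);
        }

        return actions;
    }

    static void Main()
    {
        foreach (var a in Build())
        {
            a(); // prints 0 through 9, one number per line
        }
    }
}
```

Each delegate now has its own Target, so later changes to i cannot affect values that were already captured.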

Finally, let me remind you that we are all human and we all make mistakes. Relying on the human factor alone to find errors and typos in code is not only unreasonable but, as a rule, also slow and resource-intensive. It therefore always makes sense to use technical means to find errors in your code. A machine not only never gets tired, it also usually does the job faster.

Recently, we, the developers of the PVS-Studio static analyzer, implemented a new diagnostic that looks for incorrectly captured variables in anonymous functions inside loops. For my part, I invite you to check your code for errors and typos with our static analyzer.

On that note, I conclude this article and wish you clean code and bug-free programs.



If you want to share this article with an English-speaking audience, please use the link to the translation: Ivan Kishchenko. How to capture a variable in C#

Read the article and have a question?
Our articles often get the same questions. We have collected answers to them here: Answers to questions from readers of articles about PVS-Studio, version 2015. Please take a look at the list.

Source: https://habr.com/ru/post/320588/

