Many designers of programming languages, libraries, and even ordinary application classes strive to make their interfaces intuitive. About a dozen years ago, Scott Meyers said that we should strive to design classes (libraries, languages) that are easy to use correctly and hard to use incorrectly.
The designers of C# take usability very seriously; they readily sacrifice “object purity” for the sake of common sense and ease of use. One of the few exceptions to this rule is closing over a loop variable, a feature that does not behave the way many developers expect. The resulting confusion and discontent were so widespread that the decision was made to change this behavior in C# 5.0.
So let's look at sample code that demonstrates the problem of closing over a loop variable:
var actions = new List<Action>();

foreach (var i in Enumerable.Range(1, 3))
{
    actions.Add(() => Console.WriteLine(i));
}

foreach (var action in actions)
{
    action();
}
Most developers reasonably assume that running this code prints “1 2 3”, because on each iteration of the loop we add to the list an anonymous method that prints the new value of i. However, if you run this snippet in VS2008 or VS2010, you get “3 3 3”. The problem is so common that some tools, such as ReSharper, issue a warning on the actions.Add() line saying that we are capturing a variable that is being modified, and Eric Lippert got so tired of explaining to everyone that this is a feature and not a bug that he decided to change the existing behavior in C# 5.0.
To understand why this code behaves this way and not otherwise, let's look at what the compiler expands it into (I won't go deep into the details of how closures work in C#; for those, see the note “Closures in C#”).
In C#, outer variables are captured "by reference". In our case this means that the variable i disappears from the stack and becomes a field of a specially generated class, which also receives the body of the anonymous method as one of its methods:
// Simplified version of the compiler-generated closure class
class Closure
{
    public int i;

    public void Action()
    {
        Console.WriteLine(i);
    }
}

var actions = new List<Action>();

using (var enumerator = Enumerable.Range(1, 3).GetEnumerator())
{
    // int current;
    // A single closure object is created...
    var closure = new Closure();
    while (enumerator.MoveNext())
    {
        // current = enumerator.Current;
        // ...and it is reused by every iteration of the foreach loop
        closure.i = enumerator.Current;
        var action = new Action(closure.Action);
        actions.Add(action);
    }
}

foreach (var action in actions)
{
    action();
}
Since a single Closure object is used inside the loop, closure.i equals 3 once the first loop has finished. And since the actions variable holds three delegates that all refer to the same Closure object, it is not surprising that the subsequent calls to closure.Action() print “3 3 3”.
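The same effect can be reproduced on any compiler version by declaring the captured variable outside the loop, so that only one variable exists for all delegates. The following is a minimal sketch under that assumption; the class name SharedCaptureDemo and the helper Run are made up for illustration and do not come from the original example.

```csharp
using System;
using System.Collections.Generic;

public static class SharedCaptureDemo
{
    // Hypothetical helper: builds three delegates that all read the same variable.
    public static List<int> Run()
    {
        var readers = new List<Func<int>>();
        int shared = 0; // declared once, outside the loop: a single captured variable

        foreach (var i in new[] { 1, 2, 3 })
        {
            shared = i;                 // overwrite the single captured variable
            readers.Add(() => shared);  // captures the variable itself, not its current value
        }

        var results = new List<int>();
        foreach (var reader in readers)
        {
            results.Add(reader());      // every delegate sees the last value written
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Run())); // prints "3 3 3"
    }
}
```

Because shared lives outside the loop body, the result does not depend on how a particular compiler version treats the foreach loop variable.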
Changes in C# 5.0
The changes in C# 5.0 do not affect closures as such: we still close over variables rather than over copies of their values. What actually changes is what the foreach loop expands into. Closures in C# are implemented so that a separate instance of the closure class is created for each scope that contains a captured variable. That is why, in earlier versions of C#, writing the following was enough to get the desired behavior:
var actions = new List<Action>();

foreach (var i in Enumerable.Range(1, 3))
{
    var tmp = i;
    actions.Add(() => Console.WriteLine(tmp));
}
If we go back to our simplified example with the Closure class, this change results in a new Closure instance being created inside the while loop, and that instance stores the desired value of the variable i:
using (var enumerator = Enumerable.Range(1, 3).GetEnumerator())
{
    int current;
    while (enumerator.MoveNext())
    {
        current = enumerator.Current;
        // A new Closure object is created on every iteration;
        // its field i stores the current value
        var closure = new Closure { i = current };
        var action = new Action(closure.Action);
        actions.Add(action);
    }
}
In C# 5.0, the foreach loop was changed so that the variable i is created anew on each iteration. In previous versions of C# there was a single loop variable for the entire foreach loop; starting with C# 5.0, a fresh variable is used for each iteration.
Now the original foreach loop expands differently:
using (var enumerator = Enumerable.Range(1, 3).GetEnumerator())
{
    // In C# 3.0 and 4.0, current was declared outside the loop:
    // int current;
    while (enumerator.MoveNext())
    {
        // In C# 5.0, current is declared inside the loop
        var current = enumerator.Current;
        actions.Add(() => Console.WriteLine(current));
    }
}
This makes the temporary variable inside the foreach loop redundant (the compiler now adds it for us), and running this code prints the expected “1 2 3”.
By the way, note that this change affects only the foreach loop. The behavior of the for loop has not changed at all: when you capture its loop variable, you still have to create a temporary variable yourself inside each iteration.
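The difference for the for loop can be sketched as follows. This is an illustrative example rather than code from the original note; the class ForLoopCaptureDemo and its methods are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;

public static class ForLoopCaptureDemo
{
    // Captures the for loop variable directly: one variable is shared by all delegates.
    public static List<int> CaptureLoopVariable()
    {
        var readers = new List<Func<int>>();
        for (int i = 1; i <= 3; i++)
        {
            readers.Add(() => i);
        }
        return InvokeAll(readers);
    }

    // Copies the loop variable into a per-iteration temporary before capturing it.
    public static List<int> CaptureCopy()
    {
        var readers = new List<Func<int>>();
        for (int i = 1; i <= 3; i++)
        {
            int tmp = i; // a new variable in each iteration's scope
            readers.Add(() => tmp);
        }
        return InvokeAll(readers);
    }

    // Calls every delegate after the loop has finished and collects the results.
    private static List<int> InvokeAll(List<Func<int>> readers)
    {
        var results = new List<int>();
        foreach (var reader in readers)
        {
            results.Add(reader());
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", CaptureLoopVariable())); // prints "4 4 4"
        Console.WriteLine(string.Join(" ", CaptureCopy()));         // prints "1 2 3"
    }
}
```

The first line shows 4 rather than 3 because the for loop increments i once more before the condition i <= 3 fails, and all three delegates read that final value.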
Additional links
- Eric Lippert, "Closing over the loop variable considered harmful"
- Eric Lippert, "Closing over the loop variable, part two"
- Closures in C#
- Visual C# Breaking Changes in Visual Studio 11 Beta