The .NET developer community is eagerly awaiting the release of C# 7.0 and the new features it brings. Each version of the language, which turns 15 next year, has brought something new and useful. Although every feature deserves a separate mention, today I want to talk about the yield keyword. I have noticed that novice developers (and not only novices) avoid using it. In this article I will try to describe its advantages and disadvantages and highlight the cases where using yield is appropriate.
yield creates an iterator and spares us from writing a separate class when we implement IEnumerable. C# has two statements that use yield: yield return <expression> and yield break. yield can be used in methods, operators, and properties. I will talk about methods, since yield works the same way everywhere.
By using yield return, we declare that the method returns an IEnumerable sequence whose elements are the results of the expressions in each yield return. Along with returning the value, yield return transfers control to the caller and resumes execution of the method when the next element is requested. The values of variables inside a method that uses yield are preserved between requests. yield break, in turn, ends the sequence, much like the familiar break ends a loop. The example below returns a sequence of numbers from 0 to 10:

    private static IEnumerable<int> GetNumbers()
    {
        var number = 0;
        while (true)
        {
            if (number > 10)
                yield break;
            yield return number++;
        }
    }

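To make the transfer of control more tangible, here is a minimal sketch that logs what happens and when. The LazyDemo class, its Log list, and the Run method are hypothetical names introduced only for this illustration, and the sequence is shortened to 0..2 to keep the log small.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Records the order of events so it can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> GetNumbers()
    {
        Log.Add("body started");
        var number = 0;
        while (true)
        {
            if (number > 2)
                yield break;           // ends the sequence
            Log.Add("yielding " + number);
            yield return number++;     // pauses here until the next element is requested
        }
    }

    public static void Run()
    {
        Log.Clear();
        var sequence = GetNumbers();   // the method body has not started yet
        Log.Add("sequence created");
        foreach (var n in sequence)
            Log.Add("received " + n);
    }
}
```

After Run, the log starts with "sequence created" and only then "body started": calling the method merely creates the iterator, and the body executes piece by piece as foreach requests elements.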
It is important to mention that yield has several limitations you need to be aware of. Calling Reset on the iterator throws a NotSupportedException. yield cannot be used in anonymous methods or in methods containing unsafe code. Also, yield return cannot be placed in a try-catch block, although nothing prevents you from placing it in the try section of a try-finally block. yield break can be placed in the try section of both try-catch and try-finally. I will not go into the reasons for these restrictions, as they are described in detail by Eric Lippert here and here.
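Two of these rules are easy to check in practice. The sketch below is only an illustration: the IteratorLimits class and its members are hypothetical names. It places yield return inside try-finally, shows that the finally block runs when the consumer stops early, and confirms that Reset throws.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorLimits
{
    public static bool FinallyRan;

    // yield return is allowed in a try block that has only a finally clause.
    public static IEnumerable<int> WithFinally()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            FinallyRan = true;
        }
    }

    // Returns true if Reset on the compiler-generated iterator throws.
    public static bool ResetThrows()
    {
        IEnumerator<int> enumerator = WithFinally().GetEnumerator();
        try
        {
            enumerator.Reset();
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
        finally
        {
            enumerator.Dispose();
        }
    }

    // Takes only the first element and leaves the loop early.
    public static int FirstOnly()
    {
        FinallyRan = false;
        foreach (var n in WithFinally())
            return n;   // leaving foreach disposes the enumerator, which runs the finally block
        return -1;
    }
}
```

FirstOnly returns 1 and leaves FinallyRan set to true, even though the iterator never reached the end of its try block.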
Let's see what yield turns into after compilation. Each method that contains yield return is compiled into a state machine that moves from one state to another as the iterator advances. Below is a simple application that prints an infinite sequence of odd numbers to the console:

    internal class Program
    {
        private static void Main()
        {
            foreach (var number in GetOddNumbers())
                Console.WriteLine(number);
        }

        private static IEnumerable<int> GetOddNumbers()
        {
            var previous = 0;
            while (true)
                if (++previous % 2 != 0)
                    yield return previous;
        }
    }

The compiler will generate the following code:

    internal class Program
    {
        private static void Main()
        {
            IEnumerator<int> enumerator = null;
            try
            {
                enumerator = GetOddNumbers().GetEnumerator();
                while (enumerator.MoveNext())
                    Console.WriteLine(enumerator.Current);
            }
            finally
            {
                if (enumerator != null)
                    enumerator.Dispose();
            }
        }

        [IteratorStateMachine(typeof(CompilerGeneratedYield))]
        private static IEnumerable<int> GetOddNumbers()
        {
            return new CompilerGeneratedYield(-2);
        }

        [CompilerGenerated]
        private sealed class CompilerGeneratedYield : IEnumerable<int>, IEnumerable, IEnumerator<int>, IDisposable, IEnumerator
        {
            private readonly int _initialThreadId;
            private int _current;
            private int _previous;
            private int _state;

            [DebuggerHidden]
            public CompilerGeneratedYield(int state)
            {
                _state = state;
                _initialThreadId = Environment.CurrentManagedThreadId;
            }

            [DebuggerHidden]
            IEnumerator<int> IEnumerable<int>.GetEnumerator()
            {
                CompilerGeneratedYield getOddNumbers;
                if ((_state == -2) && (_initialThreadId == Environment.CurrentManagedThreadId))
                {
                    _state = 0;
                    getOddNumbers = this;
                }
                else
                {
                    getOddNumbers = new CompilerGeneratedYield(0);
                }
                return getOddNumbers;
            }

            [DebuggerHidden]
            IEnumerator IEnumerable.GetEnumerator()
            {
                return ((IEnumerable<int>)this).GetEnumerator();
            }

            int IEnumerator<int>.Current
            {
                [DebuggerHidden]
                get { return _current; }
            }

            object IEnumerator.Current
            {
                [DebuggerHidden]
                get { return _current; }
            }

            [DebuggerHidden]
            void IDisposable.Dispose()
            {
            }

            bool IEnumerator.MoveNext()
            {
                switch (_state)
                {
                    case 0:
                        _state = -1;
                        _previous = 0;
                        break;
                    case 1:
                        _state = -1;
                        break;
                    default:
                        return false;
                }
                int num;
                do
                {
                    num = _previous + 1;
                    _previous = num;
                } while (num % 2 == 0);
                _current = _previous;
                _state = 1;
                return true;
            }

            [DebuggerHidden]
            void IEnumerator.Reset()
            {
                throw new NotSupportedException();
            }
        }
    }

As the example shows, the body of the method with yield was replaced by a generated class. The method's local variables became fields of that class. The class itself implements both IEnumerable and IEnumerator. The MoveNext method contains the logic of the original method, the only difference being that it is expressed as a state machine. Depending on the implementation of the original method, the generated class may additionally contain a non-trivial implementation of the Dispose method (in this example it is empty).
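One observable consequence of the generated GetEnumerator is that every enumerator obtained from the same sequence keeps its own state. The following sketch illustrates this; StateMachineDemo and TwoCursors are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;

public static class StateMachineDemo
{
    public static IEnumerable<int> GetOddNumbers()
    {
        var previous = 0;
        while (true)
            if (++previous % 2 != 0)
                yield return previous;
    }

    // Advances two enumerators of the same sequence by different amounts.
    public static int[] TwoCursors()
    {
        var sequence = GetOddNumbers();
        using (var first = sequence.GetEnumerator())
        using (var second = sequence.GetEnumerator())
        {
            first.MoveNext();    // first is now at 1
            first.MoveNext();    // first is now at 3
            second.MoveNext();   // second is at 1, unaffected by first
            return new[] { first.Current, second.Current };
        }
    }
}
```

TwoCursors returns 3 and 1: the local variable previous lives in a field of each enumerator object, so the two cursors do not interfere with each other.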
Let's run two tests and measure performance and memory consumption. I should note right away that these tests are synthetic and serve only to show how yield behaves compared with a straightforward implementation. The measurements are made with BenchmarkDotNet with the BenchmarkDotNet.Diagnostics.Windows diagnostic module enabled. The first test compares the speed of a method that produces a sequence of numbers (an analogue of Enumerable.Range(start, count)). The first implementation does not use an iterator, the second one does:

    public int[] Array(int start, int count)
    {
        var numbers = new int[count];
        for (var i = 0; i < count; ++i)
            numbers[i] = start + i;
        return numbers;
    }

    public int[] Iterator(int start, int count)
    {
        return IteratorInternal(start, count).ToArray();
    }

    private IEnumerable<int> IteratorInternal(int start, int count)
    {
        for (var i = 0; i < count; ++i)
            yield return start + i;
    }

Method | Count | Start | Median | Stddev | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated / Op |
---|---|---|---|---|---|---|---|---|
Array | 100 | 10 | 91.19 ns | 1.25 ns | 385.01 | - | - | 169.18 |
Iterator | 100 | 10 | 1,173.26 ns | 10.94 ns | 1,593.00 | - | - | 700.37 |
As the results show, the Array implementation is an order of magnitude faster and allocates about 4 times less memory. The iterator and the separate ToArray call took their toll.
The second test is more complex. We will emulate working with a stream of data: first we select the records with an odd key, and then those whose key is a multiple of 3. As in the previous test, the first implementation does not use an iterator, the second one does:

    public List<Tuple<int, string>> List(int start, int count)
    {
        var odds = new List<Tuple<int, string>>();
        foreach (var record in OddsArray(ReadFromDb(start, count)))
            if (record.Item1 % 3 == 0)
                odds.Add(record);
        return odds;
    }

    public List<Tuple<int, string>> Iterator(int start, int count)
    {
        return IteratorInternal(start, count).ToList();
    }

    private IEnumerable<Tuple<int, string>> IteratorInternal(int start, int count)
    {
        foreach (var record in OddsIterator(ReadFromDb(start, count)))
            if (record.Item1 % 3 == 0)
                yield return record;
    }

    // Lazy filter: yields odd-keyed records one at a time.
    private IEnumerable<Tuple<int, string>> OddsIterator(IEnumerable<Tuple<int, string>> records)
    {
        foreach (var record in records)
            if (record.Item1 % 2 != 0)
                yield return record;
    }

    // Eager filter: collects odd-keyed records into an intermediate list.
    private List<Tuple<int, string>> OddsArray(IEnumerable<Tuple<int, string>> records)
    {
        var odds = new List<Tuple<int, string>>();
        foreach (var record in records)
            if (record.Item1 % 2 != 0)
                odds.Add(record);
        return odds;
    }

    // Emulates reading records from a database.
    private IEnumerable<Tuple<int, string>> ReadFromDb(int start, int count)
    {
        for (var i = start; i < count; ++i)
            yield return Tuple.Create(start + i, RandomString());
    }

    private static string RandomString()
    {
        return Guid.NewGuid().ToString("n");
    }

Method | Count | Start | Median | Stddev | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated / Op |
---|---|---|---|---|---|---|---|---|
List | 100 | 10 | 43.14 us | 0.14 us | 279.04 | - | - | 4,444.14 |
Iterator | 100 | 10 | 43.22 us | 0.76 us | 231.00 | - | - | 3,760.96 |
In this case the execution speed turned out to be the same, and the memory consumption of the yield version was even lower. This is because the iterator implementation materialized the collection only once, which saved the allocation of one List<Tuple<int, string>>.
Taking all of the above and the test results into account, we can draw a brief conclusion: the main disadvantage of yield is the additional iterator class. If the sequence is finite and the caller does not perform complex manipulations on the elements, the iterator will be slower and will put unwanted pressure on the GC. However, it is reasonable to use yield when processing long sequences, where each materialization of a collection allocates large arrays. The lazy nature of yield avoids computing elements of a sequence that may be filtered out. This can drastically reduce memory consumption and the load on the processor.
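The effect of laziness can be made visible with a counter. This is a minimal sketch and not part of the benchmarks above: LazyPipeline, Source, and the Produced counter are hypothetical names, and Take from System.Linq stands in for a caller that needs only the first few results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyPipeline
{
    // Counts how many items the source actually produced.
    public static int Produced;

    // Hypothetical expensive source of a long sequence.
    public static IEnumerable<int> Source(int count)
    {
        for (var i = 0; i < count; ++i)
        {
            ++Produced;
            yield return i;
        }
    }

    // Lazy filter: passes odd items through one at a time.
    public static IEnumerable<int> Odds(IEnumerable<int> items)
    {
        foreach (var item in items)
            if (item % 2 != 0)
                yield return item;
    }

    public static List<int> FirstThreeOdds()
    {
        Produced = 0;
        return Odds(Source(1000000)).Take(3).ToList();
    }
}
```

FirstThreeOdds returns 1, 3, 5, and Produced stays at a handful of items instead of a million, because the source is advanced only as far as the consumer asks. An eager version that first filled a list with all odd numbers would have walked the whole source.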
Source: https://habr.com/ru/post/311094/