Once again about empty enumerables in C#

This post was inspired by a recent article on Habr that revisits a long-standing question (and an earlier article on the same topic): how to check that an IEnumerable is empty. The authors of those articles, however, concentrated on how to package the check, assuming that a check like this:

    public static bool IsNullOrEmpty<T>(this IEnumerable<T> items)
    {
        return items == null || !items.Any();
    }

will be enough, but in my opinion, this approach is not always applicable.

The point is that IEnumerable is essentially a factory for IEnumerator, and from the calling code's point of view the cost of creating an enumerator is completely unpredictable. Matters are complicated by the fact that in C# the same foreach statement and the same LINQ methods work on lists, arrays, and IEnumerable alike; they hide the creation of the IEnumerator and make lists, arrays, and IEnumerable indistinguishable in the minds of many developers. The difference can be enormous, however: an innocuous-looking call to items.Any() can trigger all sorts of expensive operations, such as opening a connection and querying a database, calling a REST API, or simply running a lot of heavy computation:

    private IEnumerable<int> GetItemsDb()
    {
        using (var connection = new SqlConnection("connection string"))
        {
            connection.Open();
            // The command must be bound to the open connection,
            // otherwise ExecuteReader() throws InvalidOperationException.
            using (var command = new SqlCommand("SELECT Id FROM [Table]", connection))
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return reader.GetInt32(0);
                }
            }
        }
    }

    private IEnumerable<int> GetItemsLinq()
    {
        return Enumerable
            .Range(0, 100)
            .Reverse()
            .Select(i =>
            {
                Thread.Sleep(100);
                return i;
            })
            .Where(i => i < 10);
    }
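The "factory" behaviour described above can be made visible with a small sketch. The names LazySequenceDemo, Starts and GetItems are hypothetical and serve only as a stand-in for an expensive source. The sketch counts how many times the iterator body is started when the same IEnumerable is first checked with Any() and then summed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySequenceDemo
{
    // Counts how many times the iterator body has been started.
    public static int Starts;

    // Hypothetical stand-in for an expensive source (database, REST call, ...).
    public static IEnumerable<int> GetItems()
    {
        Starts++;        // runs on the first MoveNext() of every new enumeration
        yield return 1;
        yield return 2;
    }

    public static int CountStartsForAnyThenSum()
    {
        Starts = 0;
        IEnumerable<int> items = GetItems(); // nothing has executed yet
        bool any = items.Any();              // first enumeration: the body starts
        int sum = any ? items.Sum() : 0;     // second enumeration: the body starts again
        return Starts;                       // 2
    }
}
```

Calling LazySequenceDemo.CountStartsForAnyThenSum() returns 2: the emptiness check and the actual processing each pay the start-up cost of the source.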

But what if you need to know whether such an enumerable will return any elements at all? Suppose, for example, that we need code that sums all the elements returned by GetItemsLinq() but returns null if there are none:

    int? result = GetSum(GetItemsLinq());
    ...
    private static int? GetSum(IEnumerable<int> items)
    {
        ...
    }

I think that many would implement the GetSum method as follows:

    private static int? GetSum(IEnumerable<int> items)
    {
        return items != null && items.Any() ? (int?)items.Sum() : null;
    }

and ... would then discover that GetSum(GetItemsLinq()) runs for 19.1 seconds instead of the expected 10. The reason is that items.Any() has to iterate over 91 source elements before it finds the first one that passes the filter, at 100 milliseconds per element, and items.Sum() then enumerates all 100 elements again from the start (9.1 + 10 seconds). Let's try to optimize the GetSum method a little:

    private static int? GetSum(IEnumerable<int> items)
    {
        // Reuse an existing collection if there is one; otherwise buffer the sequence once.
        var list = items as IReadOnlyCollection<int> ?? items?.ToList();
        return list != null && list.Count > 0 ? (int?)list.Sum() : null;
    }
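To check the arithmetic behind these timings without waiting, here is a sketch that replaces Thread.Sleep with a counter. The class and method names are hypothetical; the pipeline has the same shape as GetItemsLinq(). It compares the Any()-then-Sum() variant with the buffered one by counting how many source elements each of them touches:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCostDemo
{
    // Same shape as GetItemsLinq(), but each "expensive" step invokes a callback
    // instead of sleeping for 100 ms.
    private static IEnumerable<int> GetItems(Action onElement)
    {
        return Enumerable
            .Range(0, 100)
            .Reverse()
            .Select(i => { onElement(); return i; })
            .Where(i => i < 10);
    }

    // Any() followed by Sum(): two separate enumerations of the source.
    public static int TouchedByAnyThenSum()
    {
        int touched = 0;
        IEnumerable<int> items = GetItems(() => touched++);
        int? sum = items.Any() ? (int?)items.Sum() : null;
        return touched; // 91 for Any() + 100 for Sum() = 191
    }

    // ToList() first: one enumeration, then Count and Sum() work on the buffer.
    public static int TouchedByBufferedSum()
    {
        int touched = 0;
        List<int> list = GetItems(() => touched++).ToList();
        int? sum = list.Count > 0 ? (int?)list.Sum() : null;
        return touched; // 100
    }
}
```

At 100 milliseconds per element, 191 touched elements correspond to 19.1 seconds and 100 touched elements to 10 seconds.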

Now the code runs in the expected 10 seconds, but at the cost of allocating an intermediate buffer in memory. And what if there are many more than 10 elements? We can optimize further:

    private static int? GetSum(IEnumerable<int> items)
    {
        if (items == null) return null;
        using (var enumerator = items.GetEnumerator())
        {
            if (enumerator.MoveNext())
            {
                int result = enumerator.Current;
                while (enumerator.MoveNext())
                {
                    result += enumerator.Current;
                }
                return result;
            }
            return null;
        }
    }

Now the code runs in 10 seconds and uses no extra memory, but ... I would not want to write code like this every time I need to check whether an IEnumerable has any elements. Can this code be reused? My suggestion is an extension method that takes as an argument a function to which we pass a known-non-empty IEnumerable and which returns the result of processing it. The “known-non-empty IEnumerable” is a wrapper over an already started enumerator: when asked for the first element, it returns the one that has already been fetched and then continues the enumeration:

    public static class EnumerableHelper
    {
        public static TRes ProcessIfNotEmpty<T, TRes>(
            this IEnumerable<T> source,
            Func<IEnumerable<T>, TRes> handler,
            Func<TRes> defaultValue)
        {
            switch (source)
            {
                case null:
                    return defaultValue();
                case IReadOnlyCollection<T> collection:
                    return collection.Count > 0 ? handler(collection) : defaultValue();
                default:
                    using (var enumerator = new DisposeGuardWrapper<T>(source.GetEnumerator()))
                    {
                        if (enumerator.MoveNext())
                        {
                            return handler(Continue(enumerator.Current, enumerator));
                        }
                    }
                    return defaultValue();
            }
        }

        private static IEnumerable<T> Continue<T>(T first, IEnumerator<T> startedEnumerator)
        {
            yield return first;
            while (startedEnumerator.MoveNext())
            {
                yield return startedEnumerator.Current;
            }
        }

        private class DisposeGuardWrapper<T> : IEnumerator<T>
        {
            ...
        }
    }

Full source code here
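The listing elides the body of DisposeGuardWrapper<T>, and the sketch below does not attempt to reproduce the author's implementation. It is only one plausible shape for such a guard, under the assumption that its job is to forward calls to the inner enumerator and to refuse any use after Dispose(). The name DisposeGuardSketch and the choice of ObjectDisposedException are hypothetical:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch: forwards to an inner enumerator until Dispose() is called;
// after that, every access throws ObjectDisposedException.
public sealed class DisposeGuardSketch<T> : IEnumerator<T>
{
    private readonly IEnumerator<T> _inner;
    private bool _disposed;

    public DisposeGuardSketch(IEnumerator<T> inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public T Current
    {
        get { ThrowIfDisposed(); return _inner.Current; }
    }

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        ThrowIfDisposed();
        return _inner.MoveNext();
    }

    public void Reset()
    {
        ThrowIfDisposed();
        _inner.Reset();
    }

    public void Dispose()
    {
        if (_disposed) return;   // disposing twice is harmless
        _disposed = true;
        _inner.Dispose();
    }

    private void ThrowIfDisposed()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(DisposeGuardSketch<T>));
    }
}
```

With a guard of this kind, a lazily evaluated sequence that is enumerated only after the using block has ended fails loudly on its first MoveNext() instead of silently reading from an enumerator whose resources have been released.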

Using this auxiliary method, we can solve our problem as follows:

    int? result = GetItemsLinq().ProcessIfNotEmpty(items => items.Sum(), () => (int?)null);

Unfortunately, this approach has one major drawback: we cannot carry the work with items outside ProcessIfNotEmpty, because the enumerator opened to check for elements lives only for the duration of that call and is disposed when it returns. To catch such mistakes, the DisposeGuardWrapper class was created, so the following code throws an exception:

    int? result = GetItemsLinq().ProcessIfNotEmpty(items => items, () => null).Sum();

That is unpleasant, of course, but in my opinion it is better than spending extra resources or writing boilerplate code every time. Perhaps someone will suggest a better option.

Source: https://habr.com/ru/post/349920/
