
Problems using IEnumerable

In this article I want to talk about the problems of using the IEnumerable interface. We will look at what trouble this interface can cause, when it really should be used, and what to replace it with.

I would like to start with a couple of code examples, or rather with a couple of bugs that I ran into in real projects.

Examples of problems



Here is the first example: code from a real project, with only the names changed.

    private IEnumerable<Account> GetAccountsByOrder(
        IEnumerable<Account> accounts,
        IEnumerable<OrderItem> orderItems)
    {
        // GetQuotaOwner returns a lazy IEnumerable; nothing is executed here yet.
        var orderItemsWithQuotaOwners = _restsProvider.GetQuotaOwner(orderItems);

        // Any() enumerates orderItemsWithQuotaOwners again for every account.
        return accounts.Where(
            q => orderItemsWithQuotaOwners.Any(s => s.QuotaOwner == q.QuotaOwner && ... ));
    }


This seemingly simple piece of code caused us quite a lot of trouble. It is all about the GetQuotaOwner method. Inside it, a LINQ to SQL query is executed, then a projection is built with LINQ to Entities and an IEnumerable is returned. As a result, for every element of accounts the internals of GetQuotaOwner are executed all over again. Interestingly, ReSharper did not warn us about the danger in this case.

Here is the second example. Admittedly, this is not code from a real project, but the idea of the code and the problem itself were taken from one.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Foo
    {
        public string Value;
    }

    class Bar
    {
        public string Value;
        public int ACount;
    }

    class Program
    {
        static void Main()
        {
            Foo[] foo = new[]
            {
                new Foo { Value = "Abba" },
                new Foo { Value = "Deep Purple" },
                new Foo { Value = "Metallica" }
            };

            // Lazy projection: no Bar objects exist yet.
            var bar = foo.Select(x => new Bar
            {
                Value = x.Value,
                ACount = x.Value.Count(c => c == 'a' || c == 'A')
            });

            Censure(bar);

            foreach (var one in bar)
            {
                Console.WriteLine(one.Value);
            }
        }

        private static void Censure(IEnumerable<Bar> bar)
        {
            foreach (var one in bar)
            {
                if (one.ACount > 1)
                {
                    one.Value = "<censored>";
                }
            }
        }
    }


Here we get some data, build a projection of it and then censor it. And, to our great surprise, the data printed on the screen is not censored ...

The reason is quite simple: the projection is lazy and we iterate over it twice, so each pass creates its own independent set of Bar instances. Censure modifies the objects of the first pass, and the output loop prints the fresh objects of the second.

Clearly, fixing these two pieces of code is not difficult: just add ToArray. The real question is what we did fundamentally wrong and how to work with IEnumerable correctly.
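
As a minimal sketch of that fix, the snippet below runs the same projection with and without ToArray. The class CensureDemo and its Run method are hypothetical names introduced only for this illustration; Bar and Censure repeat the definitions from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Bar
{
    public string Value;
    public int ACount;
}

static class CensureDemo
{
    // Same logic as Censure in the example above.
    static void Censure(IEnumerable<Bar> bars)
    {
        foreach (var one in bars)
        {
            if (one.ACount > 1)
            {
                one.Value = "<censored>";
            }
        }
    }

    // Returns the values seen by a second enumeration after Censure has run.
    public static string[] Run(bool materialize)
    {
        var names = new[] { "Abba", "Deep Purple", "Metallica" };

        IEnumerable<Bar> bars = names.Select(x => new Bar
        {
            Value = x,
            ACount = x.Count(c => c == 'a' || c == 'A')
        });

        if (materialize)
        {
            // The projection runs exactly once; later passes see the same objects.
            bars = bars.ToArray();
        }

        Censure(bars);
        return bars.Select(b => b.Value).ToArray();
    }
}
```

With materialize set to false, the second enumeration rebuilds every Bar and the edits made by Censure are lost; with true, the edits survive because both passes walk the same array.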

What IEnumerable abstracts


First, consider IEnumerable as such. Without going into technical details, this interface abstracts a sequence of elements. Moreover, absolutely nothing is known about this sequence: whether it is finite or infinite, or what operations on it cost.

Here is a simple example: var lines = File.ReadLines("data.txt");

What can we do with lines now? Well, if we do not want to kill the performance of our program, we must not iterate over this sequence twice. That means innocent-looking code such as

    var lines = File.ReadLines("data.txt");
    // Count() reads the whole file, then ElementAt() reads it all over again.
    string lastLine = lines.ElementAt(lines.Count() - 1);

must be taboo for us.

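It could be even worse:

    class RandomStrings : IEnumerable<int>
    {
        Random _rnd = new Random();

        public IEnumerator<int> GetEnumerator()
        {
            while (true)
                yield return _rnd.Next();
        }

        // Required by the non-generic IEnumerable interface.
        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

Now even a single innocent Count() is enough to bring our application to a halt.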

One simple conclusion follows: it is very difficult to work with an IEnumerable without making assumptions about what is inside it.

Of course, for our file-reading example you can implement an efficient way to get the last line (or, say, some kind of stream processing of the lines), but for that you have to stop thinking of IEnumerable as a collection and start writing more complex code.
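
A rough sketch of what treating IEnumerable as a sequence means for the last-line problem: walk it exactly once and remember the most recent element. The extension method LastInSinglePass is a hypothetical helper written for this illustration; for a general sequence the built-in Enumerable.Last does essentially the same thing.

```csharp
using System;
using System.Collections.Generic;

static class SequenceExtensions
{
    // Hypothetical helper: returns the last element using a single enumeration,
    // instead of a Count() pass followed by an ElementAt() pass.
    public static T LastInSinglePass<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (!e.MoveNext())
            {
                throw new InvalidOperationException("Sequence is empty.");
            }

            T last = e.Current;
            while (e.MoveNext())
            {
                last = e.Current; // keep only the most recent element
            }
            return last;
        }
    }
}
```

For the file example this would be called as File.ReadLines("data.txt").LastInSinglePass(), which reads the file once and holds only one line in memory at a time.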

But in real programs the programmer most often thinks of an IEnumerable as a collection. There is even a pattern for this: the defensive copy of an IEnumerable. That is, you call ToArray() at the very beginning of a method that receives an IEnumerable.

In other words, we immediately declare that what came to us is a finite sequence that easily fits in memory. But then why do we use IEnumerable when we mean a collection?
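
For illustration, the defensive-copy pattern looks roughly like this. The Stats class and its Spread method are made-up names for this sketch, not something from the original project.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Stats
{
    // Hypothetical method: difference between the largest and smallest value.
    public static double Spread(IEnumerable<double> values)
    {
        // Defensive copy: the incoming sequence is enumerated exactly once here.
        double[] data = values.ToArray();

        if (data.Length == 0)
        {
            return 0;
        }

        // Max() and Min() each make a pass, but over the array, not over the source.
        return data.Max() - data.Min();
    }
}
```

The copy makes the method safe against lazy or expensive sources, at the price of one extra allocation, and it silently assumes the sequence is finite and fits in memory.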

Here, however, a picky reader may ask what it means to work correctly with a collection. Collections differ: a linked list is also a collection, and getting the last element of a linked list in the style shown above is not very efficient either (although certainly not as scary as with IEnumerable, where repeated iteration may involve a huge amount of work).

Therefore, it is worth being more precise and talking about a vector (List in .NET; below I will simply call this collection a list) or an array.

Then we can actually program to a contract: if an IList is passed to a method, we work with it as a list, knowing that accessing an arbitrary element and getting the number of elements are O(1); if an IEnumerable is passed, we will have to sweat to implement correct and efficient work with it.

The situation is similar for return values: by returning IEnumerable, we force the caller to write much more complex code that works with a sequence rather than a list.

LINQ


The dominance of IEnumerable in .NET was made worse by the introduction of LINQ. Previously an application programmer might see this interface a couple of times in his life; now any LINQ query produces an IEnumerable.

The question is what to do with such an IEnumerable. You can go to one extreme and immediately convert it to an array or a List. This approach is perfectly legitimate: it guarantees that there will be no problems with repeated iteration. On the other hand, it can generate many unnecessary arrays, which the garbage collector will then have to clean up.

You can also take a compromise approach: work with IEnumerable inside a method, but expose only arrays or lists to the outside. The disadvantage is that you have to be more careful with variables of type IEnumerable (which are just var in real source code ...) and avoid iterating over them repeatedly when this could hurt performance. Conceptually, this approach is also acceptable: within a single method we may well know the nature of a particular IEnumerable instance and need not treat it as a spherical IEnumerable in a vacuum.

Choosing a collection type


As already mentioned, ICollection is not the best type for passing collections between methods, since working efficiently with an arbitrary ICollection implementation is only slightly easier than with IEnumerable.

You can choose IList, but this interface has one huge disadvantage compared to IEnumerable: it allows the collection to be modified, whereas in 95% of cases collections are meant to be read-only objects.

Instead of IList, you can use the good old array. True, it allows elements to be assigned. But, in my opinion, the most frequent modifying operations on collections are removing and adding elements, while assigning an element by index is exotic in business applications. Therefore, it is quite possible to use arrays as read-only collections.

Another possibility is to use ReadOnlyCollection. I should say right away that this class is not quite correctly named. Its only constructor has the signature public ReadOnlyCollection(IList<T> list), so it would be more accurate to call it ReadOnlyList. At first glance, using this class everywhere may not seem very convenient, but if you write an extension method

    public static ReadOnlyCollection<T> ToReadOnly<T>(this IEnumerable<T> data)

this may be a workable option.
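
One possible body for such an extension method is sketched below. The containing class name EnumerableExtensions is an assumption, and copying through ToList is a design choice of this sketch rather than something prescribed by the article.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

static class EnumerableExtensions
{
    // Materializes the sequence once and wraps the copy in a read-only view.
    public static ReadOnlyCollection<T> ToReadOnly<T>(this IEnumerable<T> data)
    {
        // ToList() copies the elements, so later changes to the source
        // are not visible through the returned collection.
        return new ReadOnlyCollection<T>(data.ToList());
    }
}
```

A call such as accounts.Where(a => a != null).ToReadOnly() then yields a finite collection with Count and an indexer; attempts to add or remove elements through its collection interfaces throw NotSupportedException.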

Finally, .NET Framework 4.5 has already solved this problem: it introduces the IReadOnlyCollection and IReadOnlyList interfaces. List implements IReadOnlyList, so you can write

    IReadOnlyList<Bar> Do(IReadOnlyList<Bar> bar)
    {
        return bar.Where(x => IsGood(x)).ToList();
    }
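
A small self-contained variant of the same idea, with a concrete predicate in place of IsGood. The Filters class and EvenOnly method are hypothetical names used only for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Filters
{
    // Accepts and returns IReadOnlyList: the LINQ query stays an
    // implementation detail and is materialized before leaving the method.
    public static IReadOnlyList<int> EvenOnly(IReadOnlyList<int> numbers)
    {
        return numbers.Where(x => x % 2 == 0).ToList();
    }
}
```

The caller gets Count and an indexer, for example Filters.EvenOnly(new[] { 1, 2, 3, 4 })[1], and the interface itself offers no Add or Remove. Arrays can be passed in directly because they also implement IReadOnlyList.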


Conclusions


Ubiquitous use of IEnumerable in method signatures violates the principles of contract programming and leads to errors.

IEnumerable should be passed between methods only if you really need to work with a sequence and working with ordinary collections would be inefficient or impossible.

To pass read-only collections between methods, you can use arrays, the ReadOnlyCollection class, or the IReadOnlyList interface.

P.S.
There is another article on the same topic on Habr.

Source: https://habr.com/ru/post/193774/

