As you know, when using the
IEnumerable <> interface, where the collection is implied, problems can occur (see, for example,
Problems using IEnumerable and
LINQ against LSP ). Fortunately, in .NET v4.5 in 2012 (a little late, but better late than never), the interfaces
IReadOnlyCollection <> ,
IReadOnlyList <> ,
IReadOnlyDictionary <> appeared (hereinafter I will generally call them
IReadOnly- interfaces). Unlike
IEnumerable <> ,
IReadOnly interfaces allow you to sufficiently designate the collection functionality without any unnecessary requirements, which allows them to be recommended for use instead of
IEnumerable <> everywhere where reading of the collection is implied. But there is one difficulty. One of the important components that consume and create collections is
LINQ, and especially its part “LINQ to objects”. Unfortunately,
IReadOnly interfaces appeared 5 years after LINQ, and are not used in it. All input and output collections of LINQ operations have the basic
IEnumerable <> type, based on the limited capabilities of which, many operations imply unnecessary costs: complete sequential enumeration or even creating intermediate copies of the input collections. Moreover, by returning the same
IEnumerable <> from operations, LINQ requires that further use of the result again uses a full search and the creation of intermediate copies. In this regard, I have long had the thought to “make friends” LINQ with
IReadOnly interfaces.
Found on the Internet development on this topic did not satisfy me. For example, the
Linq to Collections library offers an effective replacement for only some LINQ operations for the
IReadOnlyList <> interface
only , is poorly optimized, and for some reason it forcibly adds a number of ambiguous extension methods for the basic types. In another found project on this topic,
CountableSharp , a small number of optimized LINQ operations are offered only for the
IReadOnlyCollection <> interface, in which a
decorator is returned, delegating all calls to the base
System.Linq.Enumerable and only the
Count property is calculated in advance without requiring a full busting collection.
Next, I propose to use my
Collection.LINQ library, which optimizes “LINQ to objects” by providing efficient implementation of most operations for collections that implement
IReadOnly interfaces.
First of all, I will allow myself a small digression, allowing you to better understand and use the work I have done. As many of you have probably noticed, in the series of
IReadOnly interfaces, there is clearly not enough interface for sets. Therefore,
we will create it ourselves in this obvious form:
')
public interface IReadOnlyFiniteSet<T> : IReadOnlyCollection<T> { bool Contains (T item); }
The only thing that is ambiguous here is the finiteness of the set associated with the inheritance of the
IReadOnlyCollection <> . Alternatively, it was possible to create an intermediate interface of an infinite set of
IReadOnlySet <> inherited from
IEnumerable <> . However, there is no practical need for it, since infinite sets are only of academic interest and it is difficult to imagine where they can find application. All the sets existing in the base class library already implement the methods necessary for
IReadOnlyFiniteSet <> , so there will be no problems with its implementation. Next, I will talk about optimizing LINQ operations, including for the
IReadOnlyFiniteSet <> interface, hoping that it will someday be made part of the base library.
So, we will consider what operations are implemented in
Collection.LINQ and how exactly they increase the efficiency of LINQ to objects.
We start by specifying those operations where optimization is not required. Aggregate Aggregate (), Average (), Min (), Max (), Sum () methods do not need any optimization because
IEnumerable <> is a necessary and sufficient interface for them, but they return a value, not a collection. For the same reason, they do not need to optimize the overloading of the methods All (), Any (), Count (), LongCount (), First (), FirstOrDefault (), Last (), LastOrDefault (), Single (), SingleOrDefault (), taking parameter predicate filter. The presence of a predicate dictates the need for complete exhaustive search, for which
IEnumerable <> is sufficient. Obviously, they also do not require optimization and the ToArray (), ToList (), ToDictionary (), ToLookup () methods, because they semantically imply brute-force and copy creation. Only the creation of an array in the ToArray () method can be slightly optimized by knowing in advance the number of elements in the collection.
The methods for creating the collections Empty (), Range () and Repeat () require one obvious optimization: they must not return a basic
IEnumerable <> , but a specific
IReadOnly interface.
Now about the main optimization. In operations that return a collection, the result is created in the form of a decorator, which delegates calls to members directly to the input collections, without prior iterations and creating copies of the elements. In order for such optimization to work when sequentially applying several operations, it is also important that the returned collections also be an
IReadOnly type. In LINQ to objects, a similar internal optimization is already implemented: many methods first of all check the implementation of some interfaces by the input collection, and use them for more efficient execution of actions. But, of course,
IReadOnly interfaces (which did not exist at the moment LINQ appeared) are not used, and the returned collection always has only the basic type
IEnumerable <> . In my library, optimization in the form of direct decoration is used in the following LINQ operations.
for collections of type IReadOnlyCollection <> |
Any (), Count (), LongCount () | immediately return the result using the Count property |
DefaultIfEmpty () | returns IReadOnlyCollection <>, depending on the Count property, initial or one-element |
Select (), Zip () | return IReadOnlyCollection <> - decorator |
Skip (), Take () | return IReadOnlyCollection <>, depending on the Count property, original, empty or decorator |
Concat () | returns IReadOnlyCollection <>, depending on the properties of the Count, one of the source or the decorator |
Reverse () | returns an IReadOnlyCollection <>, depending on the Count property, either the original or the decorator, which creates a complete copy of the collection when the enumeration request |
OrderBy (), OrderByDescending (), ThenBy (), ThenByDescending () | return an IReadOnlyCollection <>, depending on the Count property, either the original or the decorator, which creates an entire copy of the collection and a sorted index when an enumeration request is made |
for IReadOnlyFiniteSet <> sets (in addition to what is available for IReadOnlyCollection <> collections) |
Contains () | immediately returns the result using the Contains () method |
Distinct () | returns the original IReadOnlyFiniteSet <> set |
Except (), Intersect (), Union () | return IReadOnlyFiniteSet <>, depending on the Count property, one of the initial, empty or decorator, when creating a fully enumerating the smaller of the input sets |
DefaultIfEmpty () | returns IReadOnlyFiniteSet <>, depending on the Count property, initial or one-element |
Reverse () | returns IReadOnlyFiniteSet <>, depending on the Count property, the original or decorator that creates a complete copy of the set |
for IReadOnlyList <> lists (in addition to what is available for IReadOnlyCollection <> collections) |
ElementAt (), ElementAtOrDefault (), First (), FirstOrDefault (), Last (), LastOrDefault () | immediately return the result using the methods IReadOnlyList <> |
DefaultIfEmpty () | returns IReadOnlyList <>, depending on the Count property, initial or one-element |
Skip (), Take (), Select (), Concat (), Zip (), Reverse () | return IReadOnlyList <> - decorator |
OrderBy (), OrderByDescending (), ThenBy (), ThenByDescending () | return IReadOnlyList <>, depending on the Count property, the source or decorator that creates a sorted index to delegate the receipt of items by position number and enumeration |
Unfortunately, for some LINQ operations, the functionality of
IReadOnly collections cannot add efficiency. These are the operations that result in a collection obtained by filtering the values ​​of the elements of the input collections. There is no way without complete brute force and copying. The filtering operations SkipWhile (), TakeWhile (), Where (), as well as the conditional grouping and join methods Join (), GroupJoin (), GroupBy (), SelectMany () are not optimized.
Additionally, I will mention a few optimizations in my
Collection.Linq , not related to
IReadOnly interfaces:
- The LINQ operation SequenceEqual () is implemented as a direct delegation to the Equals () method of the IStructuralEquatable interface for collections that implement this interface (appeared 3 years later LINQ).
- Significantly reduced the number of method overloads due to the widespread use of optional parameters (appeared a year later LINQ).
- A clearly missing set operation is added - a symmetric difference in the form of the SymmetricExcept () method.
To make sure that I didn’t miss anything and did it more efficiently than the library
System.Linq.Enumerable , I constantly spied on its
sources .
Using the
Collection.LINQ library is easy: just in your code by line
using System.Linq;
add line
using BusinessClassLibrary.Collections.Linq;
It doesn't matter how you use “LINQ to objects”, in the form of query syntax or in a fluent form. After connecting my library, your LINQ code will automatically use the most optimal methods, provided that collections that implement
IReadOnly interfaces are
input . The only exception is the methods for creating collections of the class
System.Linq.Enumerable , which are not extension methods: Enumerable.Empty (), Enumerable.Range () and Enumerable.Repeat (). These methods will need to be manually replaced with ReadOnlyFiniteSet.Empty () / ReadOnlyList.Empty (), ReadOnlyFiniteSet.Range () / ReadOnlyList.Range () and ReadOnlyList.Repeat (), respectively.
The entire library is presented in a
public project on github . At the request of readers, a
nuget-package with a library was also created.