This is a PHP port of the LINQ query functionality from C#. The library was originally conceived as a training exercise, since such libraries already exist, but it was later decided to publish it for everyone. Copying LINQ to PHP one-to-one is impossible, because the capabilities of C# and PHP differ too much. The distinctive features of the proposed solution are its orientation toward iterators, lazy lambda expressions, and LINQ method signatures that match C# as closely as possible. All standard LINQ methods are, naturally, implemented. A description of the project's capabilities, with an explanation of why exactly this solution was chosen, follows below.

Why LINQ?
New technologies are always interesting: why did they arise, what problems do they solve, and how do they solve them? LINQ (Language Integrated Query), an SQL-like query language for data sequences (arrays, database responses, collections), is one such feature. For example,
var q = from c in db.Customers where c.Activity == 1 select new { c.CompanyName, c.ItemID, c.ItemName };
In C#, this syntax is supported at the language level, although it is really syntactic sugar that is converted to the following form
var q = db.Customers.Where(c => c.Activity == 1).Select(c => new { c.CompanyName, c.ItemID, c.ItemName });
Here, the Where and Select functions are defined over data sequences, while the processing logic is defined for an individual element. This is in the spirit of functional programming: everything is a function, and the result of a function depends only on its input parameters. Requiring functions to be pure rules out errors caused by side effects from the start. There are other advantages:
- Since the order of processing is not important, processing over the set can be done in parallel.
- Instead of running the LINQ query locally, it can be folded (in whole or in part) into an ordinary SQL database query. From the outside the query looks the same.
- The processing logic of an individual element is isolated. It can be reused for other collections or combined with others.
It would be nice to have the same tool in PHP, ideally, as in C#, without extra computational overhead. Such libraries do exist, but each implements this functionality differently. The reason is that a user-friendly implementation of LINQ relies on many C# features that PHP lacks in a suitable form and that therefore have to be imitated. And here everyone has different tastes.
Why can't it simply be copied?
Let us list the C# features that, at first glance, PHP lacks for a direct port of the library.
- Weak typing. This is rather an advantage.
- No method overloading. This is a consequence of the previous point. For example, if you need to compare strings, arrays and objects, you have to write three different functions named cmpString, cmpArray, cmpLaptop, or put if statements inside one big function. Both solutions are bad. In the first case, the type information in the name clutters the code where these functions are used. In the second case, the functionality is hard to extend.
- No class extensions. You cannot write a method and call it as if it were a method of another class; that is, you cannot extend IEnumerable<T> simply by importing your namespace. PHP does have the magic __call method, through which calls can be forwarded to static methods implemented elsewhere. However, this requires modifying the target class, which is not always possible. You can also forget about IDE support.
- No generators with the convenient yield (PHP < 5.5). On the one hand, this is syntactic sugar. You can write a function that returns an \Iterator, which on next() calls an anonymous function that computes the value of the next element and keeps its state in class attributes. But the amount of code grows many times over, and the useful share of it shrinks. Generators appeared in version 5.5, but we still have to wait until that version becomes widespread.
- No lambda expressions. These are small anonymous functions whose functionality is limited but whose structure is preserved. That structure can be used to symbolically derive one function from another or to export the expression to another language, say, SQL. In PHP you can write an anonymous function, but you will have no information about its structure and, accordingly, no export to SQL.
- No operator overloading. So lambda expressions cannot be simulated elegantly either, by writing something like $f = (Exp::x()+1) * 2 and overloading addition and multiplication for the class returned by the Exp::x() method, a descendant of Closure.
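The __call workaround mentioned in the list above can be sketched roughly as follows. This is only an illustration under assumptions: the class names Sequence and SequenceExtensions are hypothetical and are not taken from the library.

```php
<?php
// Hypothetical helper class: "extension" methods live here as static functions.
final class SequenceExtensions
{
    // The first argument plays the role of the `this` parameter in C#.
    public static function sum(Sequence $seq)
    {
        return array_sum($seq->items);
    }
}

// Hypothetical target class. It has to be modified to forward unknown calls,
// which is exactly the limitation described in the text.
final class Sequence
{
    public $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // PHP invokes __call when an undefined or inaccessible method is called.
    public function __call($name, array $arguments)
    {
        if (!method_exists(SequenceExtensions::class, $name)) {
            throw new \BadMethodCallException("Unknown method: $name");
        }
        // Pass the object itself as the first argument of the static method.
        array_unshift($arguments, $this);
        return call_user_func_array([SequenceExtensions::class, $name], $arguments);
    }
}

$seq = new Sequence([1, 2, 3]);
$total = $seq->sum(); // resolved at run time through __call
```

Because the method is resolved only at run time, an IDE cannot see sum() on Sequence without extra annotations, which matches the remark about losing IDE support.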
There are many minor differences as well, such as the base classes for working with collections having different names in the two languages (IEnumerator became Iterator, IEnumerable became IteratorAggregate), or PHP having no arrays of methods, but these are easily solved.
What to do? What has been done
Before starting work, I did not really look for other solutions, in order to write under the impression of C# rather than of other people's implementations. The first version was written in a couple of evenings. Then I spent a long time porting all the standard methods from MSDN, cutting out extra functionality and bringing the logic in line with .NET. At the beginning of the year I compared the capabilities of other projects, reworked the code, and published the project on GitHub. During development the main emphasis was on the following points.
Iterators have everything and everything has iterators
Instead of complex loops and copying from array to array, the library works with iterators. Before PHP 5.5, processing an iterator meant writing a class that implements the \Iterator interface and passing the \Iterator being processed to its constructor as an input parameter.
$data = new \ArrayIterator([1, 2, 3]);
$even = new \CallbackFilterIterator($data, function ($item) {
    return $item % 2 == 0;
});
When data is read from the current iterator, it is pulled from the parent iterators and processed along the way. The overhead of iterators appears to be minimal.
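To make the pull-based behaviour concrete, here is a minimal sketch of a hand-written wrapping iterator chained after the standard \CallbackFilterIterator. MapIterator and map_lazy are hypothetical names used only for this illustration; they are not the library's own classes.

```php
<?php
// Hypothetical projection iterator: applies $selector to each element of $inner.
final class MapIterator implements \Iterator
{
    private $inner;
    private $selector;

    public function __construct(\Iterator $inner, callable $selector)
    {
        $this->inner = $inner;
        $this->selector = $selector;
    }

    public function rewind(): void { $this->inner->rewind(); }
    public function valid(): bool { return $this->inner->valid(); }
    public function next(): void { $this->inner->next(); }

    #[\ReturnTypeWillChange]
    public function key() { return $this->inner->key(); }

    #[\ReturnTypeWillChange]
    public function current()
    {
        // The selector runs only when the consumer asks for the value.
        return call_user_func($this->selector, $this->inner->current());
    }
}

// The same projection written as a generator (PHP 5.5+).
function map_lazy($inner, callable $selector)
{
    foreach ($inner as $key => $value) {
        yield $key => call_user_func($selector, $value);
    }
}

$log = [];
$data = new \ArrayIterator([1, 2, 3, 4]);
$even = new \CallbackFilterIterator($data, function ($item) use (&$log) {
    $log[] = "filter $item";
    return $item % 2 == 0;
});
$squares = new MapIterator($even, function ($item) use (&$log) {
    $log[] = "map $item";
    return $item * $item;
});

$logBefore = $log; // building the chain has not touched any element yet
$result = iterator_to_array($squares, false);
```

The log shows filtering and mapping interleaved element by element, rather than one full pass per stage. The generator version does the same job in a few lines, which is the code-size difference the text refers to.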
More than 15 iterators have been implemented for typical sequence processing tasks.
- CallbackFilterIterator - element filtering
- CallbackIterator - generation of infinite sequence by function
- DistinctIterator - yields unique elements
- ExceptIterator, IntersectIterator - subtraction, intersection of two sequences
- GroupingIterator - grouping by key
- IndexIterator - ordering items by key
- JoinIterator, OuterJoinIterator - inner and outer join of two sequences by key
- LimitIterator - returns a range of elements
- ProductIterator - cross product of multiple iterators
- ProjectionIterator - projection of elements (keys)
- ReverseIterator - reverses the order
- SkipIterator - skipping elements while the condition is true
- TakeIterator - returning items while the condition is true
- LazyIterator - an abstract class; the iterator is built when the first element is read
- VariableIterator - the parent iterator can be changed after the iterator has been opened
Iterators can be used independently of the rest of the LINQ functionality; they are even moved to a separate namespace.
Lazy calculations work too
If you pass anonymous functions as expressions to LINQ methods, you get the highest execution speed, syntax highlighting and refactoring support in the IDE, but information about the structure of the expression is lost. Unlike with lambda expressions, the expression cannot then be reconstructed in another language, say, SQL. To solve this problem, many authors pass a string of valid PHP code as the “expression”. The string is evaluated with eval for each element of the sequence, and there is a chance it can be translated into another language, say, SQL. Some invent their own string format, for example $x ==> $x*$x. In this case code highlighting and refactoring in the IDE are lost, execution is slow, the code is not cached, and it is not safe.
The proposed library provides a tool that makes it easy to build complex expressions. Information about the structure of the expression is not lost and can be reused later. It is based on the ExpressionBuilder class, which builds a computation tree in streaming mode and exports it in reverse Polish (postfix) notation. For example:
$exp = new ExpressionBuilder();
$exp->add(1);
$exp->add('+', 1);
$exp->add(2);
$exp->export();
Operator precedence and parentheses are supported. The Expression class walks the exported array: when it encounters data, it pushes it onto the stack, and when it encounters an object of type OperationInterface, it hands control to that object. The object takes the required number of arguments from the stack, computes the result and pushes it back onto the stack. At the end, the stack holds a single value, the result. At a higher level, expressions are constructed using the LambdaInstance class and its Lambda decorator. Examples of its capabilities:
- access to arguments, constants
$f = Lambda::v();
// equivalent anonymous function
$f = function ($x) { return $x; };
- mathematical operations, comparison operations, logical operations
$f = Lambda::v()->add()->v()->mult(12)->gt(36);
// equivalent anonymous function
$f = function ($x) { return $x + $x*12 > 36; };
- parentheses
$f = Lambda::v()->mult()->begin()->c(1)->add()->v()->end();
// equivalent anonymous function
$f = function ($x) { return $x * (1 + $x); };
- string operations
$f = Lambda::v()->like('hallo%');
- array generation
$f = Lambda::complex([ 'a'=>1, 'b'=>Lambda::v() ]);
// equivalent anonymous function
$f = function ($x) { return [ 'a' => 1, 'b' => $x ]; };
- access to object properties and methods, array elements
$f = Lambda::v()->items('car');
$f = Lambda::v()->getCar();
$f = Lambda::car;
// equivalent anonymous function
$f = function ($x) { return $x->getCar(); };
- global function call
$f = Lambda::substr(Lambda::v(), 3, 1)->eq('a');
// equivalent anonymous function
$f = function ($x) { return substr($x,3,1) == 'a'; };
- LINQ methods for values
$f = Lambda::v()->linq()->where(Lambda::v()->gt(1));
// equivalent anonymous function
$f = function (\Iterator $x) {
    return new CallbackFilterIterator($x, function ($x) { return $x > 1; });
};
Of course, evaluating lambda expressions incurs additional overhead. For the function (x)=>x+1, evaluation through Lambda is 15 times slower than a direct function call, and the structure itself takes 3600 bytes of memory versus 800. The plan is to analyze it with a profiler to figure out how to increase speed and reduce memory use.
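The stack-based evaluation of a postfix expression described earlier in this section can be sketched as follows. This is a simplified illustration, not the library's code: StackOperation, BinaryOperation and evaluate_postfix are hypothetical names, and the library's real OperationInterface may look different.

```php
<?php
// Hypothetical stand-in for the library's OperationInterface.
interface StackOperation
{
    public function arity();            // how many operands to take from the stack
    public function apply(array $args); // compute the result from those operands
}

final class BinaryOperation implements StackOperation
{
    private $fn;

    public function __construct(callable $fn)
    {
        $this->fn = $fn;
    }

    public function arity()
    {
        return 2;
    }

    public function apply(array $args)
    {
        return call_user_func_array($this->fn, $args);
    }
}

// Evaluates a list of tokens given in postfix (reverse Polish) order.
function evaluate_postfix(array $tokens)
{
    $stack = [];
    foreach ($tokens as $token) {
        if ($token instanceof StackOperation) {
            $n = $token->arity();
            if ($n < 1 || count($stack) < $n) {
                throw new \RuntimeException('Not enough operands on the stack');
            }
            // Remove the last $n values from the stack, keeping their order.
            $args = array_splice($stack, -$n);
            $stack[] = $token->apply($args);
        } else {
            $stack[] = $token; // plain data is pushed as is
        }
    }
    if (count($stack) !== 1) {
        throw new \RuntimeException('Malformed expression');
    }
    return $stack[0];
}

$add  = new BinaryOperation(function ($a, $b) { return $a + $b; });
$mult = new BinaryOperation(function ($a, $b) { return $a * $b; });

// (1 + 2) * 4 in postfix form: 1 2 + 4 *
$value = evaluate_postfix([1, 2, $add, 4, $mult]);
```

Because the token list is plain data, the same list could in principle be walked by a different consumer, for example one that emits SQL instead of computing a value, which is the reason the text gives for keeping the expression structure.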
Greeted by the interface, seen off by the implementation
All LINQ methods are taken from the .NET 4.5 standard and distributed across the appropriate interfaces (GenerationInterface, FilteringInterface, etc.) with descriptions from MSDN. This produced a lot of files, but the additional cost of parsing them should not be large, especially if caching is enabled. The method signatures have been kept as unchanged as possible, given the capabilities of PHP. The IEnumerable interface inherits all of the mentioned interfaces and \IteratorAggregate. The Linq class implements the IEnumerable interface for local iteration. In the future, another IEnumerable implementation could be made that builds an SQL query or acts as a facade to Doctrine. The following methods are implemented.
- Aggregation - aggregate, average, min, max, sum, count
- Concatenation - concat, zip
- Element - elementAt, elementAtOrDefault, first, firstOrDefault, last, lastOrDefault, single, singleOrDefault
- Equality - isEqual
- Filtering - ofType, where
- Generation - from, range, repeat
- Grouping - groupBy
- Joining - product, join, joinOuter, groupJoin
- Partitioning - skip, skipWhile, take, takeWhile
- Projection - select, selectMany, cast
- Quantifier - all, any, contains
- Set - distinct, intersect, except, union
- Sorting - orderBy, orderByDescending, thenBy, thenByDescending, reverse, order
- Others - toArray, toList, each
Where a method takes a data source, it can be an array (array), a function (callable) or an iterator (\Iterator, \IteratorAggregate). Similarly, a string (string), a function (callable), an array (array), or a lambda expression (\LambdaInterface) can be passed as an expression. A few examples are given below; there are also a variety of unit tests.
// Grouping + Sorting + Filtering + array expression
$x = Linq::from($cars)
    ->group(Lambda::v()->getCategory()->getId())
    ->select([
        'category' => Lambda::i()->keys(),
        'best' => Lambda::v()->linq()
            ->where(Lambda::v()->isActive()->eq(true))
            ->orderBy(Lambda::v()->getPrice())
            ->last()
    ]);

// Set + LambdaInterface expression
$x = Linq::from($cars)->distinct(Lambda::v()->getCategory()->getTitle());

// Set + string expression
$x = Linq::from($cars)->distinct('category.title');

// Generation + callable
$fibonacci = function () {
    $position = 0;
    $f2 = 0;
    $f1 = 1;
    return function () use (&$position, &$f2, &$f1) {
        $position++;
        if ($position == 1) {
            return $f2;
        } elseif ($position == 2) {
            return $f1;
        } else {
            $f = $f1 + $f2;
            $f2 = $f1;
            $f1 = $f;
            return $f;
        }
    };
};
$x = Linq::from($fibonacci)->take(10);
The function that returned the function, which returned the function that ...
Each LINQ method creates an object of the Linq class, which receives an initializing anonymous function and references to the other Linq objects whose iterators serve as input data for that function. Since Linq implements \IteratorAggregate, the iterators are initialized automatically up the chain when the first element is requested.
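The idea of deferring initialization until the first element is requested can be sketched in miniature like this. LazySequence is a hypothetical class written for this illustration; it is not the library's Linq class and supports only two operations.

```php
<?php
// Hypothetical miniature: every method returns a new object that stores only
// an anonymous function capable of building its iterator later.
final class LazySequence implements \IteratorAggregate
{
    private $factory;

    private function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public static function from(array $items)
    {
        return new self(function () use ($items) {
            return new \ArrayIterator($items);
        });
    }

    public function where(callable $predicate)
    {
        $parent = $this;
        return new self(function () use ($parent, $predicate) {
            // Asking the parent for its iterator initializes the chain upwards.
            return new \CallbackFilterIterator($parent->getIterator(), $predicate);
        });
    }

    public function select(callable $selector)
    {
        $parent = $this;
        return new self(function () use ($parent, $selector) {
            return self::project($parent, $selector);
        });
    }

    // Generator: its body does not run until the first element is requested.
    private static function project(\Traversable $source, callable $selector)
    {
        foreach ($source as $item) {
            yield call_user_func($selector, $item);
        }
    }

    // Called by foreach and iterator_to_array; builds the iterator on demand.
    public function getIterator(): \Iterator
    {
        return call_user_func($this->factory);
    }
}

$calls = 0;
$query = LazySequence::from([1, 2, 3, 4, 5])
    ->where(function ($x) use (&$calls) { $calls++; return $x > 2; })
    ->select(function ($x) { return $x * 10; });

$callsBeforeIteration = $calls; // only the chain has been built so far
$result = iterator_to_array($query, false);
```

Building the query does not call the predicate at all; the work happens only when the result is iterated. A side effect of this design is that the same query object can be iterated again, since each call to getIterator() builds a fresh chain.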
Why all this?
Thanks to everyone who read to the end. The project was made to exercise the brain and hands, so any constructive criticism is 200% welcome. I really wanted to share this work, which I am generally pleased with. If it turns out to be useful to anyone else, that is wonderful.
All code is documented, annotated, covered by tests and published on GitHub under the modified BSD license. This is a fully working library.