
Porting C# LINQ to PHP

The query functionality of LINQ in C# has been rewritten in PHP. The library was originally conceived as a training exercise, since such libraries already exist, but I later decided to publish it for everyone. Copying LINQ to PHP one-to-one is impossible, because the capabilities of C# and PHP differ too much. The distinctive features of the proposed solution are its orientation toward iterators, lazy lambda expressions, and LINQ method signatures that match C# as closely as possible. All the standard LINQ methods are, naturally, implemented. Below is a description of the project's capabilities, with an explanation of why this particular solution was chosen.


Why LINQ?

New technologies are always interesting: why did they arise, what problems do they solve, and how? LINQ (Language Integrated Query), an SQL-like query language for data sequences (arrays, database responses, collections), is one such feature. For example:
var q = from c in db.Customers where c.Activity == 1 select new { c.CompanyName, c.ItemID, c.ItemName }; 

In C#, this syntax is supported at the language level, although it is really syntactic sugar that is converted to the following form:

 var q = db.Customers.Where(c => c.Activity == 1).Select(c => new { c.CompanyName, c.ItemID, c.ItemName });

Here, the Where and Select functions are defined over data sequences, while the processing logic is defined for an individual element. This is done in the spirit of functional programming: everything is a function, and the result of a function depends only on its input parameters. Requiring functions to be pure rules out, from the start, errors caused by side effects. There are other advantages as well.

It would be nice to have the same tool in PHP, ideally, as in C#, without extra computational overhead. Such libraries do exist, but each implements the functionality differently. The reason is that a user-friendly implementation of LINQ relies on many C# features that PHP lacks in a suitable form, so they have to be imitated. And here everyone has different tastes.
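For comparison, here is a rough sketch of the C# query above written with nothing but PHP's built-in array functions. The `$customers` rows are hypothetical sample data standing in for `db.Customers`; this is not code from the library.

```php
<?php
// Hypothetical sample data standing in for db.Customers from the C# example.
$customers = [
    ['CompanyName' => 'Acme',    'ItemID' => 1, 'ItemName' => 'Anvil',   'Activity' => 1],
    ['CompanyName' => 'Globex',  'ItemID' => 2, 'ItemName' => 'Rocket',  'Activity' => 0],
    ['CompanyName' => 'Initech', 'ItemID' => 3, 'ItemName' => 'Stapler', 'Activity' => 1],
];

// "where": keep only active customers (array_filter preserves the original keys).
$active = array_filter($customers, function (array $c): bool {
    return $c['Activity'] === 1;
});

// "select": project each remaining row onto the three requested fields.
$q = array_map(function (array $c): array {
    return [
        'CompanyName' => $c['CompanyName'],
        'ItemID'      => $c['ItemID'],
        'ItemName'    => $c['ItemName'],
    ];
}, array_values($active));
```

Both steps are eager: each one builds a complete intermediate array before the next step starts, which is exactly the kind of overhead a lazy, iterator-based approach tries to avoid.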

Why can't it simply be copied?

At first glance, PHP lacks several C# features that would be needed to copy the library directly.
There are also many minor differences, such as the base interfaces for working with collections having different names in the two languages ( IEnumerator became Iterator , IEnumerable became IteratorAggregate ), or PHP arrays having no methods, but these are easily solved.

What to do? What was done

Before starting, I deliberately did not look closely at other solutions, so that I would write under the impression of C# rather than of other people's implementations. The first version was written in a couple of evenings. Then I spent a long time porting all the standard methods from MSDN and cutting out extra functionality, bringing the logic in line with .NET. At the beginning of the year I compared the capabilities with other projects, reworked the code, and published the project on GitHub. Development focused mainly on the following points.

Iterators have everything and everything has iterators


Instead of complex loops and copying from array to array, the library works with iterators. Before PHP 5.5, processing an iterator meant writing a class that implements the \Iterator interface and receives the \Iterator being processed as a constructor parameter.

 $data = new \ArrayIterator([1, 2, 3]);
 // Keep only the even items.
 $even = new \CallbackFilterIterator($data, function ($item) {
     return $item % 2 == 0;
 });

When data is read from the current iterator, it is pulled from the parent iterators and processed along the way. The overhead of iterators appears to be minimal. More than 15 iterators are implemented for typical sequence-processing tasks.
The iterators can be used independently of the rest of the LINQ functionality; they are even placed in a separate namespace.
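Since PHP 5.5, the same pull-based chaining can also be written with generators instead of iterator classes. The following is only a sketch of the idea, not the library's code: `lazyWhere` and `lazySelect` are hypothetical helper names, and the `iterable` type hint assumes PHP 7.1 or later.

```php
<?php
// Lazy filter: yields only the items for which $predicate returns true.
function lazyWhere(iterable $source, callable $predicate): \Generator
{
    foreach ($source as $key => $item) {
        if ($predicate($item)) {
            yield $key => $item;
        }
    }
}

// Lazy projection: yields $selector($item) for each item.
function lazySelect(iterable $source, callable $selector): \Generator
{
    foreach ($source as $key => $item) {
        yield $key => $selector($item);
    }
}

$log = [];
$data = new \ArrayIterator([1, 2, 3, 4]);

$even = lazyWhere($data, function ($item) use (&$log) {
    $log[] = "test $item"; // record every predicate call
    return $item % 2 == 0;
});
$squares = lazySelect($even, function ($item) {
    return $item * $item;
});

// Nothing has run yet: a generator does no work until it is iterated.
$logBefore = $log;

// Reading from the last generator pulls items through the whole chain.
$result = iterator_to_array($squares, false); // [4, 16]
```

The predicate is called only while `$squares` is being read, one item at a time, and no intermediate array is ever built.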

Lazy evaluation works too


Passing anonymous functions as expressions to LINQ methods gives the fastest execution, syntax highlighting, and refactoring support in the IDE, but information about the structure of the expression is lost. This makes it impossible to reconstruct the expression in another language, such as SQL, unlike lambda expressions in C#. To solve this problem, many authors pass a string of valid PHP code as the “expression”. The string is evaluated with eval for each element of the sequence, and it can potentially be translated into another language, such as SQL. Some invent their own string format, for example $x ==> $x*$x . In that case code highlighting and IDE refactoring are lost, execution is slow, the code is not cached, and it is not safe.

The proposed library provides a tool for easily building complex expressions. Information about the structure of the expression is not lost and can be reused later. The basis is the ExpressionBuilder class, which builds a calculation tree in streaming mode and exports it in reverse Polish (postfix) notation. For example:

 $exp = new ExpressionBuilder();
 $exp->add(1);
 $exp->add('+', 1);
 $exp->add(2);
 $exp->export(); // [1, 2, 2, '+']

Operator precedence and parentheses are supported. The Expression class walks the exported array: when it encounters data, it pushes it onto the stack, and when it encounters an object of type OperationInterface , it passes control to that object. The object takes the required number of arguments from the stack, calculates the result, and pushes it back onto the stack. At the end, the stack holds a single value: the result. At a higher level, expressions are constructed using the LambdaInstance class and its Lambda decorator. Some examples of what is possible:
  1. access to arguments, constants
     $f = Lambda::v();
     // closure equivalent:
     $f = function ($x) { return $x; };
  2. mathematical operations, comparison operations, logical operations
     $f = Lambda::v()->add()->v()->mult(12)->gt(36);
     // closure equivalent:
     $f = function ($x) { return $x + $x*12 > 36; };
  3. parentheses
     $f = Lambda::v()->mult()->begin()->c(1)->add()->v()->end();
     // closure equivalent:
     $f = function ($x) { return $x * (1 + $x); };
  4. string operations
     $f = Lambda::v()->like('hallo%');
  5. array generation
     $f = Lambda::complex([ 'a'=>1, 'b'=>Lambda::v() ]);
     // closure equivalent:
     $f = function ($x) { return [ 'a' => 1, 'b' => $x ]; };
  6. access to object properties and methods, array elements
     $f = Lambda::v()->items('car');
     $f = Lambda::v()->getCar();
     $f = Lambda::car;
     // closure equivalent of the getCar() form:
     $f = function ($x) { return $x->getCar(); };
  7. global function call
     $f = Lambda::substr(Lambda::v(), 3, 1)->eq('a');
     // closure equivalent:
     $f = function ($x) { return substr($x,3,1) == 'a'; };
  8. LINQ methods for values
     $f = Lambda::v()->linq()->where(Lambda::v()->gt(1));
     // closure equivalent:
     $f = function (\Iterator $x) { return new CallbackFilterIterator($x, function ($x) { return $x > 1; }); };
Of course, evaluating lambda expressions incurs extra overhead. For the function (x) => x+1 , evaluating the Lambda is 15 times slower than a direct function call, and the structure itself takes 3600 bytes of memory versus 800. The plan is to analyze it with a profiler to find ways to increase speed and reduce memory use.
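The postfix export and stack-based evaluation described above can be illustrated with a small standalone sketch. It does not use the library's ExpressionBuilder, Expression, or OperationInterface: `precedence`, `toPostfix`, and `evaluatePostfix` are hypothetical plain functions, only four binary operators are handled, and the postfix array here may differ from the library's actual export format.

```php
<?php
// Precedence of each supported binary operator (higher binds tighter).
function precedence(string $op): int
{
    return ['+' => 1, '-' => 1, '*' => 2, '/' => 2][$op];
}

// Convert infix tokens to postfix order (shunting-yard, left-associative operators).
function toPostfix(array $tokens): array
{
    $output = [];
    $stack = [];
    foreach ($tokens as $token) {
        if (is_int($token) || is_float($token)) {
            $output[] = $token;
        } elseif ($token === '(') {
            $stack[] = $token;
        } elseif ($token === ')') {
            // Pop operators until the matching opening parenthesis.
            while (($top = array_pop($stack)) !== '(') {
                if ($top === null) {
                    throw new \InvalidArgumentException('Unbalanced parentheses');
                }
                $output[] = $top;
            }
        } else {
            // Operators of higher or equal precedence are emitted first.
            while ($stack && end($stack) !== '('
                && precedence(end($stack)) >= precedence($token)) {
                $output[] = array_pop($stack);
            }
            $stack[] = $token;
        }
    }
    while ($stack) {
        $output[] = array_pop($stack);
    }
    return $output;
}

// Evaluate a postfix array with a stack.
function evaluatePostfix(array $postfix)
{
    $stack = [];
    foreach ($postfix as $token) {
        if (is_int($token) || is_float($token)) {
            $stack[] = $token; // data goes onto the stack
            continue;
        }
        // An operator takes its two arguments off the stack
        // and pushes the result back.
        $right = array_pop($stack);
        $left = array_pop($stack);
        switch ($token) {
            case '+': $stack[] = $left + $right; break;
            case '-': $stack[] = $left - $right; break;
            case '*': $stack[] = $left * $right; break;
            case '/': $stack[] = $left / $right; break;
            default: throw new \InvalidArgumentException("Unknown operator: $token");
        }
    }
    return array_pop($stack); // the single remaining value is the result
}

// x * (1 + x) with x = 3, as in the parentheses example.
$postfix = toPostfix([3, '*', '(', 1, '+', 3, ')']); // [3, 1, 3, '+', '*']
$value = evaluatePostfix($postfix);                  // 12
```

Because the postfix array is plain data, the structure of the expression stays available after it is built, which is what makes translating it to another language conceivable.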

Greeted by the interface, seen off by the implementation


All LINQ methods are taken from standard .NET 4.5 and distributed across the corresponding interfaces ( GenerationInterface , FilteringInterface , etc.), with descriptions from MSDN. This produced a lot of files, but the extra cost of parsing them should not be large, especially with caching enabled. The method signatures have been kept as close to the originals as PHP allows. The IEnumerable interface inherits all of these interfaces as well as \IteratorAggregate . The Linq class implements IEnumerable for local iteration. In the future, another IEnumerable implementation could be written that builds an SQL query or serves as a facade to Doctrine. The list of implemented methods is available in the project.
Wherever a method takes a data source, it can be an array ( array ), a function ( callable ), or an iterator ( \Iterator , \IteratorAggregate ). Similarly, an expression can be passed as a string ( string ), a function ( callable ), an array ( array ), or a lambda expression ( \LambdaInterface ). A few examples follow; many more can be found in the unit tests.
 // Grouping + sorting + filtering + array expression
 $x = Linq::from($cars)
     ->group(Lambda::v()->getCategory()->getId())
     ->select([
         'category' => Lambda::i()->keys(),
         'best' => Lambda::v()->linq()
             ->where(Lambda::v()->isActive()->eq(true))
             ->orderBy(Lambda::v()->getPrice())
             ->last()
     ]);

 // Set + LambdaInterface expression
 $x = Linq::from($cars)->distinct(Lambda::v()->getCategory()->getTitle());

 // Set + string expression
 $x = Linq::from($cars)->distinct('category.title');

 // Generation + callable
 $fibonacci = function () {
     $position = 0;
     $f2 = 0;
     $f1 = 1;
     return function () use (&$position, &$f2, &$f1) {
         $position++;
         if ($position == 1) {
             return $f2;
         } elseif ($position == 2) {
             return $f1;
         } else {
             $f = $f1 + $f2;
             $f2 = $f1;
             $f1 = $f;
             return $f;
         }
     };
 };
 $x = Linq::from($fibonacci)->take(10);
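Accepting an array, a callable, or an iterator as a data source implies some normalization step that turns each of them into a single iterator type. One possible way to do this is sketched below. `toIterator` is a hypothetical helper, not the library's API, and treating a callable as "call it repeatedly, each return value is the next item" is my assumption; the library may interpret callables differently.

```php
<?php
// Normalize any supported data source to a plain \Iterator.
// toIterator() is a hypothetical helper, not part of the library.
function toIterator($source): \Iterator
{
    if (is_array($source)) {
        // Checked first, so ['Class', 'method'] is treated as data, not as a callable.
        return new \ArrayIterator($source);
    }
    if ($source instanceof \Iterator) {
        return $source;
    }
    if ($source instanceof \IteratorAggregate) {
        // getIterator() may return another aggregate, so unwrap recursively.
        return toIterator($source->getIterator());
    }
    if (is_callable($source)) {
        // Assumption: each call produces the next item of an endless sequence.
        return (function () use ($source) {
            while (true) {
                yield $source();
            }
        })();
    }
    throw new \InvalidArgumentException('Unsupported data source');
}

// A callable source is infinite, so it has to be limited before materializing.
$n = 0;
$counter = function () use (&$n) {
    return ++$n;
};
$firstThree = iterator_to_array(
    new \LimitIterator(toIterator($counter), 0, 3),
    false
); // [1, 2, 3]

$fromArray = iterator_to_array(toIterator(['a' => 1, 'b' => 2])); // ['a' => 1, 'b' => 2]
```

With every source reduced to an \Iterator, the rest of a pipeline only needs to handle one type.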

The function that returned the function, which returned the function that ...


Each LINQ method creates an object of the Linq class, which receives an initializing anonymous function and references to other Linq objects whose iterators serve as input for that function. Since Linq implements \IteratorAggregate , requesting the first item automatically initializes the iterators up the chain.
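The mechanism can be sketched in a few lines. `LazySequence` below is a hypothetical stand-in for the Linq class with a single `where` method; it only illustrates the idea of storing an initializer plus parent references and is not the library's implementation.

```php
<?php
// LazySequence is a hypothetical stand-in for the library's Linq class.
final class LazySequence implements \IteratorAggregate
{
    /** @var callable builds this step's iterator from the parent iterators */
    private $init;
    /** @var LazySequence[] parent sequences whose iterators feed this step */
    private $parents;

    public function __construct(callable $init, array $parents = [])
    {
        $this->init = $init;
        $this->parents = $parents;
    }

    public static function from(array $items): self
    {
        return new self(function () use ($items) {
            return new \ArrayIterator($items);
        });
    }

    public function where(callable $predicate): self
    {
        // Only records the step; no iterator is created here.
        return new self(function (\Iterator $input) use ($predicate) {
            return new \CallbackFilterIterator($input, $predicate);
        }, [$this]);
    }

    public function getIterator(): \Iterator
    {
        // Requesting the iterator initializes the parents first, up the chain.
        $inputs = array_map(function (LazySequence $parent) {
            return $parent->getIterator();
        }, $this->parents);
        return ($this->init)(...$inputs);
    }
}

$calls = 0;
$seq = LazySequence::from([1, 2, 3, 4])->where(function ($x) use (&$calls) {
    $calls++;
    return $x > 2;
});
$callsBefore = $calls;                    // 0: the chain is only described so far
$result = iterator_to_array($seq, false); // [3, 4]
```

Building the chain costs almost nothing; the iterators are created, and the predicate is first called, only when the sequence is actually read.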

Why all this?

Thanks to everyone who read to the end. The project was made to exercise the brain and hands, so any meaningful criticism is 200% welcome. I really wanted to share work that I was, on the whole, pleased with. If it proves genuinely useful to anyone else, that would be wonderful.

All the code is documented, annotated, covered by tests, and published on GitHub under the (modified) BSD license. It is a fully working library.

Source: https://habr.com/ru/post/209514/

