LINQ for PHP: speed matters

If you don't know what LINQ is or why anyone would want it in PHP, see the previous article on YaLinqo.

For everyone else, let's continue. A warning up front: if you think that iterators are a useless thing dragged into PHP for no reason, that all these newfangled anonymous functions hurt performance so badly that every microsecond must be fought for, and that nothing beats the good old ways, then move along. The library and this article are not for you.

LINQ is great, but how much performance does it cost? Compared with bare loops, it is 3-5 times slower. Compared with array functions that take anonymous functions, it is 2-4 times slower. Since the library is meant for processing small arrays of data, while the heavy data processing happens outside the script (in the database, in a third-party web service), the actual losses in the script are small. The main thing is readability.
Since I created my YaLinqo library, two competitors have appeared that really are LINQ (that is, they support lazy evaluation and other basic features), so I wanted to compare them. The simplest and most logical approach is to compare functionality and performance. At least this time it will not be a massacre of the innocents, as in the previous comparison.

(The emergence of competitors also finally motivated me to publish the YaLinqo documentation online.)

Disclaimer: these are quick-and-dirty tests. They do not measure every performance loss. In particular, I do not consider memory consumption at all, partly because I do not know how to measure it properly. If you do, pull requests are welcome, as they say.

Competitors

YaLinqo - Yet Another LINQ to Objects for PHP. Supports queries over objects only: arrays and iterators. It comes in two versions: for PHP 5.3+ (without yield) and for PHP 5.5+ (with yield). The newer version relies solely on yield and arrays for all operations. In addition to anonymous functions, it supports "string lambdas". The most minimalistic of the three libraries: it contains only 4 classes. Its distinguishing feature is very extensive documentation adapted from MSDN.

Ginq - 'LINQ to Object' inspired DSL for PHP. Likewise supports queries over objects only. Based on SPL iterators, so it requires PHP 5.3+. In addition to anonymous functions, it supports "property access" from Symfony. A medium-sized library: collections, comparers, key-value pairs and other things are ported from .NET; 70 classes in total. Documentation is of average quality: at best, signatures are listed. The main feature is iterators, which let you use the library both by building queries as method chains and by nesting iterators.

Pinq - PHP Integrated Query, a real LINQ library for PHP. The only library that lets you work with both objects and databases (well... in theory). It supports only anonymous functions, but it can parse their code with PHP-Parser. The documentation is not the most detailed (if it exists at all), but the library has a nice site. The most massive of the three libraries: more than 500 classes, not counting 150 test classes (to be honest, I did not even dig into the code, because it is scary).

All three libraries do fine on tests and other signs of quality. The licenses are permissive: BSD, MIT. All support Composer and are available on Packagist.

Tests

Here and below, arrays of functions are passed to the benchmark_linq_groups function: for plain PHP, YaLinqo, Ginq and Pinq, respectively.
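To make the result tables easier to read, here is a minimal sketch of how such a harness could time closures and derive the "x6.8 (+583%)" style figures. The function name benchmark_sketch and the report layout are assumptions made for illustration; this is not the actual benchmark_linq_groups from the repository.

```php
<?php
// Hypothetical helper, NOT the author's benchmark_linq_groups:
// runs each callable $iterations times and reports the average time
// plus the slowdown relative to the fastest candidate.
function benchmark_sketch(array $candidates, $iterations)
{
    $times = [];
    foreach ($candidates as $name => $fn) {
        $start = microtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $fn();
        }
        $times[$name] = (microtime(true) - $start) / $iterations;
    }

    // Guard against division by zero on very coarse timers.
    $best = max(min($times), 1e-12);

    $report = [];
    foreach ($times as $name => $time) {
        $ratio = $time / $best;
        $report[$name] = [
            'seconds' => $time,
            'ratio' => $ratio,                     // e.g. 6.8 means "x6.8"
            'percent_over' => ($ratio - 1) * 100,  // e.g. 583 means "+583%"
        ];
    }
    return $report;
}

$report = benchmark_sketch([
    'for' => function () {
        $j = null;
        for ($i = 0; $i < 1000; $i++) $j = $i;
        return $j;
    },
    'range' => function () {
        $j = null;
        foreach (range(0, 999) as $i) $j = $i;
        return $j;
    },
], 100);

foreach ($report as $name => $row) {
    printf("%-8s %.5f sec x%.1f (+%.0f%%)\n",
        $name, $row['seconds'], $row['ratio'], $row['percent_over']);
}
```

The ratio and the percentage carry the same information: a candidate that takes 6.8 times as long as the fastest one is 580% slower, give or take rounding.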

The tests were run on PHP 5.5.14, Windows 7 SP1. Since the tests are quick and dirty, I do not list hardware specs: the goal is to estimate the losses by eye, not to measure everything to the millimeter. If you want precise tests, the source code is available on GitHub; feel free to improve it, pull requests are accepted.

Let's start with the worst part: pure overhead.

    benchmark_linq_groups("Iterating over $ITER_MAX ints", 100, null,
        [
            "for" => function () use ($ITER_MAX) {
                $j = null;
                for ($i = 0; $i < $ITER_MAX; $i++)
                    $j = $i;
                return $j;
            },
            "array functions" => function () use ($ITER_MAX) {
                $j = null;
                foreach (range(0, $ITER_MAX - 1) as $i)
                    $j = $i;
                return $j;
            },
        ],
        [
            function () use ($ITER_MAX) {
                $j = null;
                foreach (E::range(0, $ITER_MAX) as $i)
                    $j = $i;
                return $j;
            },
        ],
        [
            function () use ($ITER_MAX) {
                $j = null;
                foreach (G::range(0, $ITER_MAX - 1) as $i)
                    $j = $i;
                return $j;
            },
        ],
        [
            function () use ($ITER_MAX) {
                $j = null;
                foreach (P::from(range(0, $ITER_MAX - 1)) as $i)
                    $j = $i;
                return $j;
            },
        ]);

Pinq has no range generator function; the documentation says to use the standard function, which is what we do.

And the results:

  Iterating over 1000 ints
 ------------------------
   PHP [for] 0.00006 sec x1.0 (100%)
   PHP [array functions] 0.00011 sec x1.8 (+ 83%)
   YaLinqo 0.00041 sec x6.8 (+ 583%)
   Ginq 0.00075 sec x12.5 (+ 1150%)
   Pinq 0.00169 sec x28.2 (+ 2717%)

Iterators mercilessly eat up speed.

But the terrible slowdown of the last library, nearly 30 times, is even more noticeable. I must warn you: this library will scare you with its numbers again, so it is too early to be surprised.
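To illustrate where this kind of overhead comes from, here is a small sketch of lazy iteration built on yield. The helpers lazy_range and lazy_where are hypothetical names written for this example; they are not taken from any of the three libraries, though the yield-based version of YaLinqo works in a broadly similar spirit.

```php
<?php
// Hypothetical generator-based helpers, for illustration only.

// Lazily produces $count consecutive integers starting at $start.
function lazy_range($start, $count)
{
    for ($i = 0; $i < $count; $i++) {
        yield $start + $i;
    }
}

// Lazily passes through only the values accepted by $predicate.
function lazy_where($source, $predicate)
{
    foreach ($source as $key => $value) {
        if ($predicate($value)) {
            yield $key => $value;
        }
    }
}

// Nothing is computed here yet: the pipeline is only described.
$evens = lazy_where(lazy_range(0, 10), function ($v) {
    return $v % 2 === 0;
});

// Each step of this loop resumes two generators and calls one closure,
// which is the per-element cost a bare for loop does not pay.
$result = [];
foreach ($evens as $v) {
    $result[] = $v;
}

print_r($result); // 0, 2, 4, 6, 8
```

Every element travels through one function resumption per pipeline stage, so the cost grows with the number of chained operations, while a hand-written loop pays it once.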

Now instead of a simple iteration, we will generate an array of consecutive numbers.

    benchmark_linq_groups("Generating array of $ITER_MAX integers", 100, 'consume',
        [
            "for" => function () use ($ITER_MAX) {
                $a = [ ];
                for ($i = 0; $i < $ITER_MAX; $i++)
                    $a[] = $i;
                return $a;
            },
            "array functions" => function () use ($ITER_MAX) {
                return range(0, $ITER_MAX - 1);
            },
        ],
        [
            function () use ($ITER_MAX) {
                return E::range(0, $ITER_MAX)->toArray();
            },
        ],
        [
            function () use ($ITER_MAX) {
                return G::range(0, $ITER_MAX - 1)->toArray();
            },
        ],
        [
            function () use ($ITER_MAX) {
                return P::from(range(0, $ITER_MAX - 1))->asArray();
            },
        ]);

And the results:

  Generating array of 1000 integers
 ---------------------------------
   PHP [for] 0.00025 sec x1.3 (+ 32%)
   PHP [array functions] 0.00019 sec x1.0 (100%)
   YaLinqo 0.00060 sec x3.2 (+ 216%)
   Ginq 0.00107 sec x5.6 (+ 463%)
   Pinq 0.00183 sec x9.6 (+ 863%)

Now YaLinqo is only a little more than twice as slow as the straightforward loop solution. The other libraries do worse, but it is bearable.

Now let's count things in the test data: count the orders with more than five order items; count the orders that have more than two items with a quantity greater than five.

    benchmark_linq_groups("Counting values in arrays", 100, null,
        [
            "for" => function () use ($DATA) {
                $numberOrders = 0;
                foreach ($DATA->orders as $order) {
                    if (count($order['items']) > 5)
                        $numberOrders++;
                }
                return $numberOrders;
            },
            "array functions" => function () use ($DATA) {
                return count(
                    array_filter(
                        $DATA->orders,
                        function ($order) { return count($order['items']) > 5; }
                    )
                );
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->orders)
                    ->count(function ($order) { return count($order['items']) > 5; });
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->orders)
                    ->count('$o ==> count($o["items"]) > 5');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->orders)
                    ->count(function ($order) { return count($order['items']) > 5; });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->orders)
                    ->where(function ($order) { return count($order['items']) > 5; })
                    ->count();
            },
        ]);

    benchmark_linq_groups("Counting values in arrays deep", 100, null,
        [
            "for" => function () use ($DATA) {
                $numberOrders = 0;
                foreach ($DATA->orders as $order) {
                    $numberItems = 0;
                    foreach ($order['items'] as $item) {
                        if ($item['quantity'] > 5)
                            $numberItems++;
                    }
                    if ($numberItems > 2)
                        $numberOrders++;
                }
                return $numberOrders;
            },
            "array functions" => function () use ($DATA) {
                return count(
                    array_filter(
                        $DATA->orders,
                        function ($order) {
                            return count(
                                array_filter(
                                    $order['items'],
                                    function ($item) { return $item['quantity'] > 5; }
                                )
                            ) > 2;
                        })
                );
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->orders)
                    ->count(function ($order) {
                        return E::from($order['items'])
                            ->count(function ($item) { return $item['quantity'] > 5; }) > 2;
                    });
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->orders)
                    ->count(function ($order) {
                        return G::from($order['items'])
                            ->count(function ($item) { return $item['quantity'] > 5; }) > 2;
                    });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->orders)
                    ->where(function ($order) {
                        return P::from($order['items'])
                            ->where(function ($item) { return $item['quantity'] > 5; })
                            ->count() > 2;
                    })
                    ->count();
            },
        ]);

Three nuances stand out. First, the functional style on standard array functions turns the code into an amusing, unreadable staircase. Second, string lambdas cannot be used in the nested query, because escaping code inside escaped code is a brain-melter. Third, Pinq does not provide a count function that accepts a predicate, so you have to build a chain of methods. As it turns out later, this is far from Pinq's only limitation: it has very few methods, and they are very limited.

See the results:

  Counting values in arrays
 -------------------------
   PHP [for] 0.00023 sec x1.0 (100%)
   PHP [array functions] 0.00052 sec x2.3 (+ 126%)
   YaLinqo 0.00056 sec x2.4 (+ 143%)
   YaLinqo [string lambda] 0.00059 sec x2.6 (+ 157%)
   Ginq 0.00129 sec x5.6 (+ 461%)
   Pinq 0.00382 sec x16.6 (+ 1561%)

 Counting values in arrays deep
 ------------------------------
   PHP [for] 0.00064 sec x1.0 (100%)
   PHP [array functions] 0.00323 sec x5.0 (+ 405%)
   YaLinqo 0.00798 sec x12.5 (+ 1147%)
   Ginq 0.01416 sec x22.1 (+ 2113%)
   Pinq 0.04928 sec x77.0 (+ 7600%)

The results are more or less predictable, except for the frightening result of Pinq. I looked at the code: the entire collection is generated, and then count() is called on it... But it is still too early to be surprised!
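The difference between "build the filtered collection, then count it" and "count while iterating" can be sketched in plain PHP. The sample data and the count_matching helper below are assumptions made up for this illustration, not code from Pinq or from the benchmark.

```php
<?php
// Tiny made-up data set shaped like the benchmark's orders.
$orders = [
    ['id' => 1, 'items' => [1, 2, 3, 4, 5, 6]],
    ['id' => 2, 'items' => [1]],
    ['id' => 3, 'items' => [1, 2, 3, 4, 5, 6, 7]],
];

$isLarge = function ($order) {
    return count($order['items']) > 5;
};

// Materialize, then count: an intermediate array of matches is built
// only to be thrown away after its size is read.
$viaFilter = count(array_filter($orders, $isLarge));

// Hypothetical streaming count: a counter is bumped per match and
// no intermediate collection is allocated.
function count_matching($source, $predicate)
{
    $n = 0;
    foreach ($source as $value) {
        if ($predicate($value)) {
            $n++;
        }
    }
    return $n;
}

$viaStream = count_matching($orders, $isLarge);

echo "$viaFilter $viaStream\n"; // 2 2
```

Both variants give the same answer; the streaming one simply avoids allocating a collection whose only purpose is to be counted.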

Let's do the filtering. Just like last time, but instead of counting we generate collections.

    benchmark_linq_groups("Filtering values in arrays", 100, 'consume',
        [
            "for" => function () use ($DATA) {
                $filteredOrders = [ ];
                foreach ($DATA->orders as $order) {
                    if (count($order['items']) > 5)
                        $filteredOrders[] = $order;
                }
                return $filteredOrders;
            },
            "array functions" => function () use ($DATA) {
                return array_filter(
                    $DATA->orders,
                    function ($order) { return count($order['items']) > 5; }
                );
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->orders)
                    ->where(function ($order) { return count($order['items']) > 5; });
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->orders)
                    ->where('$order ==> count($order["items"]) > 5');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->orders)
                    ->where(function ($order) { return count($order['items']) > 5; });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->orders)
                    ->where(function ($order) { return count($order['items']) > 5; });
            },
        ]);

    benchmark_linq_groups("Filtering values in arrays deep", 100,
        function ($e) { consume($e, [ 'items' => null ]); },
        [
            "for" => function () use ($DATA) {
                $filteredOrders = [ ];
                foreach ($DATA->orders as $order) {
                    $filteredItems = [ ];
                    foreach ($order['items'] as $item) {
                        if ($item['quantity'] > 5)
                            $filteredItems[] = $item;
                    }
                    if (count($filteredItems) > 0) {
                        $order['items'] = $filteredItems;
                        $filteredOrders[] = [
                            'id' => $order['id'],
                            'items' => $filteredItems,
                        ];
                    }
                }
                return $filteredOrders;
            },
            "array functions" => function () use ($DATA) {
                return array_filter(
                    array_map(
                        function ($order) {
                            return [
                                'id' => $order['id'],
                                'items' => array_filter(
                                    $order['items'],
                                    function ($item) { return $item['quantity'] > 5; }
                                )
                            ];
                        },
                        $DATA->orders
                    ),
                    function ($order) { return count($order['items']) > 0; }
                );
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->orders)
                    ->select(function ($order) {
                        return [
                            'id' => $order['id'],
                            'items' => E::from($order['items'])
                                ->where(function ($item) { return $item['quantity'] > 5; })
                                ->toArray()
                        ];
                    })
                    ->where(function ($order) { return count($order['items']) > 0; });
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->orders)
                    ->select(function ($order) {
                        return [
                            'id' => $order['id'],
                            'items' => E::from($order['items'])->where('$v["quantity"] > 5')->toArray()
                        ];
                    })
                    ->where('count($v["items"]) > 0');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->orders)
                    ->select(function ($order) {
                        return [
                            'id' => $order['id'],
                            'items' => G::from($order['items'])
                                ->where(function ($item) { return $item['quantity'] > 5; })
                                ->toArray()
                        ];
                    })
                    ->where(function ($order) { return count($order['items']) > 0; });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->orders)
                    ->select(function ($order) {
                        return [
                            'id' => $order['id'],
                            'items' => P::from($order['items'])
                                ->where(function ($item) { return $item['quantity'] > 5; })
                                ->asArray()
                        ];
                    })
                    ->where(function ($order) { return count($order['items']) > 0; });
            },
        ]);

The array-function code is already starting to smell, not least because array_map and array_filter take their arguments in different orders, which makes it hard to follow what happens in which order.
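The argument-order mismatch is easy to see on a tiny example. The numbers below are made up; the function signatures are those of the standard PHP library.

```php
<?php
$numbers = [1, 2, 3, 4, 5, 6];

// array_map takes the callback FIRST and the array second.
$squares = array_map(function ($n) {
    return $n * $n;
}, $numbers);

// array_filter takes the array FIRST and the callback second.
$bigSquares = array_filter($squares, function ($n) {
    return $n > 10;
});

// array_filter preserves the original keys (3, 4, 5 here),
// so reindex when a plain list is needed.
$bigSquares = array_values($bigSquares);

print_r($bigSquares); // 16, 25, 36
```

When the two calls are nested, the data flows from the innermost second argument to the outermost first argument, which is why the nested versions in the benchmark read inside out.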

The query-based code is intentionally less optimal: objects are generated even if they are filtered out later. This is, in general, a LINQ tradition, which involves creating "anonymous types" with intermediate results along the way.

The results, compared with the previous tests, are fairly even:

  Filtering values in arrays
 --------------------------
   PHP [for] 0.00049 sec x1.0 (100%)
   PHP [array functions] 0.00072 sec x1.5 (+ 47%)
   YaLinqo 0.00094 sec x1.9 (+ 92%)
   YaLinqo [string lambda] 0.00094 sec x1.9 (+ 92%)
   Ginq 0.00295 sec x6.0 (+ 502%)
   Pinq 0.00328 sec x6.7 (+ 569%)

 Filtering values in arrays deep
 -------------------------------
   PHP [for] 0.00514 sec x1.0 (100%)
   PHP [array functions] 0.00739 sec x1.4 (+ 44%)
   YaLinqo 0.01556 sec x3.0 (+ 203%)
   YaLinqo [string lambda] 0.01750 sec x3.4 (+ 240%)
   Ginq 0.03101 sec x6.0 (+ 503%)
   Pinq 0.05435 sec x10.6 (+ 957%)

Let's proceed to sorting:

    benchmark_linq_groups("Sorting arrays", 100, 'consume',
        [
            function () use ($DATA) {
                $orderedUsers = $DATA->users;
                usort(
                    $orderedUsers,
                    function ($a, $b) {
                        $diff = $a['rating'] - $b['rating'];
                        if ($diff !== 0)
                            return -$diff;
                        $diff = strcmp($a['name'], $b['name']);
                        if ($diff !== 0)
                            return $diff;
                        $diff = $a['id'] - $b['id'];
                        return $diff;
                    });
                return $orderedUsers;
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->users)
                    ->orderByDescending(function ($u) { return $u['rating']; })
                    ->thenBy(function ($u) { return $u['name']; })
                    ->thenBy(function ($u) { return $u['id']; });
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->users)->orderByDescending('$v["rating"]')->thenBy('$v["name"]')->thenBy('$v["id"]');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->users)
                    ->orderByDesc(function ($u) { return $u['rating']; })
                    ->thenBy(function ($u) { return $u['name']; })
                    ->thenBy(function ($u) { return $u['id']; });
            },
            "property path" => function () use ($DATA) {
                return G::from($DATA->users)->orderByDesc('[rating]')->thenBy('[name]')->thenBy('[id]');
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->users)
                    ->orderByDescending(function ($u) { return $u['rating']; })
                    ->thenByAscending(function ($u) { return $u['name']; })
                    ->thenByAscending(function ($u) { return $u['id']; });
            },
        ]);

The comparison function for usort is ugly, but once you get the hang of it, you can write such functions without thinking. Sorting with LINQ looks almost perfectly clean. This is also the first place where Ginq's "property access" pays off: the code cannot get any prettier.
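For readers who find the hand-written comparator tedious, the pattern can be wrapped once. The following is a minimal sketch under stated assumptions: compare_by is a hypothetical helper invented for this example, and it compares values with PHP's < operator rather than strcmp, so numeric-looking strings would be ordered as numbers.

```php
<?php
// Hypothetical helper: builds a usort comparator from a list of
// [keySelector, descending] rules, applied in order until one differs.
function compare_by(array $rules)
{
    return function ($a, $b) use ($rules) {
        foreach ($rules as $rule) {
            list($selector, $descending) = $rule;
            $x = $selector($a);
            $y = $selector($b);
            if ($x === $y) {
                continue; // tie on this key, fall through to the next rule
            }
            $result = $x < $y ? -1 : 1;
            return $descending ? -$result : $result;
        }
        return 0; // equal on every key
    };
}

// Made-up sample data shaped like the benchmark's users.
$users = [
    ['id' => 1, 'name' => 'Bob',   'rating' => 5],
    ['id' => 2, 'name' => 'Alice', 'rating' => 5],
    ['id' => 3, 'name' => 'Carol', 'rating' => 9],
];

usort($users, compare_by([
    [function ($u) { return $u['rating']; }, true],   // rating, descending
    [function ($u) { return $u['name']; },   false],  // then name, ascending
    [function ($u) { return $u['id']; },     false],  // then id, ascending
]));

$ids = [];
foreach ($users as $u) {
    $ids[] = $u['id'];
}
print_r($ids); // 3, 2, 1
```

This is roughly the bookkeeping that an orderBy/thenBy chain has to do on the caller's behalf, which is one reason the library versions cost more than a single hand-written usort callback.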

The results are surprising:

  Sorting arrays
 --------------
   PHP 0.00037 sec x1.0 (100%)
   YaLinqo 0.00161 sec x4.4 (+ 335%)
   YaLinqo [string lambda] 0.00163 sec x4.4 (+ 341%)
   Ginq 0.00402 sec x10.9 (+ 986%)
   Ginq [property path] 0.01998 sec x54.0 (+ 5300%)
   Pinq 0.00132 sec x3.6 (+ 257%)

First, Pinq takes the lead among the libraries, albeit slightly. Spoiler: this happens for the first and last time.

Second, property access in Ginq slows things down terrifyingly, so this feature is unusable in real code. No syntax is worth a 50-fold loss of speed.

Now for the fun part: joins, that is, combining two collections by key.

    benchmark_linq_groups("Joining arrays", 100, 'consume',
        [
            function () use ($DATA) {
                $usersByIds = [ ];
                foreach ($DATA->users as $user)
                    $usersByIds[$user['id']][] = $user;
                $pairs = [ ];
                foreach ($DATA->orders as $order) {
                    $id = $order['customerId'];
                    if (isset($usersByIds[$id])) {
                        foreach ($usersByIds[$id] as $user) {
                            $pairs[] = [
                                'order' => $order,
                                'user' => $user,
                            ];
                        }
                    }
                }
                return $pairs;
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->orders)
                    ->join($DATA->users,
                        function ($o) { return $o['customerId']; },
                        function ($u) { return $u['id']; },
                        function ($o, $u) {
                            return [
                                'order' => $o,
                                'user' => $u,
                            ];
                        });
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->orders)
                    ->join($DATA->users,
                        '$o ==> $o["customerId"]', '$u ==> $u["id"]',
                        '($o, $u) ==> [ "order" => $o, "user" => $u, ]');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->orders)
                    ->join($DATA->users,
                        function ($o) { return $o['customerId']; },
                        function ($u) { return $u['id']; },
                        function ($o, $u) {
                            return [
                                'order' => $o,
                                'user' => $u,
                            ];
                        });
            },
            "property path" => function () use ($DATA) {
                return G::from($DATA->orders)
                    ->join($DATA->users,
                        '[customerId]', '[id]',
                        function ($o, $u) {
                            return [
                                'order' => $o,
                                'user' => $u,
                            ];
                        });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->orders)
                    ->join($DATA->users)
                    ->onEquality(
                        function ($o) { return $o['customerId']; },
                        function ($u) { return $u['id']; }
                    )
                    ->to(function ($o, $u) {
                        return [
                            'order' => $o,
                            'user' => $u,
                        ];
                    });
            },
        ]);

Pinq stands out syntactically: what is essentially one function is split into several calls. This is probably more readable, but for those used to LINQ's method chaining, the syntax may feel less familiar.

And... the results:

  Joining arrays
 --------------
   PHP 0.00021 sec x1.0 (100%)
   YaLinqo 0.00065 sec x3.1 (+ 210%)
   YaLinqo [string lambda] 0.00070 sec x3.3 (+ 233%)
   Ginq 0.00103 sec x4.9 (+ 390%)
   Ginq [property path] 0.00200 sec x9.5 (+ 852%)
   Pinq 1.24155 sec x5,911.8 (+ 591084%)

No, this is not a mistake. Pinq really is almost six thousand times slower. At first I thought the script had hung, but it eventually finished and produced this unimaginable number. I could not find where in the Pinq source the code for this set of functions lives, but I suspect a for-for-if without any dictionary arrays. So much for OOP.
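The suspicion above is only a guess about Pinq's internals, but the general difference between the two join strategies is easy to sketch. Both functions and the sample data below are written for this illustration; hash_join mirrors the approach of the plain PHP benchmark code, while nested_loop_join is the hypothetical for-for-if variant.

```php
<?php
// Nested-loop join: compares every order with every user,
// so the work grows as count($orders) * count($users).
function nested_loop_join(array $orders, array $users)
{
    $pairs = [];
    foreach ($orders as $order) {
        foreach ($users as $user) {
            if ($order['customerId'] === $user['id']) {
                $pairs[] = ['order' => $order, 'user' => $user];
            }
        }
    }
    return $pairs;
}

// Hash join: index users by id once, then look each order up directly,
// so the work grows as count($orders) + count($users) plus the matches.
function hash_join(array $orders, array $users)
{
    $usersById = [];
    foreach ($users as $user) {
        $usersById[$user['id']][] = $user;
    }
    $pairs = [];
    foreach ($orders as $order) {
        $id = $order['customerId'];
        if (isset($usersById[$id])) {
            foreach ($usersById[$id] as $user) {
                $pairs[] = ['order' => $order, 'user' => $user];
            }
        }
    }
    return $pairs;
}

// Made-up sample data.
$users = [
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob'],
];
$orders = [
    ['id' => 10, 'customerId' => 2],
    ['id' => 11, 'customerId' => 1],
    ['id' => 12, 'customerId' => 3], // no matching user
    ['id' => 13, 'customerId' => 2],
];

$slow = nested_loop_join($orders, $users);
$fast = hash_join($orders, $users);

var_dump($slow === $fast); // bool(true)
echo count($fast), "\n";   // 3
```

The results are identical; only the amount of work differs, and the gap widens quickly as both collections grow.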

Consider another simple test - aggregation (or accumulation, or convolution - whatever you like):

    benchmark_linq_groups("Aggregating arrays", 100, null,
        [
            "for" => function () use ($DATA) {
                $sum = 0;
                foreach ($DATA->products as $p)
                    $sum += $p['quantity'];
                $avg = 0;
                foreach ($DATA->products as $p)
                    $avg += $p['quantity'];
                $avg /= count($DATA->products);
                $min = PHP_INT_MAX;
                foreach ($DATA->products as $p)
                    $min = min($min, $p['quantity']);
                $max = -PHP_INT_MAX;
                foreach ($DATA->products as $p)
                    $max = max($max, $p['quantity']);
                return "$sum-$avg-$min-$max";
            },
            "array functions" => function () use ($DATA) {
                $sum = array_sum(array_map(function ($p) { return $p['quantity']; }, $DATA->products));
                $avg = array_sum(array_map(function ($p) { return $p['quantity']; }, $DATA->products)) / count($DATA->products);
                $min = min(array_map(function ($p) { return $p['quantity']; }, $DATA->products));
                $max = max(array_map(function ($p) { return $p['quantity']; }, $DATA->products));
                return "$sum-$avg-$min-$max";
            },
        ],
        [
            function () use ($DATA) {
                $sum = E::from($DATA->products)->sum(function ($p) { return $p['quantity']; });
                $avg = E::from($DATA->products)->average(function ($p) { return $p['quantity']; });
                $min = E::from($DATA->products)->min(function ($p) { return $p['quantity']; });
                $max = E::from($DATA->products)->max(function ($p) { return $p['quantity']; });
                return "$sum-$avg-$min-$max";
            },
            "string lambda" => function () use ($DATA) {
                $sum = E::from($DATA->products)->sum('$v["quantity"]');
                $avg = E::from($DATA->products)->average('$v["quantity"]');
                $min = E::from($DATA->products)->min('$v["quantity"]');
                $max = E::from($DATA->products)->max('$v["quantity"]');
                return "$sum-$avg-$min-$max";
            },
        ],
        [
            function () use ($DATA) {
                $sum = G::from($DATA->products)->sum(function ($p) { return $p['quantity']; });
                $avg = G::from($DATA->products)->average(function ($p) { return $p['quantity']; });
                $min = G::from($DATA->products)->min(function ($p) { return $p['quantity']; });
                $max = G::from($DATA->products)->max(function ($p) { return $p['quantity']; });
                return "$sum-$avg-$min-$max";
            },
            "property path" => function () use ($DATA) {
                $sum = G::from($DATA->products)->sum('[quantity]');
                $avg = G::from($DATA->products)->average('[quantity]');
                $min = G::from($DATA->products)->min('[quantity]');
                $max = G::from($DATA->products)->max('[quantity]');
                return "$sum-$avg-$min-$max";
            },
        ],
        [
            function () use ($DATA) {
                $sum = P::from($DATA->products)->sum(function ($p) { return $p['quantity']; });
                $avg = P::from($DATA->products)->average(function ($p) { return $p['quantity']; });
                $min = P::from($DATA->products)->minimum(function ($p) { return $p['quantity']; });
                $max = P::from($DATA->products)->maximum(function ($p) { return $p['quantity']; });
                return "$sum-$avg-$min-$max";
            },
        ]);

    benchmark_linq_groups("Aggregating arrays custom", 100, null,
        [
            function () use ($DATA) {
                $mult = 1;
                foreach ($DATA->products as $p)
                    $mult *= $p['quantity'];
                return $mult;
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->products)->aggregate(function ($a, $p) { return $a * $p['quantity']; }, 1);
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->products)->aggregate('$a * $v["quantity"]', 1);
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->products)->aggregate(1, function ($a, $p) { return $a * $p['quantity']; });
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->products)
                    ->select(function ($p) { return $p['quantity']; })
                    ->aggregate(function ($a, $q) { return $a * $q; });
            },
        ]);

There is nothing much to explain in the first set of functions. The only thing to note is that I split the calculation into separate passes in all cases.

In the second set, a product is calculated. Pinq fell short again: it does not provide an overload that accepts a starting value and instead always takes the first element (and returns null if there are no elements rather than throwing an exception...), so you have to project the values with an extra select first.
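The practical difference between folding with and without a starting value can be shown with the standard array_reduce. The reduce_without_seed function below is a hypothetical stand-in written to mimic the behaviour described above (first element as the seed, null on empty input); it is not Pinq's code.

```php
<?php
$quantities = [2, 3, 4];
$multiply = function ($a, $q) {
    return $a * $q;
};

// With a seed: the accumulator starts at 1, and an empty input
// simply yields the seed back.
$withSeed = array_reduce($quantities, $multiply, 1);
$emptyWithSeed = array_reduce([], $multiply, 1);

// Hypothetical seedless fold: the first element becomes the starting
// accumulator, so the accumulator and the elements must share a type,
// and an empty input has nothing sensible to return.
function reduce_without_seed(array $values, $fn)
{
    if (count($values) === 0) {
        return null;
    }
    $acc = array_shift($values);
    foreach ($values as $value) {
        $acc = $fn($acc, $value);
    }
    return $acc;
}

$withoutSeed = reduce_without_seed($quantities, $multiply);
$emptyWithoutSeed = reduce_without_seed([], $multiply);

var_dump($withSeed, $emptyWithSeed, $withoutSeed, $emptyWithoutSeed);
// int(24), int(1), int(24), NULL
```

Because a seedless fold uses an element as the accumulator, the rows have to be reduced to bare quantities before folding, which explains the extra select step in the Pinq variant.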

Results:

  Aggregating arrays
 ------------------
   PHP [for] 0.00059 sec x1.0 (100%)
   PHP [array functions] 0.00193 sec x3.3 (+ 227%)
   YaLinqo 0.00475 sec x8.1 (+ 705%)
   YaLinqo [string lambda] 0.00515 sec x8.7 (+ 773%)
   Ginq 0.00669 sec x11.3 (+ 1034%)
   Ginq [property path] 0.03955 sec x67.0 (+ 6603%)
   Pinq 0.03226 sec x54.7 (+ 5368%)

 Aggregating arrays custom
 -------------------------
   PHP 0.00007 sec x1.0 (100%)
   YaLinqo 0.00046 sec x6.6 (+ 557%)
   YaLinqo [string lambda] 0.00057 sec x8.1 (+ 714%)
   Ginq 0.00046 sec x6.6 (+ 557%)
   Pinq 0.00610 sec x87.1 (+ 8615%)

Pinq and property paths in Ginq showed awful results, YaLinqo was disappointing, and the built-in functions were no less disappointing.

And for dessert, the example from the YaLinqo ReadMe: a query with all the functions combined:

    benchmark_linq_groups("Process data from ReadMe example", 5,
        function ($e) { consume($e, [ 'products' => null ]); },
        [
            function () use ($DATA) {
                $productsSorted = [ ];
                foreach ($DATA->products as $product) {
                    if ($product['quantity'] > 0) {
                        if (empty($productsSorted[$product['catId']]))
                            $productsSorted[$product['catId']] = [ ];
                        $productsSorted[$product['catId']][] = $product;
                    }
                }
                foreach ($productsSorted as $catId => $products) {
                    usort($productsSorted[$catId], function ($a, $b) {
                        $diff = $a['quantity'] - $b['quantity'];
                        if ($diff != 0)
                            return -$diff;
                        $diff = strcmp($a['name'], $b['name']);
                        return $diff;
                    });
                }
                $result = [ ];
                $categoriesSorted = $DATA->categories;
                usort($categoriesSorted, function ($a, $b) {
                    return strcmp($a['name'], $b['name']);
                });
                foreach ($categoriesSorted as $category) {
                    $categoryId = $category['id'];
                    $result[$category['id']] = [
                        'name' => $category['name'],
                        'products' => isset($productsSorted[$categoryId]) ? $productsSorted[$categoryId] : [ ],
                    ];
                }
                return $result;
            },
        ],
        [
            function () use ($DATA) {
                return E::from($DATA->categories)
                    ->orderBy(function ($cat) { return $cat['name']; })
                    ->groupJoin(
                        from($DATA->products)
                            ->where(function ($prod) { return $prod['quantity'] > 0; })
                            ->orderByDescending(function ($prod) { return $prod['quantity']; })
                            ->thenBy(function ($prod) { return $prod['name']; }),
                        function ($cat) { return $cat['id']; },
                        function ($prod) { return $prod['catId']; },
                        function ($cat, $prods) {
                            return array(
                                'name' => $cat['name'],
                                'products' => $prods
                            );
                        }
                    );
            },
            "string lambda" => function () use ($DATA) {
                return E::from($DATA->categories)
                    ->orderBy('$cat ==> $cat["name"]')
                    ->groupJoin(
                        from($DATA->products)
                            ->where('$prod ==> $prod["quantity"] > 0')
                            ->orderByDescending('$prod ==> $prod["quantity"]')
                            ->thenBy('$prod ==> $prod["name"]'),
                        '$cat ==> $cat["id"]', '$prod ==> $prod["catId"]',
                        '($cat, $prods) ==> [ "name" => $cat["name"], "products" => $prods ]');
            },
        ],
        [
            function () use ($DATA) {
                return G::from($DATA->categories)
                    ->orderBy(function ($cat) { return $cat['name']; })
                    ->groupJoin(
                        G::from($DATA->products)
                            ->where(function ($prod) { return $prod['quantity'] > 0; })
                            ->orderByDesc(function ($prod) { return $prod['quantity']; })
                            ->thenBy(function ($prod) { return $prod['name']; }),
                        function ($cat) { return $cat['id']; },
                        function ($prod) { return $prod['catId']; },
                        function ($cat, $prods) {
                            return array(
                                'name' => $cat['name'],
                                'products' => $prods
                            );
                        }
                    );
            },
        ],
        [
            function () use ($DATA) {
                return P::from($DATA->categories)
                    ->orderByAscending(function ($cat) { return $cat['name']; })
                    ->groupJoin(
                        P::from($DATA->products)
                            ->where(function ($prod) { return $prod['quantity'] > 0; })
                            ->orderByDescending(function ($prod) { return $prod['quantity']; })
                            ->thenByAscending(function ($prod) { return $prod['name']; })
                    )
                    ->onEquality(
                        function ($cat) { return $cat['id']; },
                        function ($prod) { return $prod['catId']; }
                    )
                    ->to(function ($cat, $prods) {
                        return array(
                            'name' => $cat['name'],
                            'products' => $prods
                        );
                    });
            },
        ]);

The plain PHP code was written by joint effort here on Habr.

Results:

 Process data from ReadMe example
 --------------------------------
  PHP 0.00620 sec x1.0 (100%)
  YaLinqo 0.02840 sec x4.6 (+ 358%)
  YaLinqo [string lambda] 0.02920 sec x4.7 (+ 371%)
  Ginq 0.07720 sec x12.5 (+ 1145%)
  Pinq 2.71616 sec x438.1 (+ 43707%)

GroupJoin killed Pinq's performance. The rest showed more or less expected results.

More about libraries

Since Pinq is the only library presented that can generate SQL queries from PHP, the article would be incomplete without looking at this capability. Unfortunately, as it turned out, the only provider is for MySQL, and it exists only as a "demonstration". In effect, the feature is declared and could be implemented on top of Pinq, but in practice it cannot be used.

Conclusions

If you need to quickly filter a hundred or two results obtained from a web service, LINQ libraries are quite up to the task.

Among libraries, the undisputed performance winner is YaLinqo. If you need to filter objects using queries, then this is the most logical choice.

Ginq may appeal to those who prefer nested iterators to method chains. I do not know whether such SPL iterator lovers exist.

Pinq turned out to be a monstrous library in which some features are implemented horribly despite the many layers of abstraction. The library has potential thanks to its support for database queries, but for now that potential remains unrealized.

, — PHPLinq. , ORM .

Links

YaLinqo — YaLinqo
YaLinqo Docs — YaLinqo
YaLinqo Perf — YaLinqo, Ginq, Pinq
Ginq — Ginq
Pinq — Pinq

Source: https://habr.com/ru/post/259155/
