Generators are cool. They make it easy to write iterators by defining functions instead of creating entire classes that implement Iterator
. Generators also help create lazy lists and endless streams. The main difference between the generator function and the usual function is that the normal one can return only once (after that its execution stops), and the generator function can produce several values during execution. In this case, in the intervals between returns, the execution of the generator is paused until the next start. Therefore, generators can be used to create lists with lazily generated values, that is, each element in the list is calculated only at the time of demand.
A vivid example of the difference between early and lazy generation is the range function, which takes the start
and end
parameters and then returns a sequence of integer values that begins with start
and ends one element before end
. In the case of a regular function, you will have to create a new list, add all the elements to it, and then return the list. With this approach, range
consumes a memory size proportional to the size of the range. And depending on your environment range(1, 10000000)
can tidy up all the memory allocated by the PHP process. And this happens because of the early creation of the entire list of elements, even before the caller returns.
Using the generator function, you can create a lazy range
that uses a constant amount of memory. This is achieved using a while loop, which passes the value of the initial parameter and then increments the value of the initial argument. When the cycle ends, the function reaches the end of its body and returns a value, thereby completing the operation of the generator. That is, it is enough for the generator function to keep track of which sequence value should be returned, and not to store all values in memory. If we continue the example, then we can create an infinite range generator function that takes only the initial argument. In this case, we will have a while loop with a predicate that is always true, so the loop never ends. This allows the calling function itself to decide how many values to read from the generator. And if the infinite generator is called 100 times, then it will generate only 100 values. If it is no longer called before the calling function is completed, the generator will pause, and eventually the garbage collector will clean it up. In other words, if you use foreach
iterate a generator that creates an infinite stream of values, the generator will iterate endlessly. The PHP documentation describes well the work of generators and an example with the range
function.
But a more practical illustration: the program needs to get 200,000 objects from the external API, extract some subset of data from each document, and then place each modified object in the repository. If everything is done immediately, you will have to first create a list of 200,000 objects, then delete the elements from the list and then modify each element before adding it to the repository. If for these stages to use tools like array_filter
and array_map
, then at each of the listed operations an intermediate list is created. By combining generators, you can create a pipeline, each stage of which is calculated in a lazy way. For example, you can use a generator that is lazy and individually picks up objects from the API. Such a generator can be used by another generator, which only creates objects that pass a certain test, and already this second generator will be used by the third generator, which creates a modified version of the object received from the API.
Since generator functions return objects that implement Iterator
, they can be used in the same place as iterators, as is the case with foreach
. Unfortunately, generators cannot be inserted anywhere an array can be used. This is true, for example, for array_map
, array_reduce
and array_filter
. It’s a pity, because I prefer these functions instead of more imperative for
and foreach
cycles. In iFixit , we use several Iterator
based alternatives instead of arrays that provide the same functionality as array_map
, array_filter
with array_reduce
.
Note: all of these examples need PHP 7.1 functionality. In previous versions of the language, the code will not work.
<?php declare(strict_types = 1); /** * Like array_map() but over an iterable, and it returns a new iterable with * mapping instead of a mapped array. The callable should take two arguments: * a value to map and its key in the stream. */ function iterator_map(callable $cb, iterable $itr): iterable { foreach ($itr as $key => $value) { yield $cb($value, $key); } }
Map converts, or “map” one list of values into another, applying a transforming function to each value of the input list and adding the result to the output list. The basis of iterator_map
is the same idea, only the function as the second argument contains the result of calling the converter function for each element of the list to be iterated.
<?php // assuming range_generator is a range function that returns a // generator and not a list. $bigList = range_generator(1, 100000000); $bigListPlusOne = iterator_map(function($num) { return $num + 1; }, $bigList); // At this point no work has been done. // $bigListPlusOne is a generator that will lazily produce // all numbers from 2 to 100000000 // (100000000 because the end of the range is exclusive)
<?php declare(strict_types = 1); /** * Like array_filter() but over an iterable, and it returns a new iterable with * filtering instead of a filtered array. The callable, if non-null, should * take arguments like iterator_map(). When the callable is null, null values * will be filtered out (NOT falsey values, just x === null). */ function iterator_filter(?callable $cb, iterable $itr): iterable { $cb = $cb ?: function($x) { return $x !== null; }; foreach ($itr as $key => $value) { $keep = $cb($value, $key); if ($keep) { yield $value; } } }
Filter takes a list and creates a new one, which includes those elements from the input list, which, when passed to the filter or predicate function, generate a “true” (truthy) value. At the core of iterator_filter
is the same idea, only the function returns only those values that generate true values when passed to the predicate. If the value does not generate a “true” value, the generator does not produce this element and goes to the next of the input iterable
until some element generates a “true” value for the predicate. Then the filter returns the value and pauses until it is called again.
<?php declare(strict_types = 1); /** * Like array_reduce() but over an iterable, and it returns a single value as * the result of calling $cb over the contents of iterable $itr. * * If $initial is not null, $initial is set to the value of the first element * of the iterable, and $cb is called with the first element as the carry value * and the second element of the array as the current value. */ function iterator_reduce(callable $cb, iterable $itr, $initial = null) { if (is_null($initial)) { $initial = $itr->current(); $itr->next(); } $carry = $initial; while ($itr->valid()) { $carry = $cb($carry, $itr->current()); $itr->next(); } return $carry; }
Reduce takes a list and an optional initial value, and returns a value that is the result of applying a reducer function to each element. The convolutional function has a parameter carry
, which contains the result of calling this function as applied to the previous element, and also has the parameter current
, which is the current element in the iterated sequence. In some languages, this operation is called folding.
iterator_reduce
slightly different from the previous two functions, since it does not have an iterable
return type iterable
. The function can create single values - numbers, boolean or lowercase. All this is useful when you need to get a list or a generator and extract from it grouped or aggregated values, such as the sum of the price
properties in the list of Product
objects.
Now let's take everything discussed together in one small example. We will retrieve data from an online storage service (let's call it Storeify). The program's task: to extract all orders of the previous day and calculate the total daily income from the sale.
In our hypothetical world, there can be from 100 to 1,000,000,000 orders per day, so we cannot just get them all from the API without being charged a huge server account that can simultaneously store orders in memory. Let's create a generator for lazy retrieving orders from the Storeify API as needed.
With the help of map
, filter
and reduce
we divide the problem into tasks in order to make it easier to understand the program and accompany it. Since we only need positions or products in each order, we use iterator_map
to return positions, as well as the flatten
function to turn the list of position lists into a single list. After that, select the iterator_filter
to filter out those positions that are not products that we analyze. Next, we take the stream of filtered products and with the help of iterator_reduce
turn their price
and quantity
fields into the total income for this product for the previous day.
<?php declare(strict_types = 1); /** * Example code for consuming data from a Store API (called Storeify) and chaining generators * together to filter, map, and eventually reduce all Order data into a * daily revenue total. */ function getOrdersFromLastDay(): iterable { $limit = 20; $requestParams = [ 'lastUpdatedAfter' => new DateTime('yesterday'), 'limit' => $limit, 'offset' => 0 ]; $orders = []; do { // Grab a batch of orders from Storeify and yield from that list. $orders = Storeify::getOrders($requestParams); yield from $orders; $requestParams['offset'] += $limit; } while (!empty($orders)); } /** * Consumes a generator that produces lists of products * and produces a new generator that yields a flat list * of products. */ function flatten(iterable $itr): iterable { foreach ($itr as $products) { yield from $products; } } /** * Assume orders have a Shipping Country and a list of Products * [ * 'ShipCountry' => 'US', * 'Products' => [ * [ 'name' => 'Pro Tech Toolkit', 'price' => 59.95, 'quantity' => 2], * [ 'name' => 'iPhone 6 Battery', 'price' => 24.99, 'quantity' => 1] * ] * ] */ $orders = getOrdersFromLastDay(); // Flatten list of orders into list of all products sold. $allProducts = flatten(iterator_map(function($order) { return $order['Products']; }, $orders)); // Only include 'Pro Tech Toolkit' purchases. $toolkitProducts = iterator_filter(function($product) { return $product['name'] === 'Pro Tech Toolkit'; }, $allProducts); // Up until this point, no work has actually been done. // $toolkitProducts is a generator can be passed around to other functions // as a lazy stream of pro tech toolkit products. // Once iterator_reduce is called, it begins winding its way through // the composed generators and actually pulling down resources from the Store API // and mapping and filtering them. $dailyToolkitRevenue = iterator_reduce(function($total, $toolkit) { return $total + ($toolkit['price'] * $toolkit['quantity']); }, $toolkitProducts, 0);
Perhaps the use of generators in functional programming can be called doping. Unfortunately, PHP is not an ideal language, but, fortunately, we have all the tools for combining generators with functional concepts like map
, filter
and reduce
.
A commentator from the r / programming branch mentioned the iter
library written by a person who implemented PHP generators. This library implements all the examples from this article and much more, so I highly recommend to feel it if you plan to use generators in your code base.
Source: https://habr.com/ru/post/351818/
All Articles