
Generators in action

Small introduction


Not long ago I decided it was time to close a large gap in my knowledge and read up on what had changed between PHP versions, because I realized I was stuck somewhere between 5.2 and 5.3. I had already read about namespaces, traits and so on, but never went beyond reading. This time I noticed generators, read the documentation and one of the articles on the topic, and then wondered: how did we ever live without them?

With this translation I hope to help at least beginners, because the documentation on generators at php.net is in English and, in my opinion, does not fully convey the idea or where generators are useful. There is a lot of text, somewhat less code, and no pictures. Some general knowledge is assumed, for example about iterators. I will not comment on obvious code, but I will try to explain the harder examples as far as my knowledge allows.

UPD1 : Changed vague wording, which was discussed in the comments.
UPD2 : Added a solution with a forced break.

Theory


I'll say the main thing right away: generators do not let you do anything that could not already be done before PHP 5.5, when they did not exist. They are simply a new feature that somewhat changes the usual behavior of the language. Wherever a generator is used, an iterator could be used instead. With that in mind, let's look at an example. Say we need to go through the lines of a file. In procedural style it could look something like this:

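$f = fopen($file, 'r');
// Compare with false explicitly so that a final line containing only "0" does not end the loop early
while (false !== ($line = fgets($f))) {
    doSomethingWithLine($line);
}
fclose($f);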


This is the usual solution, nothing strange here. But what if we need something more abstract? Say, producing lines from an abstract source. Today it may be a file, but tomorrow we may decide that a database or something else is a better fit.

Now we have two ways to solve this: return an array or return an iterator. Returning an array has several problems. First, we do not know how much memory we will need (what if the file is 30 GB?). Second, we may not be able to describe the source as an array at all (for example, it may deliver chunks of data indefinitely, leaving the client to guess when the stream will end).

That leaves iterators. Our example is easy to express as an iterator. Moreover, PHP already ships a ready-made class for this, SplFileObject. But let's set it aside and write something of our own.

class FileIterator implements Iterator {
    protected $f;

    public function __construct($file) {
        $this->f = fopen($file, 'r');
        if (!$this->f) throw new Exception();
    }

    public function current() {
        return fgets($this->f);
    }

    public function key() {
        return ftell($this->f);
    }

    public function next() {
    }

    public function rewind() {
        fseek($this->f, 0);
    }

    public function valid() {
        return !feof($this->f);
    }
}


Quite easy, right? Well, not really, but it is a start. If we look closer, though, we will see that the iterator is not quite right: calling current() twice in a row does not return the same value, as it should.
I (the author of the article, not the translator) did this on purpose to show that replacing a procedure with an iterator is not always easy, because real situations are much more complicated. Let's write a correct iterator for our file.

class FileIterator implements Iterator {
    protected $f;
    protected $data;
    protected $key;

    public function __construct($file) {
        $this->f = fopen($file, 'r');
        if (!$this->f) throw new Exception();
    }

    public function __destruct() {
        fclose($this->f);
    }

    public function current() {
        return $this->data;
    }

    public function key() {
        return $this->key;
    }

    public function next() {
        $this->data = fgets($this->f);
        $this->key++;
    }

    public function rewind() {
        fseek($this->f, 0);
        $this->data = fgets($this->f);
        $this->key = 0;
    }

    public function valid() {
        return false !== $this->data;
    }
}


Goodness, that is a lot of code for a seemingly simple task like traversing a file, and the main work is still hidden inside the file functions. Now imagine what it would take to implement a more complex algorithm. With this approach the code only gets harder to write and harder to understand. Let's solve the problem with a generator instead.

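function getLines($file) {
    $f = fopen($file, 'r');
    if (!$f) throw new Exception();
    // Compare with false explicitly so that a final line containing only "0" is not skipped
    while (false !== ($line = fgets($f))) {
        yield $line;
    }
    fclose($f);
}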


Much easier! It is almost the same as the first procedural example, only with an exception and the yield keyword added.

So how does it work?


It is very important to understand that in the example above the return value of the function changes. It is not null, as it may seem at first glance. The presence of yield tells PHP to return an instance of a special class, Generator. A generator behaves like an iterator because it implements the Iterator interface, so you can use it anywhere you would use an iterator.

foreach (getLines("someFile") as $line) {
    doSomethingWithLine($line);
}
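What foreach does with a generator can also be spelled out by hand through the Iterator methods. The sketch below is only an illustration: tracedLetters() and its $log parameter are hypothetical names, introduced to make the order of execution visible.

```php
<?php
// Hypothetical generator: records each step of its body in $log.
function tracedLetters(array &$log) {
    $log[] = 'body started';
    yield 'a';
    $log[] = 'resumed after a';
    yield 'b';
    $log[] = 'body finished';
}

$log = [];
$gen = tracedLetters($log);               // returns a Generator object
$isGenerator = $gen instanceof Generator;
$isIterator  = $gen instanceof Iterator;
$logAfterCall = $log;                     // nothing in the body has run yet

$first = $gen->current();                 // runs the body up to the first yield
$gen->next();                             // resumes right after the first yield
$second = $gen->current();
$gen->next();                             // runs to the end of the function
$finished = !$gen->valid();
```

The same current(), next() and valid() calls are what foreach issues behind the scenes, which is why a generator can be passed to any code that expects an iterator.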


The whole point is that we can write the code any way we like and simply yield a new value each time one is needed. (The translator was unsure how best to render "yield" in Russian, by analogy with "throwing" exceptions.) So how does it work? Calling getLines() does not run the function body; PHP immediately returns a generator object. When the first value is requested (foreach does this for us through the iterator methods), PHP executes the code up to the first yield, remembers the yielded value and pauses. On each call to the generator's next() method, PHP resumes the code not from the beginning but from the point right after the last yield, and runs until the next yield, the end of the function, or a return. Knowing this, we can write a useful generator:

function doStuff() {
    $last = 0;
    $current = 1;
    yield 1;
    while (true) {
        $current = $last + $current;
        $last = $current - $last;
        yield $current;
    }
}


At first glance it may not be clear what this does, and surely an endless loop will ruin everything? Yes, the function does contain an infinite loop, but each iteration pauses at yield, so it only runs as far as the consumer asks. Look closer: these are the Fibonacci numbers.

Note that generators are not a replacement for iterators. They are just an easy way to create them. Iterators are still a powerful tool.
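Because the loop never ends on its own, the consumer decides when to stop. Here is a minimal sketch of that; the names fibonacci() (the same body as the generator above) and take() are hypothetical and not part of the original article.

```php
<?php
// Same body as the generator above, under a hypothetical descriptive name.
function fibonacci() {
    $last = 0;
    $current = 1;
    yield 1;
    while (true) {
        $current = $last + $current;
        $last = $current - $last;
        yield $current;
    }
}

// Hypothetical helper: collect at most $count values from any iterator.
function take(Iterator $source, $count) {
    $result = [];
    foreach ($source as $value) {
        if (count($result) >= $count) {
            break;              // stop consuming; the generator simply stays paused
        }
        $result[] = $value;
    }
    return $result;
}

$firstEight = take(fibonacci(), 8);   // 1, 1, 2, 3, 5, 8, 13, 21
```

The generator only ever computes the values that were actually requested, so the infinite loop costs nothing once the consumer breaks out.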

Difficult example


Suppose we need our own ArrayObject. Instead of writing an iterator, let's do a little trick with a generator. The IteratorAggregate interface requires only one method from us, getIterator(). Since a generator function returns an object that implements Iterator, we can implement this method as a generator. It's simple:

// Note: declare this class inside a namespace, because the global name ArrayObject is already taken by SPL
class ArrayObject implements IteratorAggregate {
    protected $array;

    public function __construct(array $array) {
        $this->array = $array;
    }

    public function getIterator() {
        foreach ($this->array as $key => $value) {
            yield $key => $value;
        }
    }
}


Exactly! Now we can iterate over all the elements of our array with an ordinary foreach, receiving both keys and values from the generator.
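To see the trick in use, here is a small sketch. The class name Collection is hypothetical (chosen so that it does not clash with SPL's built-in ArrayObject), and the Traversable return type assumes PHP 7 or later.

```php
<?php
// Hypothetical class: same idea as above, under a name that does not collide with SPL.
class Collection implements IteratorAggregate {
    protected $array;

    public function __construct(array $array) {
        $this->array = $array;
    }

    // Every call creates a fresh generator, so the object can be traversed repeatedly.
    public function getIterator(): Traversable {
        foreach ($this->array as $key => $value) {
            yield $key => $value;
        }
    }
}

$collection = new Collection(['a' => 1, 'b' => 2]);

$seen = [];
foreach ($collection as $key => $value) {
    $seen[] = "$key=$value";
}

// A second traversal works because getIterator() is called again.
$secondPass = iterator_to_array($collection);
```

Unlike a bare generator, which can be traversed only once, the aggregate hands out a new generator for every foreach.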

Send data back


Generators also let you send data into them with the send() method. In some cases this is very convenient, for example when you need a simple log writer. Instead of writing a whole class for it, you can use a generator:

function createLog($file) {
    $f = fopen($file, 'a');
    while (true) { # yes, an infinite loop again
        $line = yield; # wait for the next send() and store its argument in $line
        fwrite($f, $line);
    }
}

$log = createLog($file);
$log->send("First");
$log->send("Second");
$log->send("Third");


Quick and easy. To complicate the task a bit, let's look at an example where functions cooperate, passing control to each other through generators. We need to build a queue that receives and sends data packets. Such tasks come up when we read a binary stream and need to control the packet size.

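function fetchBytesFromFile($file) {
    # The first send() only delivers the requested length; nothing has been read yet
    $length = yield;
    $f = fopen($file, 'r');
    while (!feof($f)) {
        # Hand a chunk to the caller and receive the size of the next chunk
        $length = yield fread($f, $length);
    }
    fclose($f);
    yield false; # signal the end of the data
}

function processBytesInBatch(Generator $byteGenerator) {
    # Assumed record layout: a 4-byte big-endian length ('N') followed by the payload
    $buffer = '';
    $bytesNeeded = 1000;
    while (is_string($chunk = $byteGenerator->send($bytesNeeded))) {
        $buffer .= $chunk;
        # Emit every complete record that is already in the buffer
        while (strlen($buffer) >= 4) {
            $lengthOfRecord = unpack('N', $buffer)[1];
            if (strlen($buffer) < 4 + $lengthOfRecord) {
                break;
            }
            yield substr($buffer, 4, $lengthOfRecord);
            $buffer = substr($buffer, 4 + $lengthOfRecord);
        }
        # Ask for the rest of a partial record, otherwise for a regular chunk
        $bytesNeeded = strlen($buffer) >= 4
            ? 4 + $lengthOfRecord - strlen($buffer)
            : 1000;
    }
}

$gen = processBytesInBatch(fetchBytesFromFile($file));
foreach ($gen as $record) {
    doSomethingWithRecord($record);
}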


It is a bit complicated, but I hope the idea is clear: we have separated fetching chunks of a given size from processing them, each happens at the right moment, and both parts remain reusable.
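The send() mechanics that both examples rely on are easier to follow on a tiny in-memory case. The following is a sketch only; runningTotal() is a hypothetical generator, not something from the original article.

```php
<?php
// Hypothetical generator: keeps a running total of everything sent to it.
function runningTotal() {
    $total = 0;
    while (true) {
        // yield hands $total to the caller and pauses;
        // when resumed by send(), the yield expression evaluates to the sent value.
        $received = yield $total;
        $total += $received;
    }
}

$gen = runningTotal();
$initial    = $gen->current();  // runs to the first yield, which produces 0
$afterFive  = $gen->send(5);    // resumes with 5; returns the next yielded total
$afterSeven = $gen->send(7);

// If send() is the very first call, PHP first runs the generator to its first
// yield (discarding that value) and only then delivers the argument.
$fresh = runningTotal();
$skippedInitial = $fresh->send(3);
```

So send() both passes a value in and returns the value of the next yield, which is how the caller can tell the byte reader how many bytes it wants while receiving the previous chunk.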

Need more examples!


In general, generators are useful in many tasks. One of them is simulating threads. First, we define each thread as a generator. Each one then yields a control signal to its parent so that the parent can tell the next thread to do some work. Let's build such a system that works with different data sources (using non-blocking I/O). Here is an example:

function step1() {
    $f = fopen("file.txt", 'r');
    while (false !== ($line = fgets($f))) {
        processLine($line);
        yield true;
    }
}

function step2() {
    $f = fopen("file2.txt", 'r');
    while (false !== ($line = fgets($f))) {
        processLine($line);
        yield true;
    }
}

function step3() {
    $f = fsockopen("www.example.com", 80);
    stream_set_blocking($f, false);
    $headers = "GET / HTTP/1.1\r\n";
    $headers .= "Host: www.example.com\r\n";
    $headers .= "Connection: Close\r\n\r\n";
    fwrite($f, $headers);
    $body = '';
    while (!feof($f)) {
        $body .= fread($f, 8192);
        yield true;
    }
    processBody($body);
}

// Each of the 3 steps yields true to signal that it is ready to hand control to the next one

function runner(array $steps) {
    while (true) {
        # Keep cycling until every step has finished
        foreach ($steps as $key => $step) {
            $step->next(); # resume the step from its last yield
            if (!$step->valid()) {
                # the generator has finished, remove it from the queue
                unset($steps[$key]);
            }
        }
        if (empty($steps)) return; # nothing left to run
    }
}

runner(array(step1(), step2(), step3()));
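The interleaving is easier to observe without files and sockets. Below is a minimal in-memory sketch of the same round-robin idea; task() and runRoundRobin() are hypothetical names. One detail it makes explicit: calling next() on a generator that has not started yet first runs it to its first yield and then resumes it, so the sketch uses rewind() for a task's first turn to perform exactly one step per pass.

```php
<?php
// Hypothetical task: records each unit of work in a shared log, then yields control.
function task($name, $ticks, array &$log) {
    for ($i = 1; $i <= $ticks; $i++) {
        $log[] = $name . $i;
        yield;
    }
}

// Round-robin scheduler: one unit of work per task per pass.
function runRoundRobin(array $tasks) {
    $started = [];
    while ($tasks) {
        foreach ($tasks as $key => $task) {
            if (isset($started[$key])) {
                $task->next();        // resume after the last yield
            } else {
                $task->rewind();      // first turn: run up to the first yield only
                $started[$key] = true;
            }
            if (!$task->valid()) {
                unset($tasks[$key]);  // finished, drop it from the queue
            }
        }
    }
}

$log = [];
runRoundRobin([task('A', 2, $log), task('B', 3, $log)]);
// $log now shows the two tasks taking turns: A1, B1, A2, B2, B3
```

Each task runs only between two of its own yields, so a slow source never blocks the others for longer than one step.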


Conclusion


Generators are a VERY powerful tool. They let you simplify code considerably. Just think: a function that produces a range of numbers fits in one line of code:

 function xrange($min, $max) { for ($i = $min; $i < $max; $i++) yield $i; } 


Short and simple. It is easy to read, easy to understand, and performs well: faster than the equivalent iterator.
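A short usage sketch for xrange() follows. Note that, as written, the upper bound is exclusive, unlike PHP's built-in range(), which includes it.

```php
<?php
function xrange($min, $max) {
    for ($i = $min; $i < $max; $i++) yield $i;
}

// Materialize a small range to inspect it: $max itself is not included.
$values = iterator_to_array(xrange(1, 5));

// Walk a large range without ever building a million-element array.
$sum = 0;
foreach (xrange(0, 1000000) as $n) {
    $sum += $n;
}
```

Only one number exists in memory at a time here, whereas range(0, 999999) would allocate the whole array up front.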

Original article - Anthony Ferrara @ blog.ircmaxell.com

A frequent question in the comments was what happens when iteration over the generator with foreach is cut short, for example by break. If we are reading a file, as in the first example, there is a risk that fclose() is never called, because the generator never reaches it. One of the cleanest solutions was suggested by weirdan: wrap the loop in try {...} finally {...} and release open resources in the finally block. The finally block runs even when iteration is abandoned early, with one nuance: the code after the finally block is executed only if the generator ran to completion (without break).
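A minimal sketch of that suggestion, assuming a temporary file as the data source. The function getLinesSafely() and its $events log are hypothetical and exist only to show which parts of the generator run in each case.

```php
<?php
// Hypothetical variant of getLines() that releases the file in a finally block.
function getLinesSafely($file, array &$events) {
    $f = fopen($file, 'r');
    if (!$f) throw new Exception("Cannot open $file");
    try {
        while (false !== ($line = fgets($f))) {
            yield $line;
        }
    } finally {
        fclose($f);                 // runs on completion and on early abandonment
        $events[] = 'closed';
    }
    $events[] = 'completed';        // reached only when iteration ran to the end
}

// Temporary sample file, created just for this demonstration.
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "one\ntwo\nthree\n");

$events = [];
foreach (getLinesSafely($path, $events) as $line) {
    break;                          // abandon the generator after the first line
}
$eventsAfterBreak = $events;        // the generator was destroyed, finally has run

$events = [];
foreach (getLinesSafely($path, $events) as $line) {
    // consume everything
}
$eventsAfterFullPass = $events;

unlink($path);
```

After break, the generator object is destroyed when the loop ends, and PHP executes the pending finally block at that point; the statement after the block is skipped because the function body never continues.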

Briefly about generators


- They do not add new functionality to the language
- They are faster *
- A generator resumes from the last yield it paused at
- Values can be sent into a generator with send(), and exceptions with throw()
- Generators are forward-only, i.e. they cannot be rewound
- In most cases they mean less code and easier-to-understand constructs

* Based on these results.
On large data sets generators are faster: roughly 4 times faster than iterators and 40% faster than plain iteration. With a small number of elements they can be slower than plain iteration, but are still faster than iterators.
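Two of the points from the list, throw() and the forward-only nature of generators, can be illustrated with a short sketch. The generator counter() is a hypothetical example, not code from the original article.

```php
<?php
// Hypothetical generator: counts upward until an exception is thrown into it.
function counter() {
    $i = 0;
    while (true) {
        try {
            yield $i++;
        } catch (Exception $e) {
            // throw() raises the exception at the yield where the generator is paused
            yield 'caught: ' . $e->getMessage();
            return;
        }
    }
}

$gen = counter();
$first = $gen->current();
$gen->next();
$second = $gen->current();

// throw() resumes the generator with an exception and returns the next yielded value.
$reply = $gen->throw(new Exception('stop'));

// A generator that has moved past its first yield cannot be rewound.
$rewindFailed = false;
try {
    $gen->rewind();
} catch (Exception $e) {
    $rewindFailed = true;
}
```

If the data has to be traversed again, the usual approach is to call the generator function once more and work with the fresh generator.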

If the community approves of the translation and finds it good (and, most importantly, finds that it contains no nonsense and does not distort the code), I would be interested in translating other articles from time to time.
I think it would also be worthwhile to translate and collect the articles on data structures currently being published on phpmaster.
I will also be glad to receive any comments, pointers, and reports of errors in the text, the code, or the translation itself.

PS In the process of translation I lost track of how the buffer works in one of the examples and, so as not to confuse anyone, decided to refrain from vague comments on that code. I would be glad if someone confirmed my guess, and then I will add a comment.

Source: https://habr.com/ru/post/189796/

