
How yield works

On StackOverflow, people often ask questions that are already covered in detail in the documentation. Their value is that, for some of them, someone gives an answer that is far clearer and more illustrative than the documentation can afford to be. This is one of them.

Here is the original question:
How is the yield keyword used in Python? What does it do?

For example, I am trying to understand this code (**):

    def _get_child_candidates(self, distance, min_dist, max_dist):
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild

It is called like this:

    result, candidates = list(), [self]
    while candidates:
        node = candidates.pop()
        distance = node._get_dist(obj)
        if distance <= max_dist and distance >= min_dist:
            result.extend(node._values)
        candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
    return result

What happens when the _get_child_candidates method is called? Is a list returned? A single element? Is it called again? When do subsequent calls stop?

** The code belongs to Jochen Schulz (jrschulz), who wrote an excellent Python library for metric spaces. Here is the link to the source: http://well-adjusted.de/~jrschulz/mspace/


And here is the answer:

Iterables


To understand what yield does, you need to understand what generators are. And before generators come iterables. When you create a list, you can read its items one by one; this is called iteration:

    >>> mylist = [1, 2, 3]
    >>> for i in mylist:
    ...     print(i)
    1
    2
    3

mylist is an iterable. When you create a list using a list comprehension, you also create an iterable:

    >>> mylist = [x*x for x in range(3)]
    >>> for i in mylist:
    ...     print(i)
    0
    1
    4

Anything you can use "for ... in ..." on is an iterable: lists, strings, files ... Iterables are convenient because you can read their values as many times as you need, but all the values are stored in memory, which is not always desirable when there are a lot of them.

Generators


Generators are iterables too, but they can be read only once. This is because they do not store all their values in memory but generate them on the fly:

    >>> mygenerator = (x*x for x in range(3))
    >>> for i in mygenerator:
    ...     print(i)
    0
    1
    4

Everything is the same, except that parentheses are used instead of square brackets. BUT: you cannot run for i in mygenerator a second time, because a generator can be used only once: it computes 0, then forgets about it and computes 1, and finishes by computing 4, one value after another.
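The single-use behavior can be checked with a short sketch. It reuses the generator expression from above; the names first_pass and second_pass are only illustrative.

```python
# A generator expression produces its values lazily, one at a time.
mygenerator = (x * x for x in range(3))

# The first pass consumes every value.
first_pass = [i for i in mygenerator]

# The second pass finds the generator already exhausted,
# so there is nothing left to read.
second_pass = [i for i in mygenerator]

print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
```

Python raises no error on the second pass. The loop simply has nothing to iterate over, which makes this mistake easy to miss.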

Yield


yield is a keyword that is used roughly like return, except that the function will return a generator.

    >>> def createGenerator():
    ...     mylist = range(3)
    ...     for i in mylist:
    ...         yield i*i
    ...
    >>> mygenerator = createGenerator()  # create a generator
    >>> print(mygenerator)  # mygenerator is an object!
    <generator object createGenerator at 0xb7555c34>
    >>> for i in mygenerator:
    ...     print(i)
    0
    1
    4

This example is not useful in itself, but the technique is handy when you know that the function will produce a large set of values that only need to be read once.
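To get a rough feel for the memory difference, the sketch below compares a list with a generator that produces the same values. The helper name squares and the figure of 100000 items are assumptions chosen for illustration, and sys.getsizeof measures only the container object itself, so the numbers are indicative at best.

```python
import sys

def squares(n):
    # Yield squares one at a time instead of building a list.
    for i in range(n):
        yield i * i

n = 100000
as_list = [i * i for i in range(n)]   # all values exist in memory at once
as_generator = squares(n)             # no value has been computed yet

# Exact sizes vary between Python versions and platforms,
# so only the comparison is printed.
list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)
print(list_size > generator_size)     # True

# The generator can still feed a computation that reads every value once.
total = sum(as_generator)
```

After sum() has consumed the generator, it is exhausted. Reading the values again would require calling squares(n) once more.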

To master yield, you must understand that when you call the function, the code in the function body does not run. The function only returns a generator object, which is a bit tricky :-)

Your code will then run each time for requests a value from the generator.

Now the hard part:

The first time for asks the generator for a value, your function runs from the beginning until it hits yield, and then returns the first value from the loop. Each subsequent request runs one more iteration of the loop you wrote and returns the next value, and so on until the values run out.
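This lazy, step-by-step execution can be observed by recording what the function body does. In the sketch below, the events list and the function name create_generator are assumptions added for illustration, and next() is used to request values one at a time the way a for loop would.

```python
events = []  # records what the generator body does, in order

def create_generator():
    events.append("body started")
    for i in range(3):
        events.append("about to yield %d" % (i * i))
        yield i * i
    events.append("body finished")

gen = create_generator()
events_after_call = list(events)   # still empty: calling the function ran nothing

first = next(gen)                  # runs the body up to the first yield
events_after_first = list(events)

second = next(gen)                 # resumes after the yield, runs to the next one
events_after_second = list(events)

print(events_after_call)    # []
print(first, second)        # 0 1
print(events_after_second)  # ['body started', 'about to yield 0', 'about to yield 1']
```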

The generator is considered empty once the function runs to its end without hitting another yield. This can happen because the loop has finished, or because an "if/else" condition is no longer satisfied.
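A small sketch of how conditions decide how many values a generator produces. The function name pick and its arguments are hypothetical. When nothing is left, next() raises StopIteration, which for loops and list() handle silently.

```python
def pick(left, right):
    # Each yield is guarded by a condition, so the generator may
    # produce two values, one value, or none at all.
    if left is not None:
        yield left
    if right is not None:
        yield right
    # Reaching the end of the function ends the generator.

both = list(pick("L", "R"))
only_right = list(pick(None, "R"))
nothing = list(pick(None, None))

# Asking an empty generator for a value raises StopIteration.
gen = pick(None, None)
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(both)        # ['L', 'R']
print(only_right)  # ['R']
print(nothing)     # []
print(exhausted)   # True
```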

Explanation of the code from the original question


Generator:
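
    # Here you define the method of the node object that will return the generator
    def _get_child_candidates(self, distance, min_dist, max_dist):

        # This is the code that runs each time the generator object is used:

        # If the node still has a child on its left
        # AND the distance is suitable, return that child
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild

        # If the node still has a child on its right
        # AND the distance is suitable, return that child
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild

        # If execution gets here, the generator is considered empty:
        # there are at most two values, the left child and the right child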

Call:
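
    # Create an empty list and a list holding a reference to the current object
    result, candidates = list(), [self]

    # Loop over the candidates (at first there is only one element)
    while candidates:

        # Take the last candidate and remove it from the list
        node = candidates.pop()

        # Compute the distance between obj and the candidate
        distance = node._get_dist(obj)

        # If the distance is suitable, add the node's values to the result
        if distance <= max_dist and distance >= min_dist:
            result.extend(node._values)

        # Add the candidate's children to the list of candidates,
        # so the loop keeps running until it has looked at
        # all the children of the children <...> of the candidate
        candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

    return result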
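The same pattern can be tried on a toy tree. The sketch below is a simplification made up for illustration and is not the mspace library's code: the Node class, its children() method and collect_values() are all hypothetical, and the distance checks are left out so that only the interplay between the while loop, pop() and extend() remains.

```python
class Node:
    """Hypothetical binary tree node; not the mspace library's class."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def children(self):
        # A generator yielding zero, one, or two child nodes.
        if self.left is not None:
            yield self.left
        if self.right is not None:
            yield self.right


def collect_values(root):
    result, candidates = [], [root]
    while candidates:                 # the list grows while the loop runs
        node = candidates.pop()       # take the most recently added node
        result.append(node.value)
        # extend() drains the generator, appending each child it yields.
        candidates.extend(node.children())
    return result


tree = Node(1, Node(2, Node(4)), Node(3))
values = collect_values(tree)
print(values)  # [1, 3, 2, 4]
```

Each call to children() creates a fresh generator for one node, and extend() exhausts it immediately. The loop ends once a round adds no new candidates and the list runs empty.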

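The code from the question contains several smaller parts worth looking at. One of them is extend(), a list method that expects an iterable and adds its values to the list.

Usually we pass it a list:

    >>> a = [1, 2]
    >>> b = [3, 4]
    >>> a.extend(b)
    >>> print(a)
    [1, 2, 3, 4]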

But in our code it receives a generator, which is good for the reasons given earlier: the values need to be read only once, and they do not all have to be stored in memory.

And it works because Python doesn't care whether the argument of this method is a list or not. Python expects an iterable, so it works with strings, lists, tuples, and generators! This is called duck typing and is one of the reasons why Python is so cool. But that's another story for another question ...
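A quick way to see this duck typing at work is to pass different kinds of iterables to the same extend() call. The helper squares is a hypothetical generator function used only for this sketch.

```python
def squares(n):
    for i in range(n):
        yield i * i

items = []
items.extend([1, 2])        # a list
items.extend((3, 4))        # a tuple
items.extend("ab")          # a string: each character becomes one item
items.extend(squares(3))    # a generator: drained one value at a time

print(items)  # [1, 2, 3, 4, 'a', 'b', 0, 1, 4]
```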

The reader can stop here, or read a little more about the advanced use of generators:

Controlling generator exhaustion



    >>> class Bank():  # create a bank that builds ATMs (ATM: Automatic Teller Machine)
    ...     crisis = False
    ...     def create_atm(self):
    ...         while not self.crisis:
    ...             yield "$100"
    ...
    >>> hsbc = Bank()  # while everything is fine, the ATM gives out as much as you ask for
    >>> corner_street_atm = hsbc.create_atm()
    >>> print(next(corner_street_atm))
    $100
    >>> print(next(corner_street_atm))
    $100
    >>> print([next(corner_street_atm) for cash in range(5)])
    ['$100', '$100', '$100', '$100', '$100']
    >>> hsbc.crisis = True  # a crisis hits, no more money!
    >>> print(next(corner_street_atm))
    Traceback (most recent call last):
      ...
    StopIteration
    >>> wall_street_atm = hsbc.create_atm()  # this holds even for newly built ATMs
    >>> print(next(wall_street_atm))
    Traceback (most recent call last):
      ...
    StopIteration
    >>> hsbc.crisis = False  # the trouble is, even after the crisis the old ATM stays empty
    >>> print(next(corner_street_atm))
    Traceback (most recent call last):
      ...
    StopIteration
    >>> brand_new_atm = hsbc.create_atm()  # build a new ATM to get back in business
    >>> for cash in brand_new_atm:
    ...     print(cash)
    $100
    $100
    $100
    $100
    $100
    $100
    $100
    $100
    $100
    ...

This can be useful for various purposes, such as controlling access to a resource.

Itertools, your best friend


The itertools module contains special functions for working with iterables. Want to duplicate a generator? Chain two generators together? Group the values of nested lists with a one-liner? Apply map or zip without creating another list?

Just add import itertools.

Want an example? Let's look at the possible finishing orders in a four-horse race:

    >>> horses = [1, 2, 3, 4]
    >>> races = itertools.permutations(horses)
    >>> print(races)
    <itertools.permutations object at 0xb754f1dc>
    >>> print(list(itertools.permutations(horses)))
    [(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2),
     (1, 4, 2, 3), (1, 4, 3, 2), (2, 1, 3, 4), (2, 1, 4, 3),
     (2, 3, 1, 4), (2, 3, 4, 1), (2, 4, 1, 3), (2, 4, 3, 1),
     (3, 1, 2, 4), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1),
     (3, 4, 1, 2), (3, 4, 2, 1), (4, 1, 2, 3), (4, 1, 3, 2),
     (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)]
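The other tasks mentioned in this section can be sketched with a few standard itertools functions. The choice of tee, chain, chain.from_iterable and islice here is one reasonable mapping of those questions onto the module, not the only one, and squares is a hypothetical helper.

```python
import itertools

def squares(n):
    for i in range(n):
        yield i * i

# Duplicate a generator: tee() returns independent iterators over the same values.
# The original generator should not be used after it has been passed to tee().
copy_a, copy_b = itertools.tee(squares(3))
duplicated = (list(copy_a), list(copy_b))

# Connect two generators in series.
chained = list(itertools.chain(squares(2), squares(3)))

# Flatten one level of nesting in a single expression.
flattened = list(itertools.chain.from_iterable([[1, 2], [3], [4, 5]]))

# Take only the first few permutations without computing all of them.
first_three = list(itertools.islice(itertools.permutations([1, 2, 3, 4]), 3))

print(duplicated)   # ([0, 1, 4], [0, 1, 4])
print(chained)      # [0, 1, 0, 1, 4]
print(flattened)    # [1, 2, 3, 4, 5]
print(first_three)  # [(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4)]
```

All of these return lazy iterators. The list() calls are there only to make the results visible.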


Understanding the internal mechanism of iteration


Iteration is a process involving iterables (which implement the __iter__() method) and iterators (which implement the __next__() method). Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate over iterables.
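The two roles can be made concrete with a hand-written pair of classes. Countdown and CountdownIterator are hypothetical names invented for this sketch. The while loop shows roughly what a for loop does with iter() and next() behind the scenes.

```python
class Countdown:
    """Hypothetical iterable used only for illustration."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call hands out a fresh iterator,
        # so the object can be looped over repeatedly.
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self               # iterators are themselves iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # tells the caller there are no more values
        self.current -= 1
        return self.current + 1


# Roughly what "for value in Countdown(3)" does behind the scenes.
iterator = iter(Countdown(3))     # calls Countdown.__iter__()
manual = []
while True:
    try:
        manual.append(next(iterator))   # calls CountdownIterator.__next__()
    except StopIteration:
        break

with_for = [value for value in Countdown(3)]
print(manual)    # [3, 2, 1]
print(with_for)  # [3, 2, 1]
```

A generator function gives you the same protocol without writing either class: the generator object it returns already implements both __iter__() and __next__().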

More information on this topic is available in the article on how the for loop works.

Source: https://habr.com/ru/post/132554/

