What is the use of the yield keyword in Python? What does it do?
I am trying to understand the following code [1]:
    def _get_child_candidates(self, distance, min_dist, max_dist):
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild
The function call (or method in this case) is:
    result, candidates = [], [self]
    while candidates:
        node = candidates.pop()
        distance = node._get_dist(obj)
        if distance <= max_dist and distance >= min_dist:
            result.extend(node._values)
        candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
    return result
What happens when the _get_child_candidates method is called? Does it return a list or a single element? Is it called again? When will subsequent calls stop?
[1] The code comes from Jochen Schulz (jrschulz), who created an excellent Python library for metric spaces. This is the link to the full source: module mspace.
Simple (and false) explanation
A yield can be understood as a kind of return that does not really return at all.
When a running function reaches yield loquesea, it returns the value loquesea but "remembers where it was". It can later be resumed at the point where it left off, and execution continues with the statement following the yield.
That is why yield is so often found inside a loop: each time the function "resumes", the loop advances one step and the next value is "returned".
When the function reaches its end, a "real" return takes place (returning None), and after that it can no longer be resumed.
True explanation (and not so complicated)
When Python compiles the program (yes, Python compiles all code to an intermediate bytecode representation before executing it), if it finds a function (call it foo, for example) that contains one or more yield statements, it marks that function in a special way. That function will be a generator.
When the code calls foo(), the function does not execute normally. In fact, its execution does not even begin. Instead, the interpreter makes the result of foo() an iterator. For example, after v = foo(), the variable v holds an iterator. An iterator is an object on which next() can be called.
Only when next(v) is called does the code of foo start executing, pausing when it reaches a yield statement. For example, r = next(v) would start the code of foo, control would come back at the first yield, and the value returned by next() would be precisely the one given in that yield.
Later, next(v) can be called again, and the function foo then resumes where it left off (just after the yield) and continues until it finds another yield, and so on.
When, during a call to next(v), the function reaches its "real" end and therefore returns None, the interpreter raises the StopIteration exception.
Therefore, to go through all the values this function "returns" until they run out, we would call next(v) repeatedly inside a loop and leave the loop when StopIteration is raised.
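The mechanics described so far can be tried out with a small sketch. The generator foo and the variable names below are hypothetical, chosen only for illustration; the print calls are there to make visible when the function body actually runs.

```python
def foo():
    """Hypothetical generator used only for illustration."""
    print("starting")
    yield 1
    print("resumed after the first yield")
    yield 2
    print("reached the end")


v = foo()         # nothing is printed: the body has not started yet
first = next(v)   # prints "starting" and pauses at the first yield
second = next(v)  # resumes, prints the second message, pauses again

# Draining a fresh generator by hand until StopIteration is raised.
values = []
w = foo()
try:
    while True:
        values.append(next(w))
except StopIteration:
    pass          # the function body ran to its end
```

After this runs, first is 1, second is 2, and values holds every yielded value in order.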
But Python has a much more elegant syntax for doing the same thing, one we are well used to even if we did not know how it works "under the hood": the for loop. Iterating by hand with next() is equivalent to writing for r in foo(): followed by the loop body.
The for statement takes care, "underneath", of obtaining the iterator returned by foo() and calling next() on it, assigning each value to r, until it detects StopIteration and exits the loop.
The for of list comprehensions also works with generators, so you can write something like [r for r in foo()], and a list with all the values produced by the generator will be created. This can be shortened to list(foo()), since list expects an iterable as its parameter and iterates over it internally. It makes no difference whether that iterable is a generator, a real list, or some other type of object, as long as it implements the iterable interface, that is, as long as an iterator can be obtained from it and next() can be called on that iterator.
Generators are preferable to lists when the number of elements is large, since the elements are created as they are needed instead of all being created beforehand. A generator might even never end. For example, you could imagine a generator that returns the next prime number each time next() is called on it.
What are they used for
As I said, the usual use is to make generators that can then be iterated in a for loop.
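A minimal sketch of that usual use follows. The generators countdown and primes are hypothetical examples, not part of the code in the question, and the prime test is plain trial division chosen for brevity rather than speed.

```python
from itertools import islice


def countdown(start):
    """Hypothetical finite generator: start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


def primes():
    """Hypothetical endless generator of primes (trial division)."""
    found = []
    n = 2
    while True:
        if all(n % p != 0 for p in found):
            found.append(n)
            yield n
        n += 1


# A for loop calls next() behind the scenes and stops at StopIteration.
seen = []
for r in countdown(3):
    seen.append(r)

# list() and list comprehensions consume a generator in the same way.
as_list = list(countdown(3))
squares = [r * r for r in countdown(3)]

# An endless generator is fine as long as we only take what we need.
first_primes = list(islice(primes(), 5))
```

Because primes() never finishes, calling list(primes()) directly would never return; islice stops asking for values after the fifth one.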
But for a while (before Python got the async/await keywords, which it did in 3.5), they were also used as a slightly convoluted way of achieving asynchronous programming through cooperative tasks.
Since yield does not really return but lets the function continue where it left off, it can be used to implement the concept of a coroutine, which allows several functions to interleave their execution. This produces the illusion that they all advance more or less at the same time without the need for multithreading, which enables asynchronous programming.
But that is another story. In your code it is being used (although it may not look like it) as a generator.
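This is because the call to the function appears as the argument of extend. That is, the code does the equivalent of lista.extend(foo()).
Since extend expects an iterable as its parameter, what it does is iterate over the values produced by the generator foo and add them to lista. In other words, it is equivalent to iterating over foo() with a for loop and appending each value to lista, or to any other construct that consumes the iterable one element at a time.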
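As a sketch of that equivalence, the three variants below end up with the same list. The generator foo and its string values are hypothetical stand-ins for _get_child_candidates() and the child nodes; the third variant is one possible alternative spelling, not necessarily the one the original answer showed.

```python
def foo():
    """Hypothetical generator standing in for _get_child_candidates()."""
    yield "left"
    yield "right"


# What the code in the question does: extend() consumes the generator.
lista = ["root"]
lista.extend(foo())

# Equivalent explicit loop.
lista_loop = ["root"]
for r in foo():
    lista_loop.append(r)

# Another equivalent: build a list first, then concatenate it.
lista_concat = ["root"]
lista_concat += list(foo())
```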
Answering your questions now:
What happens when the method is called? An iterable (a generator object) is created, and that is what the call returns.
Does it return a list or a single element? Neither one nor the other: it returns an iterable. Depending on the context in which you use it, that iterable will probably be iterated over. In your case, as explained, you are passing the iterable as the argument of extend(), which calls next() on it until it is exhausted. Each such call runs the function up to the next yield.
If we look at the code of _get_child_candidates, we see that it contains no loops and only two yield statements, so iterating over it produces at most two values (if both if conditions are true; otherwise it may even produce none).
Is it called again? When do subsequent calls stop? At the risk of being tiresome, and although it should be clear by now: the function is called only once. But it is resumed several times, until it executes no more yield statements and reaches its end. In this case it can produce 0, 1 or 2 results, depending on whether or not the if conditions are met. So the list on which .extend() is called will be extended with the left child, or the right child, or both, or neither :-)
Without yield
To close, the function _get_child_candidates() could very well have been implemented without using yield. It could build a list containing the left child, or the right child, or both, or neither, as appropriate, and return that list.
From the outside, in this case, it would be used in exactly the same way: .extend() expects something to iterate over, and here it would receive a list, which serves just as well. It would add to candidates the elements of the list returned by node._get_child_candidates(). In the original case it received an iterable, which worked too.
I don't see much advantage in using a generator in this case, because the list to be created is really very small. In cases where the list could be huge, the generator has the advantage already mentioned of using less memory, and if each element takes some time to compute, the generator lets you obtain the next element as soon as it is ready, without having to wait for a whole list of them to be generated. This approach is called "lazy" evaluation, because a value is not computed until it is needed.
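To make the comparison concrete, here is a sketch of both variants side by side. The Node class, its constructor, and the method names candidates_gen and candidates_list are assumptions made for this example; only the two if conditions are taken from the code in the question.

```python
class Node:
    """Hypothetical stand-in for the tree node in the question."""

    def __init__(self, median, left=None, right=None):
        self._median = median
        self._leftchild = left
        self._rightchild = right

    def candidates_gen(self, distance, min_dist, max_dist):
        # Generator version, same conditions as in the question.
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild

    def candidates_list(self, distance, min_dist, max_dist):
        # List version: build the whole result, then return it.
        children = []
        if self._leftchild and distance - max_dist < self._median:
            children.append(self._leftchild)
        if self._rightchild and distance + max_dist >= self._median:
            children.append(self._rightchild)
        return children


left, right = Node(1), Node(9)
root = Node(5, left, right)

# extend() accepts either result, because both are iterable.
from_gen = []
from_gen.extend(root.candidates_gen(4, 0, 2))
from_list = []
from_list.extend(root.candidates_list(4, 0, 2))

# With other arguments only one condition holds, or none at all.
only_left = list(root.candidates_gen(1, 0, 1))
no_children = list(Node(5).candidates_gen(1, 0, 1))
```

With a result of at most two elements the two versions are interchangeable for the caller; the generator only pays off when the sequence is long or each element is expensive to compute.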