⬆️ ⬇️

Tail recursion elimination

I recently wrote a post on “ The origins of Python's functional functionalities ” on my Python History blog. The mention that Python does not support tail recursion (TRE) immediately provoked several comments about how unfortunate that Python does not support this optimization. There are links to recent posts in other blogs that TRE can be added to Python easily. Let me defend my position - I don’t want to add TRE to the language. If you want a short answer, it's just unpythonic.



Here is the long answer.



First , as someone noted in the comments, TRE is incompatible with normal stack tracing: when tail recursion is eliminated, there is no stack frame left to output a traceback if something goes wrong. This will confuse users who inadvertently wrote something recursive (recursion is not obvious in the stack trace), it will be difficult to debug. Providing the ability to disable TRE seems to me wrong: Python should always be as easy as possible to debug. This leads me to the following conclusion:



Secondly , the idea that TRE is just an optimization that a single Python implementation can implement or not is wrong. As soon as tail recursion elimination is introduced, developers will begin to write code that depends on this optimization, and their code will not work on implementations that do not support it: a typical Python implementation allows you to do 1000 recursions that are sufficient for non-recursively written code and , which is recursively called to bypass, for example, a tree, but not enough for a recursively written loop around a large list.

')

Third , I do not believe in recursion as the basis of all programming. This is the fundamental belief of certain programmers, especially those who love Scheme and love to learn programming, starting with recursion. But I believe that recursion is only a good theoretical approach to fundamental mathematics, but not a daily tool for solving problems.



In practice, Python-style lists (which are arrays with dynamic lengths, not coherent lists), and sequences in general, are much more useful to start exploring the wonderful world of programming than recursion. They are one of the most important tools for experienced Python programmers. Using a coherent list to represent a sequence of values ​​is unpythonic, and in most cases very inefficient. Most of the Python library is written using sequences and iterators as the main blocks of the program (and dictionaries, of course), rather than linked lists. Thus, you will deprive yourself of a part of the functionality put into the language if you do not use lists and sequences.



Finally, let's look at how we could implement tail recursion elimination. The first observation is that we cannot do this at compile time. I saw at least one blog entry that used byte code breaking to replace the CALL directly RETURN with a jump to the top of the function. This may be a good demo, but unfortunately the Python compiler cannot reliably determine whether any particular call refers to the current function, even if it has the same name. Consider this simple example:

def f(x): if x > 0: return f(x-1) return 0 


You could replace the body with something like this:

 if x > 0: x = x-1 <jump to top> return 0 


It seems simple enough, but now add this:

 g = f def f(x): return x g(5) 


A call to g (5) calls a previously defined function f, but a “recursive” call is no longer such! At run time, the name 'f' is redefined to a non-recursive function, so the return value is 4, not 0. At the same time, I agree that this is a bad style, but it is a well-defined part of the Python semantics, which has many logical ways use, and the compiler made this replacement in the hope that the definition of f would remain unchanged, would present quite a lot of errors in the real code.



Another blog post concerned decorators, which can be used to implement tail recursion using magic exceptions or return values. They can be written in plain Python code (although that message shows an optimized version of Cython, which is said to be “only 10% slower.” Although it does not seem to be oriented towards multi-threaded execution). If you are so interested, I will not try to stop you, but I strongly object to the inclusion of something like this in the built-in capabilities of the language: there are many reasons not to use a sophisticated decorator, as it assumes that any recursive call is tail recursion and can be eliminated. In the hands of less experienced users, this will lead to disasters. For example, the usual recursive definition of factorial is not tail recursion:

 def fact(n): if n > 1: return n * fact(n-1) return 1 


There are also many functions that contain tail recursion along with the usual recursive call; decorators do not support such cases. Another subtlety that such decorators do not handle is tail recursive calls in try blocks: it may seem that they can be eliminated, but this cannot be done, because TRE can remove exception handling. For all these reasons, I think that this approach is doomed, at least for a wide audience.



However, if someone is configured to add TRE to Cpython (for example), you can change the compiler like this. First, determine the “safe” tail recursion points. For example, the CALL opcode, immediately followed by the RETURN opcode, and completely outside the try blocks. (Note: I do not mention the various CALL_ * operations that are fairly easy to process using the same approach.) Then, replace each such CALL-RETURN pair with a single CALL_RETURN operation. There is no need for the compiler to check whether the name of the called function is the same as the current function: if a redefinition occurred at run time, this CALL is not applicable to TRE and you must perform the usual actions for a CALL followed by a RETURN code. (I assume that you can add some caching mechanism indexed by the call point to speed up the override).



To determine where TRE can be used, there are several levels of “aggressiveness” that you could apply. The least aggressive, “vanilla” approach is to optimize only the call where the called object is a function that is already running in the current stack frame. All we have to do in this case is to clear local variables from the current stack frame (and other hidden states like active loops), set the arguments, and go to the top. (Subtlety: the new parameters are, by definition, in the current stack frame. This is probably just a copy issue. More questions are named arguments, variable length argument lists, and default values. This is just a programming issue).



A more aggressive version would also recognize a situation where the method with tail recursion (i.e., the object being called is an associated method, where the underlying function is the same as in the current stack frame). This requires a bit more programming. The CPython interpreter code (ceval.c) already has an optimization for method calls. (I don’t know how useful it would be, but: I expect the TRE style to be liked by programmers who like to use the functional programming style as a whole, and probably don’t use classes at all. :-)



In theory, you could even optimize all cases where the callee is a function or method written in Python, and the number of local variables needed for a new call can be placed in the current stack frame. (The frame objects in CPython are located in the heap and have a variable size based on the required memory for local variables; it is a matter of architecture to use the frame objects again). These actions would optimize mutually recursive tail recursions that would otherwise not be optimized. Alas, it would also disable stack tracing in most cases, so this would not be a good idea.



A softer type would have to create stack-level objects at the Python level, just as before, but to use the C frame again. This created something like Stackless Python, although it is still quite easy to exhaust the C stack, I share recursive calls through the built-in function or method.



Of course, no approach satisfies my three claims. Is it really so hard to rewrite your function to use a loop?

Source: https://habr.com/ru/post/111768/



All Articles