
Tail recursion optimization in Java

For a long time now, ideas from the world of functional programming have been steadily making their way into non-functional languages. It may seem strange that lambda expressions managed to get into Java while tail recursion optimization (converting recursion into an equivalent loop) still has not, even though it is much simpler. Why not?

Let's try to figure out the reasons and see what you can do with your own hands.

First of all, an example. I propose not to torment the unfortunate factorial this time and use another function:

    static int add(int x, int y) {
        if (y == 0) return x;
        return add(x ^ y, (x & y) << 1);
    }

This is a recursive way of adding two integers. It fits the definition of tail recursion: every recursive call is immediately followed by a return. The optimization consists of not creating a new stack frame for the recursive call and reusing the current one instead: put the new arguments in place of the old parameters and make an unconditional jump to the first instruction of the method. In pseudocode it looks like this:
    static int add(int x, int y) {
        start: {
            if (y == 0) return x;
            (x, y) = (x ^ y, (x & y) << 1);
            goto start;
        }
    }

Various stylistic variations are possible from here, but in general the code after manual optimization ends up like this (it could be written more elegantly; the verbosity is only for clarity):

    static int add(int x, int y) {
        while (true) {
            if (y == 0) return x;
            int _x = x ^ y;
            int _y = (x & y) << 1;
            x = _x;
            y = _y;
        }
    }

The downside of optimizing tail recursion by hand is immediately obvious: the code gets worse, especially when there is more than one recursive call. It would be much nicer to optimize automatically.

And everything would be fine, but there is one big BUT: the optimization breaks the semantics of program execution. In Java, information about the call stack is available at any point in the code. And while the JIT is still able to sort everything out on its own when it inlines methods, once we replace a recursive call with a GOTO we lose a whole series of stack frames, together with the entry-point information that the specification says should be there.
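
To make this concrete, here is a small illustration (a hypothetical helper, not code from the agent): a method that recurses and then reports how many frames it sees on the stack.

    public class StackDepthDemo {
        // Recurses n times, then reports how many frames are on the stack.
        // Without the optimization the recursion contributes roughly n frames;
        // with the tail call turned into a GOTO, only one frame would remain.
        static int depth(int n) {
            if (n == 0) {
                return Thread.currentThread().getStackTrace().length;
            }
            return depth(n - 1);
        }

        public static void main(String[] args) {
            System.out.println(depth(1000));
        }
    }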

This is unpleasant and suggests that, most likely, we will not see this optimization in Java.

To continue the investigation, let's accept that several lines will disappear from the stack trace and assume that the gain in speed (or the beauty of the code) matters more. Let's note the other factors that can get in the way of the optimization:

  - the recursive call is located inside a try block;
  - the method can be overridden in subclasses (this is determined by its access modifiers).


There are at least two proven ways to modify the bytecode of Java classes:

  - load the modified classes through a custom class loader;
  - use a Java agent (the java.lang.instrument API).


I chose the second option and wrote a simple Java agent that optimizes tail recursion. It can only perform the optimization under the conditions listed above.
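
For reference, here is roughly how such an agent hooks into class loading: a minimal sketch of the java.lang.instrument entry point, not the actual code from the repository.

    import java.lang.instrument.ClassFileTransformer;
    import java.lang.instrument.Instrumentation;
    import java.security.ProtectionDomain;

    public class TailRecAgent {
        // Called by the JVM before main() when started with -javaagent:agent.jar;
        // the jar's manifest must contain a Premain-Class entry pointing here.
        public static void premain(String args, Instrumentation inst) {
            inst.addTransformer(new ClassFileTransformer() {
                @Override
                public byte[] transform(ClassLoader loader, String className,
                                        Class<?> classBeingRedefined,
                                        ProtectionDomain protectionDomain,
                                        byte[] classfileBuffer) {
                    // Rewrite tail-recursive calls here and return the new bytes,
                    // or return null to leave the class unchanged.
                    return null;
                }
            });
        }
    }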

For the impatient, here is the link to GitHub: github.com/ibessonov/java-tailrec-agent .
It contains a configured IDEA project with examples to play with. And for the patient, a short explanation with an example follows.

The check of access modifiers needs no separate explanation, since its implementation is obvious. So let's skip it, take a typical method, and see what happens to it:

    static int gcd(int n, int m) {
        try {
            if (m == 0) return n;
        } catch (Throwable t) {
            // do nothing
        }
        return gcd(m, n % m);
    }

After compilation, the method will look as follows (simplified for readability, some of the information is intentionally omitted):

    static gcd(II)I
      TRYCATCHBLOCK TryBlockStart TryBlockEnd CatchBlockStart java/lang/Throwable
    TryBlockStart:
      ILOAD 1
      IFNE ElseBlock
      ILOAD 0
    TryBlockEnd:
      IRETURN
    CatchBlockStart:
      ASTORE 2    // the caught exception is stored and ignored, execution falls through
    ElseBlock:
      ILOAD 1
      ILOAD 0
      ILOAD 1
      IREM
      INVOKESTATIC Main.gcd(II)I
      IRETURN

Every method contains a table of try block descriptions. Each entry has four components: the start of the block, the end of the block, the exception handler, and the exception class descriptor. This information lets us unambiguously determine which instructions are inside try blocks and refrain from optimizing them.
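
In terms of the ASM tree API (which I will use for the sketches below; the actual project code may look different), collecting the instructions covered by try blocks might look like this:

    import org.objectweb.asm.tree.AbstractInsnNode;
    import org.objectweb.asm.tree.MethodNode;
    import org.objectweb.asm.tree.TryCatchBlockNode;
    import java.util.HashSet;
    import java.util.Set;

    class TryBlockScanner {
        // Returns the instructions that lie inside at least one try block
        // and therefore must not take part in the optimization.
        static Set<AbstractInsnNode> instructionsInsideTryBlocks(MethodNode method) {
            Set<AbstractInsnNode> covered = new HashSet<>();
            for (TryCatchBlockNode block : method.tryCatchBlocks) {
                // block.start and block.end are the labels delimiting the protected range
                for (AbstractInsnNode insn = block.start; insn != block.end; insn = insn.getNext()) {
                    covered.add(insn);
                }
            }
            return covered;
        }
    }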

Next, we need to find all INVOKE* instructions whose owner and descriptor match the method itself (in this case we are looking for the gcd(II)I method of the Main class) and that are immediately followed by a RETURN instruction.
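
A sketch of that search in the same ASM tree API terms (again illustrative rather than the project's actual code):

    import org.objectweb.asm.Opcodes;
    import org.objectweb.asm.tree.AbstractInsnNode;
    import org.objectweb.asm.tree.MethodInsnNode;
    import org.objectweb.asm.tree.MethodNode;
    import java.util.ArrayList;
    import java.util.List;

    class TailCallFinder {
        // Finds self-calls that are immediately followed by a value-returning
        // *RETURN instruction. For simplicity, pseudo-instructions (labels,
        // line numbers) between the call and the return are not skipped here.
        static List<MethodInsnNode> findTailCalls(String owner, MethodNode method) {
            List<MethodInsnNode> result = new ArrayList<>();
            for (AbstractInsnNode insn : method.instructions.toArray()) {
                if (!(insn instanceof MethodInsnNode)) continue;
                MethodInsnNode call = (MethodInsnNode) insn;
                boolean selfCall = call.owner.equals(owner)
                        && call.name.equals(method.name)
                        && call.desc.equals(method.desc);
                AbstractInsnNode next = call.getNext();
                boolean returnsRightAway = next != null
                        && next.getOpcode() >= Opcodes.IRETURN
                        && next.getOpcode() <= Opcodes.ARETURN;
                if (selfCall && returnsRightAway) {
                    result.add(call);
                }
            }
            return result;
        }
    }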

Each INVOKESTATIC found this way must be converted from a call into an unconditional jump. We know that at the moment of the call all the arguments are already on the stack. For static methods everything is simple: store those arguments back into the parameters' local variable slots and jump unconditionally to the very beginning:

    static gcd(II)I
      TRYCATCHBLOCK TryBlockStart TryBlockEnd CatchBlockStart java/lang/Throwable
    StartLabel:
    TryBlockStart:
      ILOAD 1
      IFNE ElseBlock
      ILOAD 0
    TryBlockEnd:
      IRETURN
    CatchBlockStart:
      ASTORE 2
    ElseBlock:
      ILOAD 1
      ILOAD 0
      ILOAD 1
      IREM
      ISTORE 1
      ISTORE 0
      GOTO StartLabel
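
Expressed with the ASM tree API, the rewrite itself boils down to roughly the following. This is a simplified sketch that assumes a static method whose parameters are all ints, as in gcd; a real implementation has to pick load/store opcodes and slot sizes from the method descriptor, and the stack map frames usually need to be recomputed when the class is written back (for example with ClassWriter.COMPUTE_FRAMES).

    import org.objectweb.asm.Opcodes;
    import org.objectweb.asm.tree.InsnList;
    import org.objectweb.asm.tree.JumpInsnNode;
    import org.objectweb.asm.tree.LabelNode;
    import org.objectweb.asm.tree.MethodInsnNode;
    import org.objectweb.asm.tree.MethodNode;
    import org.objectweb.asm.tree.VarInsnNode;

    class TailCallRewriter {
        // Replaces "INVOKESTATIC self; IRETURN" with stores into the parameter
        // slots followed by a GOTO to 'start', a label that has already been
        // inserted at the very beginning of method.instructions.
        static void rewrite(MethodNode method, MethodInsnNode call, LabelNode start, int paramCount) {
            InsnList patch = new InsnList();
            // The arguments sit on the stack in declaration order, so they are
            // popped in reverse: the last parameter is stored first.
            for (int slot = paramCount - 1; slot >= 0; slot--) {
                patch.add(new VarInsnNode(Opcodes.ISTORE, slot));
            }
            patch.add(new JumpInsnNode(Opcodes.GOTO, start));

            method.instructions.remove(call.getNext()); // drop the IRETURN
            method.instructions.insert(call, patch);    // put the stores and the GOTO after the call...
            method.instructions.remove(call);           // ...and drop the call itself
        }
    }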

For non-static methods an interesting problem arises: the method is invoked on an object which, generally speaking, does not have to coincide with this. It is technically possible to find the place in the bytecode where that receiver is computed and perform the optimization only when the computation is just ALOAD 0.
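
A hypothetical source-level example of the problem (not taken from the article's code):

    class Node {
        Node next;

        // Tail-recursive, but the recursive call is made on 'next', not on 'this',
        // so blindly reusing the current frame would keep the wrong receiver.
        int length(int acc) {
            if (next == null) return acc + 1;
            return next.length(acc + 1);
        }
    }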

I took the lazy route and simply store the already computed receiver over the current this (with an ASTORE 0). Strangely enough, this approach works, but I do not know enough to guarantee that the JIT behaves correctly in this situation. I would be glad to hear in the comments whether there are any risks here.
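
For the hypothetical Node.length example above, this lazy rewrite would produce roughly the following (a sketch in the same notation as the listings above, not actual agent output):

    ElseBlock:
      ALOAD 0
      GETFIELD Node.next : LNode;   // compute the new receiver
      ILOAD 1
      ICONST_1
      IADD
      ISTORE 1                      // acc = acc + 1
      ASTORE 0                      // overwrite 'this' with the new receiver
      GOTO StartLabel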

Besides simple tests, I tried attaching the agent to existing applications. In Tomcat there was not a single method to optimize. In IntelliJ IDEA there were at least a dozen such methods, but the application crashed. Given that IDEA runs several agents and has complicated class loader logic, I would not venture to say what went wrong.

As a result, we have a tool that optimizes tail recursion in every method where it can be done without violating the semantics (apart from altering the call stack). An obvious drawback is that debugging becomes harder; on the other hand, you can always debug with the agent turned off.

Whether it is worth using anywhere is up to you.

Source: https://habr.com/ru/post/319282/

