Tail recursion optimization in .NET and Nemerle

Recently, chiaroscuro wrote a provocative article with a yellow headline "Why should cycles die?" Her discussion took about 180 comments, but the article itself went into a minus and did not get to the main one, despite the fact that it contained a sensible idea about using recursion instead of cycles.

In this article, I will supplement his post and give examples of implementation in one of the best languages under .NET - Nemerle, and also make a holivarny statement about the advantage of .NET over Java.

First, I want to point out one plus of using recursion instead of cycles, which chiaroscuro has missed - decomposition. Considering a program in a popular imperative language (Java, C #, Pascal, C) we are dealing with a stream of commands. A means of decomposing this stream, say, for code reuse or testing, is partitioning into procedures that were previously called subprograms. The name “subprogram” contains their essence - we take the main stream of commands, cut a piece, wrap it in some workaround and the procedure is ready, but it is the same stream, only smaller. We have come to the conclusion that in the language some tools serve computation — imperative instructions — those that make up the computation flow, while others serve this flow — procedures. In the case of a FW, recursion performs the same tasks as the cycles in imperative languages, and also serves as a means of decomposition, that is, a program written in FF breaks up into separate logical parts. Immediately it is worth noting that the problem of littering the global namespace arises; but in many FWs, it is elegantly solved — a function inside its body can declare a function that is visible only to it.
')
The following is what you should pay attention to in the article - comments, in many of them the habra people asserted as one “recursion is a beautiful, elegant but resource-consuming evil”. In general, this is true, but there is a class of recursive algorithms that are comparable in terms of resource consumption with reversing this recursion into a cycle — such a recursion is called tail recursion. Recognizing it is very simple: if the function received the value of a function, did not do anything with it and returned it as its value, then we are dealing with tail recursion, in the case of recursion, of course. It is easy to understand why this happens - if no action occurs with a value, then you can release the stack from local variables before calling the second function, and you can also substitute your return address as the return address.

These were additions to that article, and now the main point: to completely abandon cycles is stupid and, at present, it only makes sense if it is academic programming whose goal is to investigate the optimization of compilers; when programming is a daily bread, then you need to use the tools that are most appropriate to the task at the moment. For the future, learning haskell until it is taught automatically to parallelize multiple processors, I will not. I will choose a language that provides me with a choice of tool with which I can quickly solve my problem. That is, the language must be hybrid and support both a functional and an object-oriented paradigm. Such a language is Nemerle.

Attentive reader may ask where is Java. The fact is that the Java virtual machine, unlike .NET, does not support tail recursion optimization. In other words, for many it may be news that it is in .NET, since the most popular programming language for it - C # - does not support it. After I learned that tail recursion optimization is supported in .NET, I had the idea to test it, which was the reason for this post.

The experimental rabbit will be the algorithm for finding the last 5 digits of the n-th Fibonacci number, and it will be tested in Nemerle, C #, and Perfect C #, so call a language that is fully compatible with C #, but has the tail recursion optimization feature. To get the compiled Perfect C # assembly, I compiled the C # code, decompiled it using ildasm, added il tail instruction before calling the function and compiled il code with ilasm. Below are the program codes in these languages:

Nemerle PerfectC#

public module Fibonacci . class public auto ansi beforefieldinit
{ Fibonacci extends [mscorlib]System. Object
public Last5(number : int ) : int {
{ .method private hidebysig static int32
def F2(counter,a,b) F2( int32 counter, int32 a, int32 b)
{ cil managed
if (counter==0) {
a .maxstack 8
else ldarg.0
F2(counter-1,b,(a+b)%100000) brtrue.s IL_0005
} ldarg.1
when (number<0) throw Exception (); ret
F2(number,1,1) ldarg.0
} ldc.i4.1
} sub
ldarg.2
ldarg.1
C# ldarg.2
add
ldc.i4 0x186a0
public class Fibonacci rem
{ tail.
static private int F2( int counter, int a, int b) call int32 Fibonacci::F2( int32 , int32 , int32 )
{ ret
if (counter == 0) return a; }
return
F2(counter - 1, b, (a + b)%100000); .method public hidebysig static int32
} Last5( int32 number) cil managed
{
static public int Last5( int number) .maxstack 8
{ ldarg.0
if (number < 0) throw new Exception (); ldc.i4.0
return F2(number, 1, 1); bge.s IL_000a
} newobj instance
} void [mscorlib]System. Exception ::.ctor()
throw
ldarg.0
ldc.i4.1
ldc.i4.1
call int32 Fibonacci::F2( int32 , int32 , int32 )
ret
}

.method public hidebysig specialname
rtspecialname instance void .ctor()
cil managed
{
.maxstack 8
ldarg.0
call instance
void [mscorlib]System. Object ::.ctor()
ret
}

And now the most interesting thing is the test results. To warm up, I decided to count the last 5 digits of the 40000th Fibonacci number: Nemerle managed in 0.58 milliseconds , PerfectC # in 2.389 milliseconds , and C # in 0.782 milliseconds . After that, I decided to try the algorithm on the 100,000th Fibonacci number and got the following results: Nemerle - 1,421 milliseconds , PerfectC # - 3,244 milliseconds , C # fell with a StackOverflowException.

The results are such that Nemerle takes the first place, and the second is divided by PerfectC # and C #, since the second falls into deep recursion, but PerfectC # does not overtake us where it falls.

UPD
The reference implementation of the last 5 digits of the 40,000 Fibanacci numbers is 0.565 milliseconds and the 100,000 Fibanacci numbers are 1.407 milliseconds.

Source: https://habr.com/ru/post/47553/

All Articles

Tail recursion optimization in .NET and Nemerle

More articles: