Hi, habrachelovek. Today I will tell you about some fundamental things in computer science: partial evaluation, the three Futamura projections, and supercompilation.
1. Straight to the code
-- Raises x to the power y (naive implementation)
power x y =
  case y of
    0 → 1
    1 → x
    _ → x * (power x (y - 1))
Suppose we use this function like this:
-- Computes the area of a square with the given side
rectangleArea side = power side 2
This tiny example clearly shows the most important pattern in programming: abstraction.
As in this example, most of the time we use abstractions only in particular, specialized cases, and the unused generality inevitably adds computational overhead.
Think of your favorite web framework, for example: how much of its code actually does useful work in your application?
Many inexperienced programmers get genuinely obsessed with this issue and slide into premature optimization, rejection of abstractions, and other foolishness.
But using abstractions does not have to entail overhead. Code that relies on heavy, all-powerful frameworks does not have to be slower than hand-written ad-hoc code.
Partial evaluation is a powerful technique that cancels out the overhead of abstractions. The overwhelming majority of optimizations that significantly speed up a program are special cases of partially evaluating it. A simple example:
var someMagicNumber = 2^32
Almost every modern compiler can simplify this expression at compile time, rewriting it so that it does not have to be computed again and again every time the code runs:
var someMagicNumber = 4294967296
This is the simplest example of partial evaluation. The term means exactly what it says: some part of the program is evaluated ahead of time and its result is saved, while the other part remains unchanged.
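To make this concrete, here is a minimal sketch in Haskell of a constant-folding pass over a toy expression type (the type and the name constFold are invented for illustration): subexpressions whose operands are already known get evaluated, everything else is left untouched.
-- A toy arithmetic expression: variables, literals, addition, multiplication.
data Arith = AVar String | Num Integer | Add Arith Arith | Mul Arith Arith
  deriving Show

-- Constant folding: evaluate subexpressions whose operands are already
-- known constants; leave everything else (e.g. variables) unchanged.
constFold :: Arith -> Arith
constFold (Add a b) = case (constFold a, constFold b) of
  (Num x, Num y) -> Num (x + y)
  (a', b')       -> Add a' b'
constFold (Mul a b) = case (constFold a, constFold b) of
  (Num x, Num y) -> Num (x * y)
  (a', b')       -> Mul a' b'
constFold e = e

-- > constFold (Add (AVar "n") (Mul (Num 65536) (Num 65536)))
-- Add (AVar "n") (Num 4294967296)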
Let's try to partially evaluate our first example with the area of a square. Put yourself in the compiler's place. The compiler knows that at this call site the power function is always called with the parameter y = 2. Therefore it can be partially evaluated, in other words, specialized on that parameter.
Specialization, step 1. Substitute the constant 2 for the parameter y:
-- power specialized to y = 2 (intermediate form)
power2 x =
  case 2 of
    0 → 1
    1 → x
    _ → x * (power x (2 - 1))
Step 2. Evaluate the case-of, discarding the impossible branches 0 == 2 and 1 == 2:
-- raises x to the power 2
power2 x = x * (power x (2 - 1))
Step 3. Calculate the expression (2 - 1):
-- raises x to the power 2
power2 x = x * (power x 1)
Step 4. Substituting y = 1 into the power function, we get:
-- raises x to the power 2
power2 x = x * x
Surely you have already noticed the analogy with inlining, the function-substitution technique familiar from C/C++. Inlining, too, is a form of partial evaluation.
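By way of illustration, here is a minimal sketch of what such a specializer could look like for this particular case (a toy expression type and the hypothetical name specializePower, not a general-purpose tool): the exponent is known statically, so the recursion in power is unfolded at specialization time, exactly as in steps 1-4 above.
-- Residual code is represented as a tiny expression type.
data Expr = X | Lit Integer | Mul Expr Expr
  deriving Show

-- Unfold "power x y" for a statically known y, producing the body of the
-- specialized function (X stands for the still-unknown parameter x).
specializePower :: Integer -> Expr
specializePower 0 = Lit 1
specializePower 1 = X
specializePower y = Mul X (specializePower (y - 1))

-- > specializePower 2
-- Mul X X            -- i.e. power2 x = x * x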
2. The Futamura projections
Professor Yoshihiko Futamura once thought about program interpretation in the context of partial evaluation. The result, I must say, turned out quite entertaining, not to say completely mind-blowing.
Suppose we have a certain magic function that performs partial evaluation (a specializer):
specialize (program, data) : specialized_program
For example, if we feed specialize the power function from the example above and the data "y = 2", we get the power2 function. Pretty simple.
And suppose there is an interpreter for some language, for example a PHP interpreter written in assembly language (ho-ho).
php_eval (php_code) : data
The input is a string of PHP code, the output is the result of evaluating it. Nothing unusual here either.
Now, attention, a question. What happens if we do this:
specialize (php_eval, php_code) ?
We mix the PHP interpreter written in assembly with the string of PHP code. The result is an assembly program that does the same thing as the PHP program. Which means we have compiled our PHP code to assembly!
So, the first Futamura projection: compilation is the specialization of an interpreter with respect to the code of the interpreted program.
compiled_php_code = specialize (php_eval, php_code)
Funny, isn't it?
It gets better. What happens if we do this:
specialize (specialize, php_eval) ?
We mix the specializer with the PHP-in-assembly interpreter. The result is a program that takes any PHP code as input and turns it into assembly code.
It turns out that with only a PHP interpreter and a magic specializer, we were able to give birth to a PHP compiler!
This is the second Futamura projection: a compiler is the specialization of the specializer with respect to the interpreter's code.
php_compiler (php_code) = (specialize (specialize, php_eval)) (php_code)
Your head may be starting to hurt a little by now, but look at the result: we made a compiler out of an interpreter! And that is not all... Let's do this:
specialize (specialize, specialize)
OMFG, what is this? It is a compiler generator: we feed it any interpreter and get a compiler as output.
make_compiler (interpreter) = (specialize (specialize, specialize)) (interpreter)
This is the third Futamura projection.
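As a sanity check on the types involved, here is a small Haskell model of all three projections. It is purely illustrative: "specialization" here is just partial application, so it captures the shape of the projections but produces closures rather than generated code, and the toy interpreter is invented for the example.
-- A "program" with static input s and dynamic input d is modeled as a
-- plain function; specializing it means fixing the static part.
specialize :: ((s, d) -> o, s) -> (d -> o)
specialize (program, staticIn) = \dynIn -> program (staticIn, dynIn)

-- A toy "interpreter": the source program is a number, the semantics is
-- "add the program to the input".
interp :: (Int, Int) -> Int
interp (source, input) = source + input

-- First projection: specializing the interpreter to one source program
-- yields a "compiled" program.
target :: Int -> Int
target = specialize (interp, 3)          -- behaves like the program "3"

-- Second projection: specializing the specializer to the interpreter
-- yields a "compiler".
compiler :: Int -> (Int -> Int)
compiler = specialize (specialize, interp)

-- Third projection: specializing the specializer to itself yields a
-- compiler generator.
cogen :: ((Int, Int) -> Int) -> Int -> (Int -> Int)
cogen = specialize (specialize, specialize)

-- > (cogen interp) 3 4   -- "compile" the program 3, run it on input 4
-- 7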
3. Supercompilation, metacomputation
Partial evaluation can simplify code by substituting data known in advance into the program. But there is an even more advanced technique that can optimize code far more aggressively, and of which partial evaluation is a special case:
supercompilation.
A compiler that mathematically models the execution of a program and then uses this model to produce a more efficient program is called a supercompiler (from "supervising compiler").
Interestingly, the term (and the related research) belongs to a compatriot of ours, Valentin Turchin. He also developed the Refal language, which demonstrates the concept of supercompilation (the language is quite old and based on term rewriting).
Supercompilation is also called "abstract interpretation" of programs. Turchin's own term for the process is rendered as "driving" in English-language publications.
During driving, the compiler builds the program's process tree: a graph of all possible program states with transition edges between them. In effect, the compiler tries to make sense of how the program works. Let me illustrate with a simple example:
frobnicate X =
  case X of
    true → case X of
      true → garply
      false → xyzzy
    false → case X of
      true → plugh
      false → garply
Notice anything strange about this code? For any value of X, the algorithm will never reach either xyzzy or plugh. Run through the algorithm in your head and you will see it. Then think about how you came to that conclusion, because the supercompiler works in exactly the same way.
Build the process tree:
X
→ true   -- X == true
    case true of
      true → garply
      false → xyzzy
→ false  -- X == false
    case false of
      true → plugh
      false → garply
Partially evaluate the case-of in each branch:
X
→ true → garply
→ false → garply
Consequently:
frobnicate X = garply
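Here is a minimal sketch of driving for exactly this example, written in Haskell over a toy term type (booleans only, all names invented for illustration): the value learned about the scrutinee is propagated into each branch, case-of-known-value is simplified away, and identical branches are folded.
data Term
  = TVar String
  | TBool Bool
  | TCase Term (Term, Term)   -- case scrutinee of (true-branch, false-branch)
  | TLeaf String              -- garply / xyzzy / plugh
  deriving (Eq, Show)

-- Substitute a known boolean value for a variable.
subst :: String -> Bool -> Term -> Term
subst x b (TVar y) | y == x  = TBool b
subst x b (TCase e (t, f))   = TCase (subst x b e) (subst x b t, subst x b f)
subst _ _ e                  = e

-- Driving: when we branch on a variable, each branch is explored under the
-- assumption that the variable has the corresponding value; branching on a
-- known value is simplified; identical branches are folded together.
drive :: Term -> Term
drive (TCase (TVar x) (t, f)) =
  let t' = drive (subst x True  t)
      f' = drive (subst x False f)
  in if t' == f' then t' else TCase (TVar x) (t', f')
drive (TCase (TBool True)  (t, _)) = drive t
drive (TCase (TBool False) (_, f)) = drive f
drive e = e

-- The body of frobnicate from the example above:
frobnicateBody :: Term
frobnicateBody =
  TCase (TVar "X")
    ( TCase (TVar "X") (TLeaf "garply", TLeaf "xyzzy")
    , TCase (TVar "X") (TLeaf "plugh",  TLeaf "garply") )

-- > drive frobnicateBody
-- TLeaf "garply"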
There have been attempts to implement this technique for Java:
www.supercompilers.com/white_paper.shtml
And, more recently, for Haskell (Supero):
www-users.cs.york.ac.uk/~ndm/supero/
Some readers have probably already imagined a compiler that, using partial evaluation and supercompilation, completely optimizes any program. But this problem cannot be solved: it belongs to the class of undecidable problems.
In practice, a program's process tree becomes infinite or too large to be processed sensibly. This happens even when analyzing small programs, let alone large commercial applications of tens of millions of lines of code.
There are various techniques for working around this problem: generalization, which collapses infinite trees, and folding (code sharing), which merges identical branches of the computation.
Implementing these techniques has plenty of tricky spots of its own: for example, it is not always easy to decide when it is worth generalizing and when it is not. As they say, the devil is in the details.
4. Programming by inversion
Supercompilation has interesting side effects: it can be used to solve logic programming problems (remember Prolog?), prove theorems, and invert computations.
Given a function's inputs and outputs and an abstract computation graph (the process tree), it is possible in some particular cases to invert the computation.
Suppose we have the following task: find all substrings of length 2 in the string "abcd".
We have a function (strstr a b) that returns true if b contains a. Then the solution could be written using inversion like this:
>> [x | (strstr x "abcd"), (length x) == 2]
["ab", "bc", "cd"]
This immediately brings to mind the pattern matching technique from functional programming languages. And indeed: inversion of algorithms is the key to abstract pattern matching.
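For comparison, here is a direct, hand-written Haskell solution of the same task (strstr is modeled with isInfixOf; the enumeration strategy is mine, not derived by any tool). The point of inversion is precisely that, in favourable cases, a supercompiler could derive such a search from the declarative query above instead of it being written by hand.
import Data.List (isInfixOf, tails)

-- strstr a b: does b contain a?  (modeled with the standard isInfixOf)
strstr :: String -> String -> Bool
strstr a b = a `isInfixOf` b

-- The forward version of the query: enumerate candidate substrings of the
-- required length and keep those that strstr accepts.
substringsOfLength :: Int -> String -> [String]
substringsOfLength n s =
  [ candidate | t <- tails s
              , length t >= n
              , let candidate = take n t
              , strstr candidate s ]

-- > substringsOfLength 2 "abcd"
-- ["ab","bc","cd"]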
In lieu of a conclusion
So that's the article, habrachelovek. I hope it helped you look at interpretation, optimization, and compilation of programs from a new angle.
PS
Learn You a Haskell for Great Good!