Interpretation at compile time, or Alternative understanding of lambda in C ++ 11

Yo dawg, I heard you like programming. So you can program while you program

On Habré , another article about computing on C ++ templates from HurrTheDurr recently skipped . In the comments to her personally, I saw the challenge:

> With each new release, the number of ways to nontrivially dislocate your brain with C ++ continues to increase)
Quoted1>> Especially if you do not change the approach to the implementation of the playing field and continue to try to perform all calculations not on constants, but on types.

Is it so difficult to write a universal calculator on types, more convenient for programming than a cellular automaton? As it turned out, easy; I spent 30 times more time on this article than on writing and debugging the actual code of the calculator.
')
AveNat slightly earlier published an introduction to lambda calculus in two parts , so the inspiration came instantly. I wanted to be able to (figuratively) write like this:

#include <iostream> #include <LC/kernel.h> #include <LC/church_numerals.h> int main() { //      - typedef ChurchEncode<2> Two; // 2 = λfx.f (fx) typedef ChurchEncode<3> Three; // 3 = λfx.f (f (fx)) // * = λab.λf.a (bf) typedef Lambda<'a', Lambda<'b', Lambda<'f', Apply<Var<'a'>, Apply<Var<'b'>, Var<'f'> > > > > > Multiply; //  (* 2 3) typedef Eval<Apply<Apply<Multiply, Two>, Three>> Output; //    -    typedef ChurchDecode<Output> Result; std::cout << Result::value; }

And at the output to receive such:

 ilammy@ferocity ~ $ gcc cpp.cpp ilammy@ferocity ~ $ ./a.out 6

The article turned out to be somewhat large, as I wanted to tell you about all the interesting things that are used here. And yet it requires a basic set of knowledge about lambda calculus. The above reviews, average C ++ knowledge (with templates), and common sense should be enough to understand the content.

Under the cut there is another commented constructive proof of the Turing completeness of the C ++ templates in the form of a compile-time interpreter of the typeless lambda calculus (plus cookies in the form of macros and recursion).

(Since the C ++ 03 compilers require < asymmetric < angular < brackets < in < templates>>>>>, and in most cases I am confused by their appearance, then either C ++ 11 support or sed ':l;s/>>/> >/g;tl' necessary for compilation sed ':l;s/>>/> >/g;tl' . With the exception of the macro part, the code does not use any other C ++ 11 features.)

Syntax

Go. As you know, terms in lambda calculus are of three types:

	variable reference	v - variable name
	abstraction	v - variable name, B - term
	application	f and x - terms

All of these constructs must be represented using C ++ templates. The values in the template language are types. Each of these types must carry information about its components. Declaring them, we introduce the axiom "the following expressions exist":

 template <class Var> struct Ref; //  template <class Var, class Exp> struct Lam; //  template <class Fun, class Arg> struct App; //

However, variables in lambda calculus are not full values. The variable v does not make sense by itself, without an abstraction associating it with any value. Therefore, a term is a reference to a variable , and not the variable itself. So the definition can be slightly simplified, making it easier to write programs:

 template <char Var> struct Ref; template <char Var, class Exp> struct Lam; template <class Fun, class Arg> struct App;

We used the possibility of templating using integer values. That's just the values of the char can also be written as strings, which is very convenient: you do not need to declare each variable used in the program in advance. At the same time, this clearly shows that the variables themselves are not values (that is, C ++ types).

Fun fact

Variable names can be made up of several characters: 'foo' . However, apologists for lambda calculus consider it a luxury. Secondly, the value of such symbolic literals is left to the discretion of the compiler ( 2.14.3 / 1 ), which in rare cases can lead to name collisions.

Now we can write with the help of templates any lambda terms!

 typedef Lam<'f', App<Lam<'x', App<Ref<'f'>, App<Ref<'x'>, Ref<'x'>>>>, Lam<'x', App<Ref<'f'>, App<Ref<'x'>, Ref<'x'>>>>>> Y;

What's next?

Knowing how to write terms, you need to learn how to calculate them. Here everything is not so simple, so let's apply a fairly well-known approach to writing hellish magic on templates: first write everything in some declarative programming language, then just fix the syntactic differences. I personally prefer Scheme, so I will use it. With equal success, any other functional language like Haskell can be applied. (And even better - Prolog.)

We will not reinvent the wheel to record the terms themselves, but use traditional lists:

 ; Y = λf.(λx.f (xx)) (λx.f (xx)) (define Y '(lambda (f) ((lambda (x) (f (xx))) (lambda (x) (f (xx))) )))

Fix the lambda calculus syntax:

 ;  —  , ,  . (define (term? exp) (or (ref? exp) (lam? exp) (app? exp)) ) ;   . (define (ref? exp) (symbol? exp)) ;  —     . (define (lam? exp) (and (list-of? 3 exp) ;   —   lambda. (eqv? 'lambda (first exp)) ;  —     . (list-of? 1 (second exp)) (symbol? (first (second exp))) ;   —  . (term? (third exp)) ) ) ;  —     . (define (app? exp) (and (list-of? 2 exp) (term? (first exp)) (term? (second exp)) ) )

Next, we define the function eval , which will interpret the terms, distinguishing them by syntax:

 (define (eval exp) (cond ((ref? exp) (eval-ref exp)) ((lam? exp) (eval-lam (first (second exp)) (third exp))) ((app? exp) (eval-app (first exp) (second exp))) (else (error "eval: syntax error" exp)) ) )

Fine. But how to implement eval-ref ? Where does the interpreter know the value of a variable? For this there is such a thing as environment . Connections between variables and their values are preserved in environments. Therefore, eval actually looks like this - with an additional argument:

 (define (eval exp env) (cond ((ref? exp) (eval-ref exp env)) ((lam? exp) (eval-lam (first (second exp)) (third exp) env)) ((app? exp) (eval-app (first exp) (second exp) env)) (else (error "Syntax error" exp)) ) )

Now, the calculation of the reference to a variable is simply determined - this is a search for the value of a variable in the environment:

 (define (eval-ref var env) (lookup var env) )

The value of the abstraction must be an anonymous function of one argument. This function will cause the application when the time comes. The meaning of abstraction is to calculate your body exp in an environment where the abstraction variable var has the value of the argument passed to arg . The bind function will take care of creating such an environment.

 (define (eval-lam var exp env) (lambda (arg) (eval exp (bind var arg env)) ) )

The values of the other variables may vary, but specifically in lambda calculus it is assumed that they remain the same as in the place where abstraction is defined (that is, they were taken from the env environment). Because of the property of preserving the original environment, such functions are called closures .

Finally, an application is the application of the value of the abstraction fun to the value of the argument arg .

 (define (eval-app fun arg env) ((eval fun env) (eval arg env)) )

Here, first the abstraction and its argument are calculated, and then the call is made. Accordingly, eval performs a reduction of lambda terms in the applicative order (with a call by value).

It remains only to define a couple of functions for working with environments. lookup to find the value of a variable in the environment and bind to create new environments:

 (define (bind var val env) ;     : (cons (cons var val) env) ) ; (bind 'x 1 '((y . 2))) ===> ((x . 1) (y . 1)) ; (lookup 'x '((x . 1) (y . 2))) ===> 1 (define (lookup var env) ; (lookup 'y '((x . 1) (y . 2))) ===> 2 (let ((cell (assq var env))) ; (lookup 'z '((x . 1) (y . 2))) ===> #<ERROR> (if cell (cdr cell) (error "lookup: unbound variable" var) ) ) )

Super! We have a lambda calculator interpreter written in pure functional style. ( All source .) And it even works:

 (eval '((lambda (x) x) B) (bind 'B 42 '()) ) ; ===> 42

But what kind of brackets? And where does C ++ come from?

Looking at the source code of the interpreter in Scheme, it is relatively easy to guess how to write an interpreter of lambda terms on C ++ templates. Let me remind you that the terms themselves are recorded in the following patterns:

 template <char Var> struct Ref; template <char Var, class Exp> struct Lam; template <class Fun, class Arg> struct App;

Template functions (not those)

The interpreter function is called Eval . Since there are no functions in templates, we will have to do with templates alone:

 template <class Exp, class Env> struct Eval;

The invocation of such a function is the instantiation of the template: Eval<Exp, Env> . The return value of this call must be a C ++ type. Let's agree that Eval defines inside of itself a type of value , if its value is defined for the arguments passed:

 typename Eval<Exp, Env>::value

Such code allows to get the value of the Eval call with Exp and Env arguments (in the form of the value type). If a particular call is erroneous (does not make sense), then the value type will not be defined and we will get a compile-time error. Other "template functions" are defined in the same way.

Eval and Apply

Now, using partial template specialization, we can declaratively describe the behavior of Eval . For example, calculating a variable reference is a search for the value of a variable in the environment using the Lookup function (it returns a value in result ):

 template <char Var, class Env> struct Eval<Ref<Var>, Env> { typedef typename Lookup<Var, Env>::result value; };

The result of the abstraction calculation is the closure represented by the Closure template type. A closure stores in itself an (anonymous) function and the environment for defining this function:

 template <char Var, class Exp, class Env> struct Eval<Lam<Var, Exp>, Env> { typedef Closure<Lam<Var, Exp>, Env> value; };

The result of the evaluation of the application is the application of the calculated closure to the calculated argument (performed by the Apply function):

 template <class Fun, class Arg, class Env> struct Eval<App<Fun, Arg>, Env> { typedef typename Apply<typename Eval<Fun, Env>::value, typename Eval<Arg, Env>::value>::result value; };

Thus, Eval defined only for Ref , Lam , App (and environments by the second argument). Eval calls with other arguments are simply not compiled.

Go ahead. Closures are simply interpreter data structures that store two values (a function and its definition environment). Naturally, they are implemented by one more pattern:

 template <class Abs, class Env> struct Closure;

The whole essence of lambda calculus is concentrated in the definition of the Apply function. Closures are calculated in the environment of its definition, which Bind extends by binding the argument of the calculated function to its actual value:

 template <class Fun, class Arg> struct Apply; template <char Var, class Exp, class Env, class Arg> struct Apply<Closure<Lam<Var, Exp>, Env>, Arg> { typedef typename Eval<Exp, Bind<Var, Arg, Env>>::value result; };

(Note that, in principle, Apply can be defined for anything at all, not just for the application of abstractions.)

Lookup and Bind

It remains to deal with environments. For starters, it would be nice to have an empty environment:

 struct NullEnv;

Next, you need to implement an environment extension mechanism using Bind . This data type defines a new environment in which the variable Var is associated with the value of Val , and the values of the other variables are determined by the environment Env :

 template <char Var, class Val, class Env> struct Bind;

Got a kind of coherent list on the templates.

Finally, we need to be able to find the value of the desired variable in this list — it will be in the first element of the list with the same name. The Lookup function performs the search:

 template <char Var, class Env> struct Lookup;

There is nothing in empty surroundings. If the desired value is in the current environment, then return it. Otherwise, we recursively scan other environments:

 template <char Var> struct Lookup<Var, NullEnv>; template <char Var, class Val, class Env> struct Lookup<Var, Bind<Var, Val, Env>> { typedef Val result; }; template <char Var, char OtherVar, class Val, class Env> struct Lookup<Var, Bind<OtherVar, Val, Env>> { typedef typename Lookup<Var, Env>::result result; };

the end

So in 50 lines, we completely defined the syntax and semantics of lambda calculus using C ++ templates, which proves the Turing completeness of the C ++ template disclosure mechanism (assuming unlimited amount of available memory).

The definition of a Turing machine ( English ) can be crammed into about the same amount of strings, but it will still be more verbose due to a more complex structure. The definition of mu-recursive functions ( English ) will probably be shorter, but not much. It would be interesting to look at the implementation of the Markov algorithms, but I did not find it, it is difficult to estimate the size of the source code, but it’s too lazy to write.

Okay, let's try to carry out the simplest calculation, using the obtained features ( full code ):

 /* ,   */ #include <iostream> int main() { // 1 = λfx.fx typedef Lam<'f', Lam<'x', App<Ref<'f'>, Ref<'x'>>>> One; // 2 = λfx.f (fx) typedef Lam<'f', Lam<'x', App<Ref<'f'>, App<Ref<'f'>, Ref<'x'>>>>> Two; // + = λab.λfx.af (bfx) typedef Lam<'a', Lam<'b', Lam<'f', Lam<'x', App<App<Ref<'a'>, Ref<'f'>>, App<App<Ref<'b'>, Ref<'f'>>, Ref<'x'>>> >>>> Plus; // Output := (+ 1 2) typedef Eval<App<App<Plus, One>, Two>, NullEnv>::value Output; // --...    ? Output::invalid_field; }

 ilammy@ferocity ~/dev/tlc $ g++ -std=c++11 lc.cpp lc.cpp:   «int main()»: lc.cpp:79:5: :   «Output {aka Closure<Lam<'f', Lam<'x', App<App<Ref<'a'>, Ref<'f'> >, App<App<Ref<'b'>, Ref<'f'> >, Ref<'x'> > > > >, Bind<'b', Closure<Lam<'f', Lam<'x', App<Ref<'f'>, App<Ref<'f'>, Ref<'x'> > > > >, NullEnv>, Bind<'a', Closure<Lam<'f', Lam<'x', App<Ref<'f'>, Ref<'x'> > > >, NullEnv>, NullEnv> > >}»      Output::invalid_field; ^

In general, this way of executing programs is also quite good, but I would like to be more comfortable.

Expansion of multiple terms

First, the closure of the pattern interpreter is its internal data structures. Only C ++ types correspond to them, but not values. It is necessary to work with them inside the interpreter itself and never take it beyond the limits of the template mechanism. (Therefore, they are intentionally left undefined types.)

Secondly, it is obvious that in the case of presenting the arguments in the form of Church numbers, the result of their addition will also be the Church number — a function of two arguments that applies the first argument to the second one N times. (That's why we got the closure at the output, as gcc output says.) But what do we do with this function? After all, as arguments, we can transfer only the same functions to it!

Indeed, now our interpreter understands only pure lambda calculus, in which there are only abstractions, applications and variables (which refer to abstractions or applications). The syntax makes it possible to compose lambda terms solely from these three components. Any violation of this rule results in a compilation error.

In order to be able to decipher the results of calculations, it is necessary to use the applied lambda calculus - in it the set of terms is extended by elements of some objective set. In our case, this will be a set of C ++ data types.

We introduce the corresponding term for them:

 template <class T> struct Inj;

It denotes an injection of type T in the set of terms of lambda calculus.

After expanding the syntax, it is necessary to clarify the semantics of the language — define with the help of Eval value of the new syntactic structure. Well ... since T is an arbitrary value, then Eval knows only that such a value exists. The only meaning that can be given to functions in such conditions is the identity:

 template <class T, class Env> struct Eval<Inj<T>, Env> { typedef T value; };

Now we can pass as arguments the numbers (represented by types):

 struct Zero { static const unsigned int interpretation = 0; };

It remains to figure out how to pass a function, and we can turn Church numbers into normal int numbers. After all, if you apply the increment function to zero N times, then you end up with a natural number N.

Imagine that we somehow managed to do this by passing the Succ (successor) and Zero types to the interpreter. Let us follow what happens when calling such a function:

 Eval<App<Inj<Succ>, Inj<Zero>>, Env>::value Apply<Eval<Inj<Succ>, Env>, Eval<Inj<Zero>, Env>>::result Apply<Succ, Zero>::result

Bingo! To determine the behavior of Succ you need to specialize Apply for it! (Such conversions are called delta rules .)

For example, this is how the increment function is defined:

 struct Succ; template <class N> struct Apply<Succ, N> { typedef struct _ { static const unsigned int interpretation = N::interpretation + 1; } result; };

The return value of the Apply must be a type declared as result . Therefore, the result of the increment is a data type that is structurally identical to the Zero described above. This allows natural numbers to be represented by non-template data types, while retaining the ability to get a regular int with the corresponding value.

Now you can finally output the result of the addition! ( Full code .)

 /* ,   */ #include <iostream> int main() { // 1 = λfx.fx typedef Lam<'f', Lam<'x', App<Ref<'f'>, Ref<'x'>>>> One; // 2 = λfx.f (fx) typedef Lam<'f', Lam<'x', App<Ref<'f'>, App<Ref<'f'>, Ref<'x'>>>>> Two; // + = λab.λfx.af (bfx) typedef Lam<'a', Lam<'b', Lam<'f', Lam<'x', App<App<Ref<'a'>, Ref<'f'>>, App<App<Ref<'b'>, Ref<'f'>>, Ref<'x'>>> >>>> Plus; // Sum = (+ 1 2) typedef App<App<Plus, One>, Two> Sum; // Result := (Sum +1 =0) typedef App<App<Sum, Inj<Succ>>, Inj<Zero>> Output; typedef Eval<Output, NullEnv>::value Result; std::cout << Result::interpretation; }

 ilammy@ferocity ~/dev/tlc $ g++ -std=c++11 lc.cpp ilammy@ferocity ~/dev/tlc $ ./a.out 3

Global environment and free variables

I think you have already noticed that in main() we always call Eval with an empty environment, and all the necessary abstractions are inline. But this is not at all necessary. If we pass some environment to the first call of Eval , then it will play the role of a global one for the term being calculated: it is this that sets the values of free variables - those that are not connected with anything by lambda abstractions.

However, it is impossible to just take and put the function-type in the environment. In the environments there are values , so first they need to be calculated ( full code ):

 /* ... */ #include <iostream> int main() { /*  One, Two, Plus  */ // Unchurch = λn.(n +1 =0),     int typedef Lam<'n', App<App<Ref<'n'>, Ref<'I'>>, Ref<'O'>>> Unchurch; // Result := (Unchurch (+ 1 2)) typedef Eval<App<Ref<'U'>, App<App<Ref<'+'>, Ref<'1'>>, Ref<'2'>>>, Bind<'+', typename Eval<Plus, NullEnv>::value, Bind<'1', typename Eval<One, NullEnv>::value, Bind<'2', typename Eval<Two, NullEnv>::value, Bind<'U', typename Eval<Unchurch, Bind<'I', Succ, Bind<'O', Zero, NullEnv>> >::value, NullEnv>>>> >::value Result; std::cout << Result::interpretation; }

Environments represent interpreter memory. If it were a compiler with a linker, then there would have to be compiled functions. In the case of the interpreter, they are in a “predicted” state — already passed through Eval .

You should also pay attention to the free variables Unchurch . They enter the environment without any Inj around. This is also because in the memory of the interpreter these values are represented in this way. Inj needed only for recording them in the text of programs (in lambda terms).

Macros

Has anyone not had enough to write the functions of several arguments manually to curry them? And all these constant Ref<'foo'> ?

And by the way, even in lambda calculus, convenient abbreviations have been adopted:

Before	After

Let's implement the same.

There are many approaches to the implementation of macros. For our case, it is worth choosing the simplest one: an external preprocessor. "External" means that macro definitions are outside the program being processed. That is, we will not introduce any new syntax for lambda calculus to express macros inside it; it would be too difficult. The macro processor is simply “set off” on the program and the output produces a pure lambda term - for example, the MOC also works in Qt.

Two phases

Up to this point in the life of our programs there was only one important event - the determination of their meaning with the help Eval. Now it will add more macro expansion with Expand. All that is served at the entrance Eval, you must first pass through Expand. We introduce a new convenient function Computethat combines these actions:

 template <class Exp> struct Expand; template <class Exp, class Env> struct Compute { typedef typename Eval<typename Expand<Exp>::result, Env>::value value; };

Expandaccepts only one argument. We assume that this is a black box: at the entrance a program with macros, at the exit - without them. In our simple case, we do not need any macroenvironment.

Implementation of macros

Now macros obviously need to be implemented inside Expand. We need some Lam_and App_that are disclosed as follows:

Before	After
`Lam_<'x', 'y', ..., 'z', B>`	`Lam<'x', Lam<'y', ..., Lam<'z', B>...>>`
`App_<A, B, C, ..., Z>`	`App<...App<App<A, B>, C>, ..., Z>`
`App_<'a', ...>`	`App_<Ref<'a'>, ...>`

In C ++ 11, templates of arbitrary arity appeared, which makes the task easier. Owners of C ++ compilers can only suffer and write a thousand and one specializations: Lam_2for two arguments, Lam_3for three, App_4for four, etc. Well, or pervert even harder and repeat everything shown below with the help of the native C ++ preprocessor .

Lam_

True, even the C ++ 11 templates have their limitations, so the syntax will have to be cut a little more. A bundle of arguments can only be the last argument of the template, so for this Lam_you need to enter a special "argument holder":

 template <char... Vars> struct Args; template <class Args, class Exp> struct Lam_; template <char Var, class Exp> struct Expand<Lam_<Args<Var>, Exp>> { typedef typename Expand<Lam<Var, Exp>>::result result; }; template <char Var, char... Vars, class Exp> struct Expand<Lam_<Args<Var, Vars...>, Exp>> { typedef Lam<Var, typename Expand<Lam_<Args<Vars...>, Exp>>::result> result; };

Pay attention to repeated calls Expandin the process of disclosure. They are necessary, as Expand<...>::result it must always be a pure lambda term without macros.

App_

Also, C ++ 11 templates do not allow to mix arguments-numbers and type-arguments in a bundle, so you App_will have two corresponding options. Pechalka.

Implementations App_s(for characters) and App_i(for types) are more voluminous, so the explanations and code are hidden under the spoiler.

Hidden text

. , -. . C++11 map , , . ( , map ? Anyone? )

 struct Nil; template <class First, class Rest> struct RefList;

, . Ref :

 template <char... Vars> struct ToRefList_s; template <char Var> struct ToRefList_s<Var> { typedef RefList<Ref<Var>, Nil> result; }; template <char Var, char... Vars> struct ToRefList_s<Var, Vars...> { typedef RefList<Ref<Var>, typename ToRefList_s<Vars...>::result> result; };

, . .

 template <class... Exps> struct ToRefList_i; template <class Exp> struct ToRefList_i<Exp> { typedef RefList<Exp, Nil> result; }; template <class Exp, class... Exps> struct ToRefList_i<Exp, Exps...> { typedef RefList<Exp, typename ToRefList_i<Exps...>::result> result; };

. App_wrap — ! RefList «» App . — «» App , — RefList .

RefList . . , ( Nil ).

 template <class Apps, class RefList> struct App_wrap; template <class A, class D, class R> struct App_wrap<Nil, RefList<A, RefList<D, R>>> { typedef typename App_wrap<App<A, D>, R>::result result; }; template <class Apps, class A> struct App_wrap<Apps, RefList<A, Nil>> { typedef typename App_wrap<App<Apps, A>, Nil>::result result; }; template <class Apps, class A, class D> struct App_wrap<Apps, RefList<A, D>> { typedef typename App_wrap<App<Apps, A>, D>::result result; }; template <class Apps> struct App_wrap<Apps, Nil> { typedef Apps result; };

? , .

Ultimately, we write the following simple definitions. They call the functions examined under the spoiler, which do all the dirty work, after which they themselves once again drive the result through Expand, so that the output does not have undisclosed macros.

 template <char... Exps> struct App_s; template <class... Exps> struct App_i; template <char... Exps> struct Expand<App_s<Exps...>> { typedef typename Expand< typename App_wrap<Nil, typename ToRefList_s<Exps...>::result >::result >::result result; }; template <class... Exps> struct Expand<App_i<Exps...>> { typedef typename Expand< typename App_wrap<Nil, typename ToRefList_i<Exps...>::result >::result >::result result; };

Rest

And last, but important addition. The expander must correctly handle embedded language constructs, expanding macros where they can meet:

 template <char Var> struct Expand<Ref<Var>> { typedef Ref<Var> result; }; template <char Var, class Exp> struct Expand<Lam<Var, Exp>> { typedef Lam<Var, typename Expand<Exp>::result> result; }; template <class Fun, class Arg> struct Expand<App<Fun, Arg>> { typedef App<typename Expand<Fun>::result, typename Expand<Arg>::result> result; };

Who noticed in the work Expandparallels with Evaland Applythat fellow.
( Full code with macros support.)

Recursion

Having said about the Turing completeness of the obtained calculator, we somehow avoided one moment, being satisfied with simple arithmetic. After all, Turing-complete system allows you to express cycles ! Consider cyclic calculations on the example ( drum roll ) factorial.

In lambda calculus, recursion cannot be expressed directly, since abstractions have no names. To record the recursion in its usual form, you need a magic operator Rec, which is similar to the usual abstraction Lam, but creates abstractions with an additional argument - a reference to the abstraction itself.

However, clever mathematicians found a way around this limitation: the so-called Y-combinatorallows to express recursive functions as solutions of functional equations of fixed points. What is it and how the previous sentence is deciphered can be read elsewhere, but now it is important that the Y-combinator is written like this:

 // Y = λf.(λx.f (xx)) (λx.f (xx)) typedef Lam<'f', App<Lam<'x', App<Ref<'f'>, App_s<'x', 'x'>>>, Lam<'x', App<Ref<'f'>, App_s<'x', 'x'>>>>> Y;

To express with its help the recursive function that calculates the factorial, you first need to determine the mathematical and logical operators necessary for it: multiplication, subtraction, comparison. Also, a more convenient way to write Church numbers does not hurt. All definitions can be viewed here (the last four sections), they are standard definitions of similar functions in lambda calculus. (If you have mastered reading up to this point, then there should be no problems with understanding their implementation.)

Ultimately, all the necessary definitions are associated with the appropriate symbols and fit into the environment StandardLibrary. Now you can beautifully and conveniently write the generating function for factorial:

 // F = λfn.if (= n 0) 1 // (* n (f (- n 1))) typedef Lam_<Args<'f', 'n'>, App_i<Ref<'?'>, App_s<'=', 'n', '0'>, Ref<'1'>, App_i<Ref<'*'>, Ref<'n'>, App_i<Ref<'f'>, App_s<'-', 'n', '1'>>> > > Fact_gen;

It differs from the usual unary one in that it takes an additional argument f— the function of calculating the factorial (already usual). The role of the Y-combinator is to Fact_gendetermine a function from the generating function so Factthat App<Fact_gen, Fact>≡ Factis called: “find a fixed point”.

Ok, let's try to apply all this together, calculating (YF 1) - factorial of one:

 /* ... */ #include <iostream> int main() { /*  Y  Fact_gen */ typedef Compute<App<Ref<'U'>, App_i<Y, Fact_gen, MakeChurch(1)> >, StandardLibrary>::value Result; std::cout << Result::interpretation; }

 ilammy@ferocity ~/dev/tlc $ g++ -std=c++11 lc.cpp 2>&1 | wc -c 64596

Not compiled. And the error log at 64 kilobytes. Why?

It's all about calculation. A normal Y-combinator is written based on the normal calculation order. In it, a piece f (xx)will first cause a substitution (xx)in the body f, and only then, if it is needed, the value (xx)will be calculated (also with a lazy substitution).

In the case of an applicative order (call by value), this expression is evaluated immediately, which obviously leads to an infinite loop (if you look at what the argument x should be equal to). For example, the Scheme lambda interpreter, shown earlier, loops.

If you decrypt the log spat out gcc, then it says there that:

 struct Apply< Closure< Lam<'x', App<Ref<'f'>, App<Ref<'x'>, Ref<'x'>>>>, Bind<'f', Closure<Fact_gen, StandardLibrary>, StandardLibrary > >, Closure< Lam<'x', App<Ref<'f'>, App<Ref<'x'>, Ref<'x'>>>>, Bind<'f', Closure<Fact_gen, StandardLibrary>, StandardLibrary > > >

does not define the type result, that is, this call does not make sense. The compiler saw and broke an infinite pattern substitution loop, which is undefined by the standard ( 14.7.1 / 15 ).

C ++ templates also perform calculations in the applicative order, because the function call typename Eval<Exp, Env>::valueis instantiation of the template. Instantiation Evalobviously requires instantiation Expand Env.

In languages with an applicative order of computation, a Z-combinator should be used - a modification of the Y-combinator, in which the expression is (xx)wrapped in an abstraction, which prevents its premature calculation:

 // Z = λf.(λx.f (λy.xxy)) (λx.f (λy.xxy)) typedef Lam<'f', App<Lam<'x', App<Ref<'f'>, Lam<'y', App_s<'x', 'x', 'y'>>>>, Lam<'x', App<Ref<'f'>, Lam<'y', App_s<'x', 'x', 'y'>>>>>> Z;

Now compilation errors are not visible, as well as its end. Obviously, this time we outwitted the compiler and made it infinitely recursively reveal something. It is reasonable to assume that this can only be a recursive call of factorial.

Stop!And when does the compiler have to stop substituting factorial for itself? I mean, when do we want him to stop doing that? In theory, the operator is responsible for this If: when the factorial argument is zero, you need to return one, and not try to make a recursive call. How is it determined If?

 typedef Lam_<Args<'c', 't', 'f'>, App_s<'c', 't', 'f'>> If;

As an abstraction. It seems to be all right, the standard definition for Boolean Church constants ... That's just it is also designed for the normal reduction order! In the applicative order, it Ifcalculates both branches immediately with the condition, and only after that does it make a choice.

The problem can be solved in a manner similar to the Z-combinator: wrap the deferred calculations into an abstraction. However, in the case of the Ifcondition to wrap just not necessary. Therefore, unfortunately, If cannot be a convenient function in applicative languages. However, it can be made a macro!

 template <class Cond, class Then, class Else> struct If; template <class Cond, class Then, class Else> struct Expand<If<Cond, Then, Else>> { typedef typename Expand< App_i<Cond, Lam<'u', Then>, Lam<'u', Else>, Lam<'u', Ref<'u'>> > >::result result; };

Strictly speaking, a variable 'u'must not coincide with any free variable Thenor Else. Our macrosystem does not provide such an opportunity (hygiene). In general, we have a very limited number of variable names. Therefore, we will reserve an identifier 0as a non-matching identifier:

 template <char Var> struct Lookup<Var, NullEnv>; template <class Env> struct Lookup<0, Env>; template <char Var, class Val, class Env> struct Lookup<Var, Bind<Var, Val, Env>> { typedef Val result; };

Now finally factorial will work. ( Full code .)

 ilammy@ferocity ~/dev/tlc $ time g++ -std=c++11 lc.cpp -DARG=6 real 0m12.630s user 0m11.979s sys 0m0.466s ilammy@ferocity ~/dev/tlc $ ./a.out 720

Unfortunately, I could not wait for the calculation of factorial seven, let alone that on 32-bit systems the compiler simply dies from a stack overflow. But still!..

Conclusion

There is probably no practical engineering benefit from this interpreter, but the idea itself is amazing. C ++ templates are from “inadvertently turing-full” things in theoretical computer science. A similar feeling came to me only when I learned that the paging memory management subsystem of x86 processors is also Turing-complete. If this interpreter allows you to perform calculations without performing any C ++ operator, then the MMU allows you to perform calculations without performing any machine program instructions.

Unfortunately, I have finished reading DCPL so far only up to the eighth chapter, therefore writing the interpreter of a typed lambda calculus is left to the readers as an exercise. I still have too weak mathematical preparation for this.

PS , , : « , ».

Source: https://habr.com/ru/post/218341/

All Articles