λ-calculus. Part One: History and Theory

The idea, a short plan and links to the main sources for this article were given to me by the habrauseur z6Dabrata , for which many thanks to him.

UPD: some changes have been made to the text to make it more understandable. Semantic component remained the same.

Introduction

Perhaps this system will find applications not only
in the role of logical calculus. ( Alonzo Church, 1932 )

')
Generally speaking, lambda calculus does not apply to subjects that "every self-respecting programmer should know." This is such a theoretical thing, the study of which is necessary when you are going to study type systems or want to create your own functional programming language. However, if you have a desire to understand what is at the heart of Haskell, ML and the like, “move the assemblage point” to write code or simply expand your horizons, then I ask for cat.

We begin with a traditional (but brief) history. In the 1930s, the so-called resolution problem ( Entscheidungsproblem ), formulated by David Gilbert, faced mathematicians. Its essence is that here we have some kind of formal language in which you can write a statement. Is there an algorithm that determines its truth or falsity in a finite number of steps? The answer was found by two great scientists of the time, Alonzo Church and Alan Turing . They showed (the first one with the help of the λ-calculus invented by him, and the second with the theory of the Turing machine) that for the arithmetic of such an algorithm does not exist in principle, i.e. Entscheidungsproblem is generally undecidable.

So the lambda calculus for the first time loudly declared itself, but for another couple of decades it continued to be the domain of mathematical logic. So far, in the mid-60s, Peter Landin did not notice that a complex programming language is easier to learn by formulating its core in the form of a small basic calculus, expressing the most essential mechanisms of the language and supplemented by a set of convenient derivative forms, the behavior of which can be expressed by translating into basic calculus language. Landin used the Church's lambda calculus as such a basis. And wrap it all ...

λ-calculus: basic concepts

Syntax

At the core of lambda calculus is the notion now known to every programmer, an anonymous function. It has no built-in constants, elementary operators, numbers, arithmetic operations, conditional expressions, cycles, etc. - only functions, ~~only hardcore~~ . Because lambda calculus is not a programming language, but a formal apparatus capable of determining in its terms any language construct or algorithm. In this sense, it is consonant with the Turing machine, only corresponds to the functional paradigm, not the imperative one.

We will consider its most simple form: pure untyped lambda calculus, and this is what exactly will be at our disposal.

Terms :

variable:	`x`
lambda abstraction (anonymous function):	`λx.t` , where `x` is the argument of the function, `t` is its body.
function application (application):	`fx` , where `f` is a function, `x` is the value of the argument substituted into it

Operations Priority Agreements :

The use of the function is left associative. Those. stu is the same as (st) u
Application (application or function call in relation to a given value) takes everything that it reaches. Those. λx. λy. xyx λx. λy. xyx means the same as λx. (λy. ((xy) x)) λx. (λy. ((xy) x))
Brackets clearly indicate the grouping of actions.

It may seem as if we need some special mechanisms for functions with several arguments, but in reality this is not the case. Indeed, in the world of pure lambda calculus, the value returned by a function can also be a function. Therefore, we can apply the original function to only one of its arguments, “freezing” the others. As a result, we obtain a new function from the “tail” of the arguments, to which the previous argument applies. Such an operation is called currying (after that Haskell Curry ). It will look something like this:

`f = λx.λy.t`	Function with two arguments `x` and `y` and the body `t`
`fvw`	Substitute the values of `v` and `w` in `f`
`(fv) w`	This entry is similar to the previous one, but the parentheses clearly indicate the sequence of substitutions
`((λy.[x → v]t) w)`	Substitute `v` instead of `x` . `[x → v]t` means "the body `t` , in which all occurrences of `x` replaced by `v` "
`[y → w][x → v]t`	Substitute `w` instead of `y` . Conversion complete.

And finally, a few words about the scope . A variable x is called bound if it is in the body t λ-abstraction λx.t If x not bound by any overlying abstraction, then it is called free . For example, the occurrences of x in xy and λy.xy free, and the occurrences of x in λx.x and λz.λx.λy.x(yz) are related. In (λx.x)x first occurrence of x is connected, and the second is free. If all variables in a term are connected, then it is called closed , or a combinator . We will use the following simplest combinator ( the identity function ): id = λx.x It does not perform any actions, but simply returns its argument without changes.

Calculation process

Consider the following term application:

(λx.t) y

Its left side - (λx.t) is a function with one argument x and the body t . Each step of the calculation will be to replace all occurrences of the variable x inside t with y . The term-application of this type is named redex (from reducible expression , redex - “abbreviated expression”), and the operation of rewriting redex in accordance with the specified rule is called beta-reduction .

There are several strategies for choosing Redex for the next calculation step. We will consider them on the example of the following term:

(λx.x) ((λx.x) (λz. (λx.x) z)) ,

which for simplicity can be rewritten as

id (id (λz. id z))

(recall that id is a function of an identity of the form λx.x )

This term contains three redexes:

Full β-reduction . In this case, each time the redex inside the calculated term is chosen arbitrarily. Those. Our example can be computed from internal redex to external:
Normal computation order . The leftmost, most external redex is always reduced first.
Call by name . The order of calculations in this strategy is similar to the previous one, but the prohibition on carrying out cuts inside the abstraction is added to it. Those. in our example, we stop at the penultimate step:

An optimized version of such a strategy ( calling as needed ) is used by Haskell. This is the so-called "lazy" calculations.
Call by value . Here the abbreviation starts with the leftmost (external) Redex, whose value in the right side is a closed term, which cannot be calculated further.

For pure lambda calculus, such a term is λ-abstraction (function), and in richer calculi it can be constants, strings, lists, etc. This strategy is used in most programming languages when all arguments are first evaluated, and then all put together into a function.

If there are no more redexes in the term, then they say that it is calculated , or is in normal form . Not every term has a normal form, for example (λx.xx)(λx.xx) at each step of the calculation will generate itself (here the first bracket is an anonymous function, the second is the value substituted into it in place of x ).

The disadvantage of the call-by-value strategy is that it can loop and not find the existing normal value of a term. Consider for example the expression

(λx.λy. x) z ((λx.xx)(λx.xx))

This term has the normal form z despite the fact that its second argument does not possess such a form. On its calculation, the call-by-value strategy will hang, while the call-by-name strategy will start from the outermost term and determine there that the second argument is not needed in principle. Conclusion: if redex has a normal form, then the "lazy" strategy will definitely find it.

Another subtlety is associated with the naming of variables. For example, the term (λx.λy.x)y after substitution is calculated in λy.y Those. because of the coincidence of the names of the variables, we obtain the identity function where it was not originally intended. Indeed, if we call a local variable not y , the initial term z would have the form (λx.λz.x)y and, after reduction, would look like λz.y To exclude ambiguities of this kind, it is necessary to clearly monitor that all free variables from the initial term remain free after the substitution. For this purpose, α-conversion is used — renaming a variable into an abstraction in order to eliminate name conflicts.

It also happens that we have an abstraction λx.tx , and x free entries in the body t . In this case, this expression will be equivalent to just t . This conversion is called η-conversion .

This concludes the introductory in lambda calculus. In the next article, we will deal with the purpose for which everything was started: programming on λ-calculus.

List of sources

Source: https://habr.com/ru/post/215807/

All Articles