So you still don't understand Hindley-Milner? Part 2

In Part 1, we talked about what building blocks are needed to formalize Hindley-Milner, and in this post we specify their definitions and formulate the formalization as a whole:

Formalization of the concept of expression

We will give a recursive definition of what an expression is; in other words, we will define most of the types of expressions, describe how to create new, more complex expressions, based on existing ones, and show that only valid expressions can be created in this way.

Variables are valid expressions.
If e is an arbitrary expression, and x is an arbitrary variable, then λx.e is an expression. Here it helps to think about e as an ordinary (but not necessarily) complex expression that includes x , i.e. x ² +2 , and then about λx.e as an anonymous function, which takes input x and returns the result of evaluating the expression e for the given value x . Simply put, think of it as
```
 function(x) { return x^2 + 2; } 
```
if f and e are valid expressions, then f(e) also valid. It is called an “overlay” ( Application ) for obvious reasons.
If x is a variable, and e ₁ and e ₀ are valid expressions, then replacing every occurrence of x in e ₀ with e ₁ will also give a valid expression. That is, if, for example, e ₁ = x ² +2 and e ₀ = y/3 , then putting x = e ₀ in e ₁ , we get the expression (y/3) ² +2 .
[NB: The last item is redundant and is not officially included in the definition of lambda calculus for expressions, since the substitution of e ₀ as x in e _{1 is} equivalent to using the abstraction λx. e ₁ to e ₀ . It is added only to support so-called let polymorphism .]

Nothing is more than a valid expression.
')
Aside: anyone who pays enough attention to this question will be surprised: “Just a minute, how can I make some useful expression out of this? How can I get x ² +2 (or at least 2), based on the above? Damn, what about zero? There is no line in these rules, which obviously leads to the expression 0 ". The solution in this case is to create an expression in lambda calculus that behaves like 0.1, ..., +, ×, -, / with the correct interpretation. In other words, we should encode numbers, arithmetic operations, strings, etc. into patterns that we can create using lambda syntax. In this post there is a small, but very good section on numbers and arithmetic operations. This is an interesting feature of lambda calculus: we have an elementary syntax that we can recursively define with four simple points, but it allows us to inductively prove many things in four basic steps, since the language itself has expressive power to write numbers, strings and all types operations that we may need.

Formalization of type statements

Let e be any expression such that " e " is a variable in our meta-language, denoting any expression from our base language. For example, such as any of the following:

 x Math.pow(x,2) [1,2].forEach ( function(x) { print(x); } )

Then, if t is an arbitrary type, then we can express " e has type t " through
e:t

Like e , t is a variable in our meta-language that can correspond to any type in the base language (int, string, etc.). And just like for e , for t compliance with any particular type is not required.

We can give a formal definition of what is considered a type, as we did above for expressions. However, the abstraction in this case is quite intricate, so we omit this moment. I simply draw your attention to two key points that should be kept in mind:

If s and t are types, then t → s; Is the type of the function with t at the input and s at the output
If r is a type, possibly composed of other types (just as t → s is composed of t and s, each of which, in turn, can potentially be represented as a composition of some other types), and α is a variable of this type, then ∀ α.r is a type.
Without an example, the above sounds a bit pointless:
```
 function (x) { return x; } 
```
This function is of type String → String . Or Int → Int . In fact, for any type t its type is t → t . We will say that it is of type ∀ tt → t . Each of the types String → String or t → t is a “monotype”. ∀ tt → t is a “polytype”. The function identical to that written above has an abstract polytype ∀ tt → t , which in practice means that for any real type t it has type t → t . Putting all of the above together, we get just such a compact summary record:
λ xx: ∀ α.α → α

Formalization of statements about statements about types

Now we want to formalize a branch of rules on how we can proceed from the knowledge of some expressions and their types to the derivation of the types of a greater number of expressions. Remember how sentence calculus formulates Modus Ponens? We are going to do something similar. For example, suppose we want to formalize the following part of the reasoning:

Suppose I am already able to infer that the variable duck is of type Animal .
Moreover, suppose that I deduced that speak is a method of type Animal -> String .
Then I can deduce that speak(duck) is of type String .

And any reasoning that fits into such a form is a valid type inference.

We formalize it as follows:

This rule is called [App] (for overlays), and it is one of those that are present in this issue on StackOverflow . We will talk about him and the remaining rules in the next post. In the meantime, let's deal with all the characters that we met above:

Γ - corresponds to a set of positions that we already know, or maybe we can assume. More generally, Γ is just a reflection on a certain collection of positions (on expressions and their types). And naturally, there is nothing special about the letter Γ - capital Greek letters are often used to designate sets of positions.
⊢ - “turnstile”, meaning that something can be displayed. For example, Γ ⊢ x:t means that if we take a statement from Γ as an assumption / axiom / existing knowledge, then we can infer that x is of type t .
∈ - "epsilon", means belonging to something. x:t ∈ Γ says that the statement x:t belongs to Γ.
That long horizontal line. This line tells us that we can draw conclusions, located in the denominator, if we take the numerator as the original premise. Which allows us to express things like: "If we can derive this and this, then we can also derive this and that." For example:

If we can infer that y has type σ from Γ, then we can infer that x has type τ from Γ.

Next step:

Part 3 , in which we will put everything together and get an idea of the rules used in the HM algorithm.

Translator's note: I would be very grateful for any comments in PM about the translation.

Source: https://habr.com/ru/post/185686/

All Articles

So you still don't understand Hindley-Milner? Part 2

Formalization of the concept of expression

Formalization of type statements

Formalization of statements about statements about types

More articles: