Dependent Types - The Future of Programming Languages

Hello!

Despite the outlandishness and some distraction of the topic under consideration today - we hope that she will be able to diversify your weekend. At the end of the post we put three links from the author, allowing you to get acquainted with dependent typing in Idris, F * and JavaScript

Sometimes it seems as if programming languages have hardly changed since the 60s. When they talk to me about it, I often remember how many cool tools and opportunities we now have, and how they simplify life. Offhand: it’s integrated debuggers, unit tests, static analyzers, cool IDEs, typed arrays and much more. The development of languages is a long and progressive process, and there are no such “silver bullets” with the appearance of which the development of languages would change once and for all.

Today I want to tell you about one of the final stages in this ongoing process. The technology, which will be discussed, is being actively explored, but everything indicates that it will soon take root in mainstream languages. And our story begins with one of the most fundamental concepts in computer science: with types .
')

World of types

Typification is one of those things that are so inseparable from our thinking that we hardly even think about the concept of types as such? Why is 1 - this is int , but if you only put this value in quotes - and it turns into a string ? What is “type”, in essence? As is often the case in programming, the answer depends on the wording of the question.

Types are diverse. In some type systems, there are very clear boundaries between types and values. So, 3, 2 and 1 are values of type integer , but integer is not a value. This construct is “embedded” in the language and is fundamentally different from the meaning. But, in fact, such a distinction is not necessary and can only limit us.

If you release the types and turn them into another category of values, a number of amazing possibilities open up. Values can be stored, converted and passed to functions. Thus, it would be possible to make a function that takes a type as a parameter, creating generic functions: those that can work with many types without overloads. You can have an array of values of a given type, rather than doing strange pointer arithmetic and type casting, as you have to do in C. You can also collect new types as the program runs and provide features such as, for example, JSON automatic deserialization. But, even if you interpret types as values, all the same with types you will not do all that you can do with values. So, using user instances, you can, for example, compare their names, check their age or identifier, etc.

 if user.name == "Marin" && user.age < 65 { print("You can't retire yet!") }

However, if you tried to do the same with the User type, you could only compare type names and, possibly, property names. Since this is a type, not an instance, you cannot check the values of its properties.

 if typeof(user) == User { print("Well, it's a user. That's all I know") }

How cool would it be if we had a function that could only get a non-empty list of users? Or a function that would accept an email address only if it was recorded in the correct format? For these purposes, you would need the types of "non-empty array" or "email address". In this case we are talking about a type that depends on the value, i.e. about dependent type . In mainstream languages, this is not possible.

For the types to be used, the compiler must check them. If you claim that the variable contains an integer, then it would be better if it did not have a string , otherwise the compiler will swear. In principle, this is good, because it does not allow us to sabotage. It is very simple to check types: if the function returns an integer , and we are trying to return "Marin" in it, then this is an error.

However, with dependent types everything is more complicated. The problem is when exactly the compiler checks the types. How can he make sure that there are three values in the array, if the program is not even running yet? How to make sure that an integer is greater than 3 if it is not even assigned yet? To do this, there is magic ... or, in other words, mathematics . If you can mathematically prove that the set of numbers is always greater than 3, then the compiler can verify this.

Math in the studio!

Mathematical induction is used to formulate evidence. Induction allows one to unconditionally confirm the truth of some statement. For example, we want to prove that the following mathematical formula holds for any positive number:

 1 + 2 + 3 + ... + x = x * (x + 1) / 2

There are an infinite number of possible x, so it would take us a very long time to check all the numbers manually. Fortunately, this is not necessary. We just need to prove two things:

This statement is respected for the first number. (Usually it is 0 or 1)
If this statement is true for the number n , then it will be observed for the next number n + 1

Since the statement is observed for both the first number and all the following numbers, we know that it is true for all possible numbers.

Proving this is not difficult:

 1 = 1 * (1 + 1) / 2 1 = 1

Now we also have to prove that the statement holds for all other numbers. To do this, we assume that it works for some number n, and then we make sure that it also works for n + 1.

Assuming the following expression is true:

 1 + 2 + 3 + ... + n = n * (n + 1) / 2

Check it for n + 1 :

 (1 + 2 + 3 + ... + n) + (n + 1) = (n + 1) * ((n + 1) + 1) / 2

Thus, it is possible to replace "(1 + 2 + 3 + ... + n)" above equation:

 (n * (n + 1) / 2) + (n + 1) = (n + 1) * ((n + 2) / 2)

and simplify to

 (n + 1) * (n/2 + 1) = (n + 1) * (n/2 + 1)

Since both parts of the expression are equal, we are convinced that this statement is true. This is one of the ways in which you can check the truth of statements without manually calculating each case, and dependent types work on the basis of this principle. You write a mathematical statement to make sure that the thesis about the type is correct.

The beauty of this approach lies in the fact that any mathematical proof can be made in the form of a computer program - and this is exactly what we need!

Back to programming

So, we have established that some things can be proved first, and then proceed to specific values. To do this in a programming language, you need a way to express these statements in code that will be written to the type system itself, that is, the type system needs to be improved.

Consider an example. Here we have the append function that takes two arrays and combines them. Typically, the signature of such a function would look something like this:

 append: (arr1: Array, arr2: Array) -> Array

However, just by looking at the signature, we cannot be sure of the correct implementation. The fact that the function returns an array does not mean that it has done something. One way to check the result is to make sure that the length of the resulting array is equal to the sum of the lengths of the parameter arrays.

 newArray = append([1], [2, 3]) assert(length(newArray) == 3)

But why check it at runtime if you can create a constraint that will be checked at program compilation time:

 append: (arr1: Array, arr2: Array) -> newArray: Array where length(newArray) == length(arr1) + length(arr2)

We declare that append is a function that takes two Array arguments and returns a new Array argument, which we called newArray . Only this time we add a clause that the length of the new array should be equal to the sum of the lengths of all the arguments of the function. The statement we had above at runtime is converted to a type at compile time.

The above code refers to the world of types, not values, that is, the == sign indicates a comparison of the return type of length rather than its value. For this mechanism to work, the return type length must give us any information about the actual number.

To ensure the operation of such a mechanism, you need to make sure that each number is of a separate type. Type One can contain only one value: 1. The same applies to Two, Three, and all other numbers. Naturally, such work is very tiring, but it is for this kind of work that we have programming. You can write a compiler that will do it for us.

By doing this, you can create separate types for arrays containing 1, 2, 3, and a different number of elements. ArrayOfOne , ArrayOfTwo , etc.

Thus, you can define the length function, which will accept one of the above types of arrays and have a dependent return type of One for ArrayOfOne , Two for ArrayOfTwo , etc. for every number.

Now that we have a separate type for any particular array length, we can be sure (at compile time) that both arrays are of equal length. To do this, compare their types. And since types are the same values as any others, you can assign operations to them. You can define the addition of two specific types by specifying that the sum of ArrayOfOne and ArrayOfTwo is equal to ArrayOfThree .

That's all the information the compiler needs to make sure that the code you write is correct.

Suppose we want to create a variable of type ArrayOfThree :

 result: ArrayOfThree = append([1], [2, 3])

The compiler can determine that [1] has only one value, so you can assign the type ArrayOfOne . It can also assign ArrayOfTwo to [2, 3].

The compiler knows that the result type must be equal to the sum of the types of the first and second argument. He also knows that ArrayOfOne + ArrayOfTwo is equal to ArrayOfThree, that is, he knows that the entire expression on the right side of the identity is of type ArrayOfThree. It coincides with the expression on the left side, and the compiler is satisfied.

If we wrote the following:

 result: ArrayOfTwo = append([1], [2, 3])

then the compiler would be completely unhappy because it would know that the type is incorrect.

Dependent typing is very cool.

In this case, a huge number of bugs is simply impossible to prevent. With dependent typing, you can avoid errors by one, calls to nonexistent array indexes, null pointer exceptions, infinite loops, and invalid code.

With the help of dependent types, you can express almost anything. The factorial function will accept only natural numbers, the login function will not accept empty strings, the removeLast function will only accept non-empty arrays. Moreover, all this is checked before you run the program.

The problem with runtime checks is that they fail if the program is already running. This is normal if the program is run only by you, but not by the user. Dependent types allow you to bring such checks to the level of types, so this type of failure during the execution of the program becomes impossible.

I think dependent typing is the future of mainstream programming languages, and I can't wait to wait for it!

→ Idris

→ F *

→ Adding dependent types to javascript

Source: https://habr.com/ru/post/432416/

All Articles

Dependent Types - The Future of Programming Languages

World of types

Math in the studio!

Back to programming

Dependent typing is very cool.

More articles: