📜 ⬆️ ⬇️

View design pattern in languages ​​with dependent types



Design patterns! I first learned about them in the Software Design course when I was studying in the magistracy of the Academic University. We wrote various Java programs using templates. Since then, I have associated this phrase with something such OOPshny. However, while figuring out the Agda language, I came across the article The Power Of Pi, which tells about design patterns in languages ​​with dependent types!

In this post I want to talk about one of these templates, which is called View. With it, you can implement custom pattern matching'a rules. If you are wondering what this pattern is, what is custom pattern matching, what are the features of pattern matching in languages ​​with dependent types, and you are familiar with some functional programming language with static typing (Haskell, Scala, Ocaml, F #) - welcome under the cat!
')

Views and custom pattern matching


To begin with, let's see what custom pattern matching is. Suppose we work with some xs list and perform pattern matching on it:
match xs with [] -> ... (y : ys) -> ... 

Here we look at xs either as an empty list or as a list consisting of a head element and a list assigned to it. But what if we want to look at xs , say, as a list consisting of the concatenation of some two lists ys and zs ? Sort of
 match xs with [] -> ... (ys ++ zs) -> ... 

In other words, we want to influence how the data structure will be represented as a result of pattern matching'a. This is what we will call custom pattern matching.

In the course of the article, our goal will be to implement custom pattern matching for the following situation. Suppose we work with bit vectors of fixed length. At the same time, we have the ability to specify the length of the vector directly in the type. For example, writing bits: [32] means that bits is a 32-bit vector. We, in the process of pattern matching, want to be able to split the vector into equal parts in various ways, for example, like this:
 swapAB : [32] -> [32] swapAB [abcd] = [bacd] 

This function accepts a 32-bit vector as input, breaks it into 4 parts, each of which is an 8-bit vector, and swaps the first two 8-bit words. She, with the same success, could break it into 2 parts, each 16-bit. Such functionality can be useful when writing cryptographic software. For example, something like this is the basis of the Cryptol language, which is designed to write cryptographic algorithms. Let's see how to implement this on Agda.

The general scheme for implementing custom pattern matching will be as follows:
  1. Implement the View template for the original data type
    1. Determine the type of data View, constructors of which represent the values ​​of the source type in the desired form
    2. Write a function view, which by value of the original data type builds a value of type View
  2. Use pattern matching on the results of calling the view function.

Before dealing with these items, we define some basic data types.

Natural numbers and vector in Agda


Natural numbers are defined recursively:
 data Nat : Set where Zero : Nat Suc : Nat -> Nat 

When determining the type of Nat , we indicate its type! Just as the number 3 is of type Nat , in itself type Nat is of type Set . Set is the built-in type of Agda. However, understanding why this is necessary is not useful to us, and we can simply assume that all types are of type Set , although this is not true.
The Nat type has two constructors: Zero , which has no arguments, and Suc is a constructor with one argument. The number 3 thus looks like (Suc (Suc (Suc Zero))) , that is, something like (1 + (1 + (1 + 0))). However, Agda has built-in literals, so instead of a long Suc string, you can simply write 3.

A vector is a list of fixed length, defined as:
 data Vec (A : Set) : Nat -> Set where [] : Vec A Zero _::_ : {n : Nat} -> A -> Vec A n -> Vec A (Suc n) 

The type Vec , in contrast to Nat , has 2 parameters: type A and a value of type Nat . Due to the fact that the second parameter Vec is a value, not a type, Vec is a dependent type. The fact that the parameters are separated from each other by a colon, and not, say, the symbol -> is not a coincidence. In fact, what is written after the colon is the so-called indices, not parameters. However, for us this difference will not play a special meaning and we will call everything as parameters.
The first constructor builds an empty vector. In Agda, different sequences of characters can be used as names, so we can call the constructor [] .
The second constructor takes 3 arguments — the size of vector n , an element of type A , vector of size n , and returns a size vector (n + 1) . Note that for the first argument, as well as for the first parameter of the Vec type itself, we specified not just the type, but also the name - n: Nat . This is necessary because we refer to this value when declaring the third argument and the return value. In addition, the first argument is taken in braces. So in Agda are implicit arguments. The fact is that the value of n can be uniquely restored from the third argument of the constructor. Indeed, when we pass a vector, we know its size, it is indicated in the type. Therefore, in fact, the first argument we can not pass - Agda itself would guess what it should be. Thus, the constructor call _ :: _ looks either as _ :: _ {n} a someVec , or as _ :: _ a someVec . But that's not all! The underscore characters in the name of the constructor actually say that it is infix. Two explicit arguments can be written not after the name of the constructor, but instead of underscores, separated by a space. In summary, here is an example of creating a vector:
 v1 : Vec Nat 2 v1 = 1 :: (2 :: []) 

Note that we cannot write like this:
 v1 : Vec Nat 3 v1 = 1 :: (2 :: []) 

The constructor [] creates a vector of length 0, the first constructor :: is a vector of length 1, the second :: is a vector of length 2. We get Vec Nat 2 , but we need 3. The type inference system will not allow us to compile such a code.

Among other things, we need some operations on vectors:
 _++_ : forall {A mn} -> Vec A m -> Vec A n -> Vec A (m + n) take : forall {A m} -> (n : Nat) -> Vec A (n + m) -> Vec A n drop : forall {A m} -> (n : Nat) -> Vec A (n + m) -> Vec A m 

This is a concatenation of vectors, the operation “take the first n elements” and “throw away the first n elements”, similar to those in the Haskell list. A record like forall {A mn} means that, besides the fact that the arguments A , m and n are implicit, we still do not want to specify their type and ask Agda to define it on the basis of explicit arguments, which happens in an obvious way.

Bit Vector Type


The type of bit vector for which we will implement pattern matching will look pretty simple:
 data Bit : Set where O : Bit I : Bit Word : Nat -> Set Word n = Vec Bit n 

Word is a synonym for Vec Bit , using it we can declare a 32-bit vector as bits: Word 32 .
Before moving on to pattern matching, let's take a look at its features in the presence of dependent types.

Features of pattern matching in the presence of dependent types


Consider the following data type:
 data _==_ {A : Set} : A -> A -> Set where Refl : {x : A} -> x == x 

The type (x == y) is parameterized by two concrete values ​​of type A and, as it is easy to guess, represents the statement that the values ​​of x and y are equal. Having a single Refl constructor ensures that we cannot create values ​​of type (x == y) for different values ​​of x and y . For example:
 eq : (3 == 3) eq = Refl notEq : (3 == 4) notEq = ? 

In the first case, we need to construct a value of type (3 == 3) and we have a constructor that allows this to be done - Refl . We cannot construct a value of the type (3 == 4) - Refl does not fit the type, and there is no other constructors for the type == .
Now consider this function:
 f : (x : Nat) -> (y : Nat) -> (x == y) -> Nat 

What will pattern matching look like for the arguments of this function? Obviously, for the third argument, we will write Refl , since the type == has no other constructors. At the same time, we automatically find out that the first and second arguments are the same number! In such cases, in Agda it is necessary to clearly note this fact using the so-called dot pattern:
 fx .x Refl = x 

Such a record means that the second argument (which has the same name as the first with the assigned dot) is the same as the first argument. We cannot write, say, y instead of .x , because we don’t want to lose the equality information that we received after the comparison with Refl .

The conclusion is as follows: in languages ​​with dependent types, after pattern matching on any argument, we can get additional information about other arguments.

Views


Before building a View for our Word type, consider a simpler example to understand what we want to achieve.
Consider the type of list:
 data List (A : Set) : Set where [] : List A _::_ : A -> List A -> List A 

This is a regular list, the same as in Haskell, it is presented recursively, as “the first element and list”. We want to be able to present the same list as the “list and last item”. Let's create for this data type SnocView :
 data SnocView {A : Set} : List A -> Set where [] : SnocView [] _:::_ : (xs : List A) -> (x : A) -> SnocView (xs ++ (x :: [])) 

Here the operator ++ is used - it is concatenation of lists, as in Haskell. An example of a value of type SnocView :
 sx : SnocView (1 :: 2 :: 3 :: []) sx = (1 :: 2 :: []) ::: 3 

Note that the type SnocView is parameterized by the list for which the list and last element view is constructed.
Now let's write the snocView function, which for the transferred list builds a value of the SnocView type:
 snocView : {A : Set} -> (xs : List A) -> SnocView xs snocView [] = [] snocView (x :: xs) with snocView xs snocView (x :: .[]) | [] = [] ::: x snocView (x :: .(ys ++ (y :: []))) | ys ::: y = (x :: ys) ::: y 

There are a lot of things, let's start in order. The case with an empty list is obvious: an empty list gives a trivial SnocView . Further, if the list is not empty, we use the with construction (long indents here for readability only). This construct performs approximately the same tasks as the Haskell case of construct, that is, it is used to perform pattern matching in the function body. The patterns with which we associate the recursive call to snocView xs are indicated to the right of the vertical dashes. So, if the recursive call gave us the trivial SnocView , then the xs list was empty, so the SnocView for x :: [] is [] ::: x . In the second case, we get that SnocView for xs looks like ys ::: y , so in order to get SnocView for x :: xs you need to add x to the head ys .
It remains unclear only what is indicated to the left of the vertical dashes. As we discussed earlier, pattern matching in the presence of dependent types can give us additional information about other function arguments. In our example, pattern matching on a recursive call to snocView xs gives us information about the value of the xs list. It is this information that is indicated by the dot patterns to the left of the vertical bar. For example, in the second case, having learned that snocView xs gives us ys ::: y , we understand that the list xs is constructed as a concatenation of ys and a list from one element y :: [] .

Here is an example of using snocView to write a function that cyclically shifts the transferred list to the right:
 rotateRight : {A : Set} -> List A -> List A rotateRight xs with snocView xs rotateRight ._ | [] = [] rotateRight ._ | ys ::: y = y :: ys 

Here we used underscores. This can be done when we are not interested in the meaning that should be in place of underscores. Looks pretty brief and beautiful! This example essentially shows how to implement custom pattern matching in languages ​​with dependent types. Now let's see how to do the same for the Word type!

SplitView for Word


As in the previous example, we need 2 things: a special SplitView data type and the splitView function, which will build the SplitView value for a given Word value.
To begin with, we will need 2 additional functions to work with the type Vec :
 split : forall {A} -> (n : Nat) -> (m : Nat) -> Vec A (m * n) -> Vec (Vec A n) m split n Zero [] = [] split n (Suc k) xs = (take n xs) :: (split nk (drop n xs)) concat : forall {A nm } -> Vec (Vec A n) m -> Vec A (m * n) concat [] = [] concat (xs :: xss) = xs ++ concat xss 

SplitView type will look like this:
 data SplitView {A : Set} : {n : Nat} -> (m : Nat) -> Vec A (m * n) -> Set where [_] : forall {mn} -> (xss : Vec (Vec A n) m) -> SplitView m (concat xss) 

This type is parametrized by the number m and the vector. m indicates how many parts we want to break the vector. That is why the vector length is a multiple of m . A single constructor will allow us to write values ​​of this type like this:
 [ a :: b :: c :: d :: [] ] 

where a, b, c and d are vectors of length n.
The next step is the splitView function:
 splitView : {A : Set} -> (n : Nat) -> (m : Nat) -> (xs : Vec A (m * n)) -> SplitView m xs 

To implement it, you need to split the transmitted vector into m parts, each of length n , and transfer it to the constructor [_] :
 splitView nm xs = [ split nm xs ] 

However, such a definition is not accepted by the compiler. The point is that the constructor [_] returns a value of type SplitView m (concat xss) . In our case, xss is split nm xs , so the expression [split nm xs] is of type SplitView m (concat (split nm xs)) . In this case, the function signature indicates that it should return the type SplitView m xs , where xs is the third argument of the function. We all understand that the expressions xs and concat (split nm xs) are one and the same, because we know how the functions concat and split are arranged. Let's convince the compiler.
To begin with, let's rewrite the splitView body above in the following form:
 splitView nm xs with [ split nm xs ] splitView nm xs | v = v 

This definition is not accepted by the compiler for the same reasons as the previous one: v has the type SplitView m (concat (split nm xs)) , but must be SplitView m xs . Continue to change the definition:
 splitView nm xs with concat (split nm xs) | [ split nm xs ] splitView nm xs | ys | v = v 

The with construction can be used for pattern matching by several expressions at once, which are separated from each other by vertical dashes. In addition, using with produces an effect called generalisation. It manifests itself in the fact that the types of function arguments and the type of expression that we need to construct can vary depending on the expressions we use for pattern matching with with. For example, if we have a function
 fxy with e fxy | p = ... 

then, as a result of generalization, all occurrences of the expression e in the types of arguments x and y , as well as in the type of expression that we need to construct in the body, will be replaced by p . And if we have several expressions in with
 fxy with e1 | e2 | e3 fxy | p1 | p2 | p3 = ... 

then, among other things, all occurrences of e1 in p2 and p3 are replaced by p1 , occurrences of e2 in p3 are replaced by p2 .
In our definition of splitView , by adding concat (split nm xs) to with, type v will change from SplitView m (concat (split nm xs)) to SplitView m ys . Now we need to prove that xs and ys are the same. To do this, we need the following function:
 splitConcatLemma : forall {A} -> (n : Nat) -> (m : Nat) -> (xs : Vec A (m * n)) -> concat (split nm xs) == xs 

According to the transmitted vector, it constructs the value of the familiar type == , which indicates the equality of the necessary expressions concat (split nm xs) and xs . Using this function, we get the following version of splitView :
 splitView nm xs with concat (split nm xs) | [ split nm xs ] | splitConcatLemma nm xs splitView nm xs | ys | v | eq = v 

Here, no types have changed, but eq has the type ys == xs due to the same generalization effect on the first expression in with. Now we have everything to make the function work. Perform pattern matching on the value of eq :
 splitView nm xs with concat (split nm xs) | [ split nm xs ] | splitConcatLemma nm xs splitView nm xs | .xs | v | Refl = v 

Hooray! Now everything works!
I did not give the definition of the splitConcatLemma function here . This is not critical for our conversation on the View template, besides, I am sure that I already loaded you! But brave warriors who have read this far, the end is on the horizon.
Finally, the implementation of the swapAB function through custom pattern matching using splitView :
 swapAB : Word 32 -> Word 32 swapAB xs with splitView 8 4 xs swapAB ._ | [ a :: b :: c :: d :: [] ] = concat (b :: a :: c :: d :: []) 


Conclusion


After reading the article, you might have a question: is it possible to use the same approach in languages ​​without dependent types? And indeed, all we did was set the data type, which allowed us to represent the source type in the form we needed, and wrote a function that constructs the value of the target based on the input value of the source type.
The difference in the use of this approach in languages ​​without dependent types is that we lose touch with the original value. For example, the type of SnocView depends on the value of the list for which we build it, that is, the relationship between the original value and the View is maintained at the type level. Due to this, when executing pattern matching on SnocView xs values, we learn information about how the original xs list looks. Without dependent types to get this information will not work.
The second difference that follows from the first one is that without dependent types, the function that builds the View will be too general. For example, the snocView function has the type:
 snocView : {A : Set} -> (xs : List A) -> SnocView xs 

It is clear from it that the function body cannot be completely arbitrary - it is limited by the fact that SnocView depends on xs . If we did not have dependent types, then this function could be of type
 snocView : List A -> SnocView 

and, let's say, just returning an empty list at any input, which makes it rather useless.

Therefore, I will finish my post with the words with which the article begins The Power Of Pi: dependent types!

References:
Nicolas Oury, Wouter Swierstra "The Power Of Pi"
With documentation
Sources

Source: https://habr.com/ru/post/277989/


All Articles