Haskell Polymorphia and Classes

Hello, dear reader!

Today we will talk about polymorphism of functions and classes in Haskell, how to use them and what they are for. We all used the method of polymorphism, for example, in the Java language. And how can this be applied to Haskell? Haskell is not an object-oriented language, but it still has classes. Although classes in Haskell have some properties of object-oriented language classes, they are more abstracted and much more powerful. Drawing a Java analogy further, classes in Haskell are nothing else as interfaces — only function declarations are written to a class, and the implementations of these functions will be made later.
I would like to express my gratitude to the user Darkus for reading and correcting all the shortcomings and inaccuracies in the article, as well as for useful advice and assistance in writing.

Polymorphism of functions

In Haskell, there are two types of polymorphism - parametric and special (in English they call the term “ad hoc”, which comes from the Latin language).
')

Parametric

If you have ever programmed in Haskell, then you will certainly meet examples of parametric polymorphism in the functions length, head, teal, curry, uncurry, map, filter, ... What unites all these functions? Let's look at the implementation of some of them:

 length :: [a] -> Int length [] = [] length (_:xs) = 1 + length xs head :: [a] -> a head (x:_) = x tail :: [a] -> [a] tail (_:xs) = xs curry :: ((a,b) -> c) -> (a -> b -> c) curry fxy = f (x,y) uncurry :: (a -> b -> c) -> ((a,b) -> c) uncurry g (x,y) = gxy

The first three functions have an input parameter of type a , and curry and uncurry have b and c . Instead of a specific data type ( Int, Bool, Char, ...), typing is used. Java example:

 public class Test<A> {...}

Here, too, A is used - an unknown data type that will be determined at compile time. Polymorphism is used when it is unknown what type of data the parameter will be, but the operations that are performed on it (the same with it) will be known.

All identifiers of type variables in Haskell must begin with a small letter (for example, a, abc, aA101 ), and all specific types (constructors) begin with a capital letter (for example, String, Int, Node ).

The a parameter can take any type, be it Int, String, or even functions (for example, length [f, g, h] , where f, g, h are functions that have the same type). It is worth noting that type b can also accept any type, including the type of parameter a .

The type of the function in the GHCi interpreter (and in the Hugs) can always be found with the help of the command :t , for example:

 Main>:t length length :: [a] -> Int

Ad hoc

A synonym for special polymorphism is the term "overloaded functions." This is a weaker type of polymorphism than parametric. Take for example the operator (function) of addition (+). Expressions of this kind
(+) 2 3 ->> 5
(+) 2.5 3.85 ->> 6.35
different from expressions
(+) True False
(+) [1,2,3] [3,2,1]
by the fact that in the first case numerical values were used, and in the second values of type Bool and [Int]. The addition operator is not defined for non-numeric types. This is because this function is not of type.
(+) :: a -> a -> a
and such
(+) :: Num a => a -> a -> a . That is, a restriction is imposed on the type of input (and output) data.

The constraint imposed on type a : Num a says that type a must be an element of class Num. Such type classes are very important in Haskell, as they add additional protection against programming errors, and can also reduce the amount of written code at times. We will see this in the following example.

The following types of binary trees are given:
A binary tree that can store any type of data

 data Tree1 a = Nil | Node1 a (Tree1 a) (Tree1 a)

A binary tree that can hold two elements, and each branch ends in a “leaf” with one element (not Nil):

 data Tree2 ab = Leaf2 b | Node2 ab (Tree2 ab) (Tree2 ab)

A binary tree that can store data of type String and also ends in a sheet:

 data Tree3 = Leaf3 String | Node3 String Tree3 Tree3

And let's say there are two more kinds of lists:
usual

 type Lst a = [a]

and type “rope”

 data Rope ab = Nil | Twisted b (Rope ba)

Having all these data structures, I would like to write the size function, which, regardless of the data structure, would output either its depth (for trees), or length (for lists), or size - in one word. A naive solution would be to write for each its own function:

Hidden text

 sizeT1 :: Tree1 a -> Int -- -  sizeT1 Nil = 0 sizeT1 (Node1 _ lr) = 1 + sizeT1 l + sizeT1 r sizeT2 :: (Tree2 ab) -> Int -- -  sizeT2 (Leaf 2 _) = 1 sizeT2 (Node2 _ _ lr) = 2 + sizeT2 l + sizeT2 r sizeT3 :: Tree3 -> Int --    String sizeT3 (Leaf3 m)= length m sizeT3 (Node mlr) = length m + sizeT3 l +sizeT3 r --   : sizeLst :: [a] -> Int sizeLst = length sizeRope :: (Rope ab) -> Int sizeRope Nil = 0 sizeRope (Twisted _ ls) = 1 + sizeRope ls

Now, if any other data structure appears, you will have to create a new function that will be suitable only for this structure. And we would like one function to get the size of any data structure. As the (+) function had a restriction with the class Num, so we need to make the restriction with the Size class, and for this we need to create it first:

 class Size a where size :: a -> Int

It remains only to make instances (instances) of this class for specific values of a (= for specific data types):

 instance Size (Tree1 a) where size Nil = 0 size (Node1 _ lr) = 1 + size l + size r instance Size (Tree2 ab) where size (Leaf2 _) = 1 size (Node2 _ _ lr) = 2 + size l + size r instance Size Tree3 where size (Leaf3 m) = length m size (Node3 mlr) = length m + size l + size r instance Size [a] where size = length instance Size (Rope ab) where size Nil = 0 size (Twisted _ ls) = 1 + size ls

Done! Now if we write in GHCi :t size , we will see size :: Size a => a -> Int . We got what we wanted:

 size Nil ->> 0 size (Node1 "foo" (Node1 "bar" Nil Nil) Nil) ->> 2 size (Leaf 2 "foo") ->> 1 size (Node3 "foo" (Node3 "bar" (Leaf3 "abc") (Leaf "cba")) (Leaf "tst")) ->> 15 --3*5 size [1..5] ->> 5 size (Rope 2 (Rope 'a' (Rope 5 Nil))) ->> 3

Each instance of the Size class implemented the size function, which is applicable only to values of a certain type (= data structures). In this case, this function, like other predefined operators (+), (*), (-), is overloaded.

But there are also disadvantages to such a decision. For example, we would like to know the number of items in the pair list, that is,

 size [(1,2), (3,4)] ->> 4 size [('a','b'), ('c','d'), ('e','f')] ->> 6

Obviously it would do the following:

 instance Size [(a,b)] where size = (*2) . length

But here we have a problem, because of that we already have an instance for a regular list ( instance Size [a] where ), we can no longer use another definition, because, as mentioned earlier, type a can be any type, including (b,c) , that is. [a] == [(b,c)] To solve this problem, you can use the Overlapping Instances (English overlapping ekzeplyary class). This solution has its drawbacks (importing one of the modules can change the value of the program, can cause confusing errors, and so on).
Now a little more about classes.

Classes

In the previous example, we passed the Size class, but the incomplete class declaration in Haskell: (tv = type variable):

 class Name tv where    tv

Incomplete, because restrictions on tv can also be entered in the class declaration (for example, tv must be an element of the Ord class). Signatures can be any number, but each of them must include (as an input and / or output parameter) the variable tv.
Typical classes are collections of types for which special functions are defined. Here are some (some of the most important) classes in Haskell:

Eq - class for the test for equality (inequality) of two data types (operations ==, / =)
Ord - class for determining the order of types, that is, which element is larger, which is smaller (operations>,> =, <, <=, min, max ...)
Enum - class for types whose values can be “counted” (for example, [1..10])
Bounded - the class is also used for class types Enum. Used to name the lower and upper bound type.
Show - class for types whose values can be converted to a string, (= can be represented as characters)
Read - class for types whose values can be converted from string

Only these classes can automatically inherit any data type (= placed in the deriving section). There is also a relationship between classes. For example, if we know how to compare two elements (class Ord), then we should be able to determine in advance whether one element is equal to another (class Eq). After all, to implement the operator ( >= ) you need to have an implemented operator (==). Therefore, it can be said that the class Ord depends (inherits) on the class Eq. Here is a more detailed class dependency diagram:

Let's look at the equivalence Eq more detail. As previously mentioned, this class should have two functions (==) and (/ =):

 class Eq a where (==), (/=) :: a -> a -> Bool --   (==)  (/=) ,     1  x /= y = not (x == y) x == y = not (x /= y)

From the code it is clear that for any instance of the class Eq, it suffices to implement one of the two functions. For example, for the Bool type (two types are equal only when they are both True or False) this is done as follows:

 instance Eq Bool where (==) True True = True (==) False False = True (==) _ _ = False

As you can see, the case of True False and False True (as well as the last line) is not necessary to write, because inequality is the inverse operation of equality. If before we had so that the type is another type of concretization (for example, [Int] is an instance of type [a]), then the type may be an instance of a class (for example, Bool is an instance of class Eq). By the same analogy, we can write the equality functions of those binary trees that we used above. For example, for Tree2:

 instance (Eq a) => Eq (Tree2 a) where --   a,      (==) (Leaf2 s) (Leaf2 t) = (s==t) (==) (Node2 s t1 t2) (Node2 t u1 u2)= (s==t) && (t1 == u1) && (t2 == u2) (==) _ _ = False

For (Tree3 ab) we should already impose a condition on both a and b , that is,
instance (Eq a, Eq b) => Eq (Tree3 ab)

All that to the left of the symbol => is called a context, all that to the right of this symbol must be either a base type (Int, Bool, ...), or type constructors (Tree a, [...], ...). The operator (==) is an overloaded function (ad-hoc polymorphism).

Equality (==) is a type-dependent property that requires implementation for each type (such as the implementation of equality of binary trees). I would like to be able to compare the two functions in this way:

 (==) fac fib ->> False -- fac -  ,  fib -   (==) (\x -> x+x) (\x -> 2*x) ->> True (==) (+2) (2+) ->> True

This would require Haskell to write such code:

 instace Eq (Int -> Int) where (==) fg = ... -- f  g  -

Is it possible to write such a function that would produce the result of the equality of any two functions (True or False)? Unfortunately, from the thesis of Church-Turing it follows that the equality of any two functions cannot be determined. There is no algorithm that, for any two given functions, always and in a finite number of steps would decide whether the two functions are equal or not. Such an algorithm cannot be programmed in any programming language.

Instead of implementing your function for equality, for example, binary trees, you can always put those classes that are inherited by this type automatically after the keyword deriving . That is, we could quite calmly write:

 data Tree a = Nil | Node a (Tree a) (Tree a) deriving Eq

This allows us not to write our own instance of the class Eq ( instance Eq a => Eq (Tree a) where ... ). But for any function, then you have to impose a condition on the variable a , that this variable is an element of the class Eq. Sometimes, you still have to write these functions yourself, since the “automatic” comparison does not always work the way we would like. For example for a binary tree

 data Tree3 ab = Leaf3 bl | Node3 ab (Tree3 ab) (Tree3 ab) deriving Eq

the following function will be implemented (automatically):

 instance (Eq a, Eq b) => Eq (Tree3 ab) where (==) (Leaf3 q) (Leaf3 s) = (q==s) (==) (Node3 _ q t1 t2) (Node3 _ s u1 u2) = (q==s)&&(t1==u1)&&(t2==u2) (==) _ _ = False

As we can see, the first argument of Node3 was omitted, which we would not want in some cases.

Conclusion

Informal difference between the two types of polymorphism:

parametric - the same code regardless of type.
ad-hoc is a different code, although the same function names.

Source: https://habr.com/ru/post/166353/

All Articles