Zoo of Algebraic Data Types

In this article we will try to consider the whole variety of Algebraic Data Types.

It must be said, the task is rather unaffordable, and it is not very easy to understand a person if he has not dealt with Algebraic Types before -
ATD was first used in the Hope language, but they gained their main popularity thanks to ML languages, such as Standart ML, OCaml, F #, and Haskell.
Nowadays, ATD are to some extent supported in much more languages: Scala, Rust, Nemerle, Racket, ...
ADT is a universal data type. With it, you can present most of the data types.
ATD are called algebraic, because they can be represented as a certain algebraic composition of the types of its components. This knowledge gives its advantage: understanding the properties of an algebraic composition, it is possible to calculate what type is necessary for the construction of certain data.
We will consider types based on the Haskell language, but this with light modifications can be achieved in other languages with ADT support.

It should be noted that the theory of types developed in the 20th century by Russell, Church, Martin-Löf became so popular that now (already in the 21st century) a theory of the Homotopic Theory of Types has appeared that has set about to explain the foundation of mathematics.

Hott

If anyone is interested in the theory, you can read free fresh work in English. Homotopy Type Theory: Univalent Foundations of Mathematics

Our task is to look at Algebraic Data Types. Often these ATD is called boxed or wrapped. They are so named because they look like data packed in a box. The most interesting thing is that even an implementation means not storing data, but pointers. In addition to the ease of implementation, it gives effective mechanisms of laziness, which Haskel has.
In fairness, I note that Haskell allows you to create non-lazy, including unwrapped data, which helps in processing high-loaded data volumes.
In Haskell, there are 2 ways to create an ADT using the data or newtype declaration. The difference between them is that the newtype de facto not wrapped, which means cheaper (in terms of machine resources), and therefore more profitable, but with the help of it you can write only a narrow circle of ADT, so it is not suitable in general.

Types of data implemented through primitives

In Haskell there are not many of them, first of all these are data types such as
Int , Integer, Float , Double , Char , ...
The point is not that it is impossible to make these types truly (through ADT), just the computer already knows how to work with numbers, and it's a sin not to use it.
I want to note that all types are written with a capital letter, this is due to the fact that all type and data constructors in Haskell are written exclusively with a capital letter.

Zero data types

Most imperative languages are built in such a way that it is convenient to create programs from the bottom up through hierarchical complexity.
In Haskell, it's very easy to write code from top to bottom. In this connection, it became necessary to have empty data types (without implementations), since it is known that this data type will do, but it has not yet been decided how to fill it. There is an extension that allows you to do this.
')
These are not real, pseudo-zero, empty data types.

 data S data T a data K ab

The syntax, as we see it is simple to ugliness.
Here we created 3 types of data, the second of which is parametric. We see that a is written with a lowercase letter, which means it cannot be a constructor. The output is a parameter in which you can substitute any type, that is, you can have the following types: T Int, T Char , TS , T (T Double) and even T (T ( T String)) .
We also see that the constructor stands at the beginning of the type, and then the data is packed in a box.
You ask why they are needed?
For example, we can write the following valid function code:

 doubleT :: T a -> (T a, T a) doubleT t = (t, t)

Of course, in most cases, you cannot write all the functions without internal knowledge about the data, but you can leave some of the functions for later, doing other tasks, such as this:

 foo :: S -> Int -> T a foo = undefined nextT :: S -> Int -> (Int, T a) nextT si = (i+1, foo s (i+1))

This program will be compiled and checked for errors, despite the fact that neither the data nor the part of the functions are defined.

This zero data type looks unusual enough for an untrained eye, but we can understand it.
It was found relatively recently, is still not in the Platform (so that it can be used out of the box), and still little used.

Void

 newtype Void = Void Void

Here we see the recursive type (we'll talk about this later), first comes the type name - Void , then = , then the constructor of this type is Void and the parameter, which is the Void type.
If we rename, it will not change anything except our understanding:

 newtype VoidData = Void VoidData

This data type is infinitely self-centered.

Single data types

Single data types provide nothing but the knowledge that it is.
Of the common data types used

 data () = () --X

hereafter comment

  -- X

data will be shown so defined, but they can not be determined by the very fact that invalid characters are used.
In almost all programs, a signature is used:

 main :: IO ()

because the program must return something. Therefore () is the most popular type when you need to return anything.
Or, here's another well-known single type:

 data Unit = Unit

the same, only ordinary constructors are used.
I must say that in these types there is no recursion, that is, we can write:

 data UnitData = Unit

The fact is that the type namespace does not overlap with the type constructor space, so the compiler perfectly understands where recursion is and where it is not.

Types of works

Types of works are named so that they can be represented as a product of several types that make it up.
Very popular data types are used everywhere.
The most famous ones are tuples or tuples. Haskell uses special syntactic sugar for this:

 data (,) ab = (,) ab --X (a, b) data (,,) abc = (,,) abc --X (a, b, c) data (,,,,) abcd = (,,,,) abcd --X (a, b, c, d)

We can write the same thing with ordinary constructors:

 data Pair ab = Pair ab data Triple abc = Triple abc

Everywhere where you need to store more than one given in the data - this is the type of works.

Summary Data Types

Summary data types are named because they summarize the data types that compose them.
For example, if you add single data types, you get an enumeration.
I present one of the most famous listings:

 data Bool = False | True

Also known is the Ordering type, which is used in comparisons.

 data Ordering = LT | EQ | GT

If we move on to more complex ADDs, we need to recall the null-abel type:

 data Maybe a = Nothing | Just a

Data that has a field for the result when it is not there, so to say null from many programming languages.

 (</>) :: Int -> Int -> Maybe Int _ </> 0 = Nothing a </> b = Just (a `div` b)

If you need to learn more about the error, use an even more complex data type:

 data Either ab = Left a | Right b

data type, where there is either a “left” result (description of the error or the error itself) or “correct” result.
This type of data is also used for binary selection.

 type IdPerson = Int type ErrorMsg = String personChecker :: Person -> Either ErrorMsg IdPerson personChecker p = if age p < 0 then Left "this guy is not born yet!" else if null $ first_name p then Left "this guy is unnamed!" else Right $ registeredId p

You can make more exotic data types, for example:

 data Both ab = None | First a | Second b | Both ab

here we either have no result, or 1 result, or 2nd, or 1st and 2nd.

Recursive data types

In Haskell, you can easily create recursive data types.
The most well-known, using the syntactic sugar type are lists:

 data [] a = [] | a : [a] --X

However, it can be easily rewritten with the words:

 data List a = Nil | Cons a (List a)

I want to note that special characters can be used as constructors, where the first character is necessarily a colon (:) , then they use infix notation.
Want to build a tree? No problem.
For example, a simple binary tree:

 data Tree a = Leaf a | Branch (Tree a) (Tree a) data Tree2 a = Empty | Branch2 (Tree2 a) a (Tree2 a)

Or here's another, natural numbers:

 data Nat = Zero | Succ Nat

If you have not yet opened a spoiler of the zero data type (or opened, but did not understand) - it's time to open and read and understand that the Void elementary infinitely recursive data type.

Power data types

Do not think that only data can be recorded in ADT. Functions can also be easily written to ATD: power types (the general type has the power of raising one type of data to the power of another):

 data Fun ab = Fun (a -> b) newtype ContT rma = Writer ((a -> mr) -> mr)

If we use ADT works, where the data will be interspersed with functions - this is nothing but records, and very similar structures to objects.

Phantom data types

Sometimes you have to create a rather unusual behavior, and you have to invent types that do nothing but pretend they do. Phantom types are used for this in particular. One of the most famous types is:

 data Proxy a = Proxy

As you can see, the type header says that the type is parametric, but in reality it is a dummy.
These data types are often used as universal plugs, since they are easily converted from one data type to another.

 proxyonverter :: Proxy a -> Proxy b proxyonverter Proxy = Proxy

Records

The type of works is so often used that there is a special syntactic sugar, which makes it easier and more convenient to work with it (with built-in getters and setters). Such types are called records.
Let we have a type

 data Sex = Female | Male data Person = Person String String Int Int Sex (String -> String)

much more convenient to rewrite to:

 data Sex = Female | Male data Person = Person { lastName :: String , firstName :: String , age :: Int , socialId :: Int , sex :: Sex , greeting :: String -> String }

Inserts or Wrappers

Inserts are used where it is necessary either to hide the implementation of the type, to protect from outside interference, or to impart other properties than the maternal type.
It is written in a similar way (with the syntax of records):

 newtype Wrapper a = Wrapper { unwrapper :: a } newtype Dollar = Dollar Int

Existential Data Types

Existential data types are named like this because of the existence quantifier ∃.
The paradox is that there is no such quantifier in Haskele.
But there is a quantifier of universality ∀. But these quantifiers can easily be transformed into each other.
If to be completely accurate, then each existentially-quantified type of rank n+1 can be transformed into a universally-quantified type of rank n .

 data HeteroData = forall a. HeteroData a heteroList :: [HeteroData] heteroList = [HeteroData 3.7, HeteroData "message", HeteroData True]

As you can see, they were able to create a heterogeneous list, despite the fact that it is homogeneous.
But something looks like an object:

 data Counter a = forall self. NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a }

We cannot work directly with functions written with an underscore.
The truth is to work with similar objects does not seem to work with the PLO:

Record with existential variable

 inc :: Counter a -> Counter a inc (NewCounter xidt) = NewCounter { _this = ix, _inc = i, _display = d, tag = t } display :: Counter a -> IO () display NewCounter{ _this = x, _display = d } = dx counterA :: Counter String counterA = NewCounter { _this = 0, _inc = (1+), _display = print, tag = "A" } counterB :: Counter String counterB = NewCounter { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" } main = do display (inc counterA) -- prints "1" display (inc (inc counterB)) -- prints "##"

Generalized Algebraic Data Types (GADTs)

Generalized ADT differ from the usual ones in that they cut and specialize the resulting type.
Haskell uses a “functional” record of this data. Let's rewrite the simple data type in a new syntax:

 data Maybe a = Nothing | Just a data Maybe a where Nothing :: Maybe a Just :: a -> Maybe a -- -------------------- data List a = Nil | Cons a (List a) data List a where Nil :: List a Cons :: a -> List a -> List a

We see that the final data type for all is the same: Maybe a (or List a ). OATD allow you to have different total data types.
If we have an abstract tree:

  data Term a where Lit :: Int -> Term Int Succ :: Term Int -> Term Int IsZero :: Term Int -> Term Bool If :: Term Bool -> Term a -> Term a -> Term a Pair :: Term a -> Term b -> Term (a,b)

it is easy to build a function for computing a similar tree:

  eval :: Term a -> a eval (Lit i) = i eval (Succ t) = 1 + eval t eval (IsZero t) = eval t == 0 eval (If b e1 e2) = if eval b then eval e1 else eval e2 eval (Pair e1 e2) = (eval e1, eval e2)

Parametric data types with limited parameters

Sometimes it really hurts to restrict parametric data types. They are cut off by a restriction of the (kind) type.
The view is essentially a type type, for example

 Just 3 :: (Maybe Int :: *)

Here is the data with a parameter limit:

 data F (a :: * -> *) where ...

Basically, these types are required for very high levels of abstraction.

Vector type

Vector with natural numbers as length

 data Ze data Su n data Vec :: * -> * -> * where Nil :: Vec a Ze Cons :: a -> Vec an -> Vec a (Su n)

Conclusion

Algebraic Data Types is a very simple, versatile, elegant, easily extensible and powerful tool for creating custom data types to meet the many needs of programmers.

Source: https://habr.com/ru/post/207126/

All Articles