By Mike Vanier
In the last article, I talked about the two fundamental monadic operations of the Monad type class: the binding operator (bind, >>=) and the return function. In this article I'll finish the definition of the Monad type class and talk about the monad laws.
Full Monad Type Class
Let's take a look at the whole definition of the Monad type class:
class Monad m where
  (>>=)  :: m a -> (a -> m b) -> m b
  return :: a -> m a
  (>>)   :: m a -> m b -> m b
  fail   :: String -> m a
We see the familiar >>= operator and the return function with the same types as before, but alongside them there are also the >> operator and the fail function. What do they mean?
The fail function is a very primitive way of reporting an error. It is called when the >>= operator cannot bind a value of type a to a function of type a -> m b because of a pattern-match failure. I don't want to go into the details of this mechanism right now because it's boring; see the Haskell documentation if you need it. In most cases, don't worry about fail.
The >> operator is a little more interesting. It has this type:
(>>) :: m a -> m b -> m b
This is the monadic sequencing operator. It is a variant of monadic application (>>= or "bind") that discards the unpacked value of type a before performing the "action" of type m b. It is defined as follows:
mv1 >> mv2 = mv1 >>= (\_ -> mv2)
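Although the examples below use IO, the >> operator works in any monad. As a small preview (the Maybe monad is covered in the next article), here is a sketch of how >> behaves there, assuming the standard Maybe instance:
-- Just 1  >> Just 2  ==  Just 2   -- the unpacked 1 is discarded
-- Nothing >> Just 2  ==  Nothing  -- a failed computation stays failed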
We can see here that whatever value is unpacked from the monadic value mv1 is discarded, and then the final monadic value mv2 is returned. This operator is useful when the type of the unpacked value is (), the unit type (whose single value carries no information). A good example is the putStrLn function:
putStrLn :: String -> IO ()
Imagine that you want to print two strings, one after another, with an end-of-line character after each. You can write:
putStrLn "This is string 1." >> putStrLn "This is string 2."
And why does this work? Let's look at the types:
(putStrLn "This is string 1.") :: IO ()
(putStrLn "This is string 2.") :: IO ()
That is, the >> operator combines two monadic values of type IO () into one resulting monadic value of type IO (). Let's take the >> operator and specialize it for our case:
(>>) :: m a -> m b -> m b
If m is IO, and a and b are both (), we get:
(>>) :: IO () -> IO () -> IO ()
Judging from this type, we can say that the >> operator evidently performs two "actions" in a row, each of which prints a string.
Now a more complex example. We want to read a line of text from the terminal and print it twice. We can do it like this:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= (\s -> (putStrLn s >> putStrLn s))
Thanks to operator precedence, this can be written without the parentheses:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s
So what does this mean? getLine is a monadic value (an "action") that reads a line of text from the terminal. The >>= operator "unpacks" the string from this monadic value and binds it to the name s (because \s -> putStrLn s >> putStrLn s is a monadic function, the second argument of the >>= operator). The string named s is then used by the monadic value putStrLn s >> putStrLn s, which prints it twice in a row.
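Here is a hypothetical GHCi session, assuming we type "hello" at the prompt:
-- ghci> readAndPrintLineTwice
-- hello          (the line we type)
-- hello
-- hello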
If this seems mysterious to you, it's not your fault. Something strange is happening here under the hood, and until I've talked about the state monad I can't explain it. For now, you should be able to trace what is happening, even if you don't yet fully understand how it happens.
Right now, though, I'm going to step back a little and look at the monad laws. They play a large role in defining the >>= operator and the return function for each specific monad. After that we'll turn to more practical material.
The Three Monad Laws
Many important laws come in groups of three: Newton's three laws of motion, the three laws of thermodynamics, Asimov's Three Laws of Robotics, Kepler's three laws of planetary motion, and so on and so forth. Monads are no different, except of course that the "Three Monad Laws" are more important than all the others. ;-)
For the >>= operator and the return function to be correct for a particular monad, they must have the correct types for that monad. For example, the definitions of >>= and return for the Maybe monad have the types:
(>>=)  :: Maybe a -> (a -> Maybe b) -> Maybe b
return :: a -> Maybe a
And for the IO monad they have the types:
(>>=)  :: IO a -> (a -> IO b) -> IO b
return :: a -> IO a
However, this is not enough. These functions/operators are also required to satisfy the three "monad laws". The monad laws are actually very simple; they exist to ensure that monadic composition works in a predictable way. First I'll give you a "nice" version of the monad laws, and then I'll show the (ugly) way they are usually presented. (You can thank me later: the "nice" version is much easier to understand.)
Nice version
Here is the nice statement of the three monad laws, expressed in terms of monadic composition (recall that (>=>) is the monadic function-composition operator):
1. return >=> f     ==  f
2. f >=> return     ==  f
3. (f >=> g) >=> h  ==  f >=> (g >=> h)
What do these laws tell us?
Laws 1 and 2 tell us what return should be: it is the identity (neutral element) for monadic function composition (the first law says that return is a left identity, the second that it is a right identity). In other words, composing a monadic function f with return (in either order) simply gives back the function f. The analogues are 0, the identity for integer addition, and 1, the identity for integer multiplication; in each case, combining the identity with an ordinary value using the corresponding operation just gives that value back.
Law 3 says that monadic function composition is associative: when we combine three monadic functions (f, g, h), it doesn't matter which two we combine first. This is analogous to the fact that addition and multiplication of integers are also associative.
Don't these laws look vaguely familiar? Take a look at the corresponding "laws" satisfied by ordinary function composition:
1. id . f       ==  f
2. f . id       ==  f
3. (f . g) . h  ==  f . (g . h)
where id is the identity function. See the resemblance? Composing a function with the identity on the left or on the right gives back the same function, and function composition is associative. Monadic function composition must be associative, and return must be the monadic equivalent of the identity function, so that monadic composition behaves as predictably as ordinary composition.
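For reference, here are the standard Prelude definitions of the identity function and the composition operator that these laws talk about:

id :: a -> a
id x = x

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(f . g) x = f (g x)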
What do these laws mean for the programmer? Since we want our monads to behave sensibly, our definitions of return and >>= must satisfy these laws. We'll soon see how to check that particular definitions of >>= and return are correct. [Note that the monad laws above are expressed in terms of the >=> operator, not the >>= operator, but we'll see an equivalent version using >>= shortly.]
However, there is a catch: Haskell does not check the monad laws! The only thing it checks is that the definitions of return and >>= have the correct types. Whether the laws actually hold is up to the programmer to verify.
Many people ask: "Why can't Haskell verify the monad laws?" The answer is simple: Haskell isn't powerful enough yet! To get a language powerful enough to prove the monad laws correct, you need something like a theorem prover. Theorem provers are fascinating, and they may well be the future of programming, but they are much more complicated than traditional programming languages. If you're interested, the well-regarded Coq theorem prover is available here. In Haskell, though, the programmer must make sure that the monads he writes do not violate the monad laws.
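That said, you can at least test the laws on random inputs. Here is a minimal sketch using the QuickCheck library (assuming it is installed; the property name and the helper f are made up for illustration), checking law 1 for the Maybe monad at a fixed type:

import Control.Monad ((>=>))
import Test.QuickCheck (quickCheck)

-- A made-up monadic function to test against.
f :: Int -> Maybe Int
f n = if even n then Just (n * 2) else Nothing

-- Law 1: return >=> f should behave exactly like f.
prop_leftIdentity :: Int -> Bool
prop_leftIdentity x = (return >=> f) x == f x

main :: IO ()
main = quickCheck prop_leftIdentity

Of course, passing such a test doesn't prove the law; it only gives evidence.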
Ugly version
The problem with the nice version is that the >=> operator is not defined directly in the Monad type class; instead, the >>= operator is, and >=> is derived from it, as I showed above. So if we restrict ourselves to the operators >>= and return, we need monad laws stated only in terms of return and >>=. And this is the form in which they are presented in most books and documentation on Haskell monads, even though it is less intuitive than the version in the previous section.
In terms of the >>= operator and the return function, the monad laws look like this:
1. return x >>= f    ==  f x
2. mv >>= return     ==  mv
3. (mv >>= f) >>= g  ==  mv >>= (\x -> (f x >>= g))
where the types of the various values are:
mv :: m a
f  :: a -> m b
g  :: b -> m c
for some types a, b, c and some monad m.
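As a preview of how such a check works, here is a quick verification of laws 1 and 2 for the Maybe monad, assuming its standard definitions (which we'll look at properly in the next article):

-- Standard definitions for Maybe:
--   return x       =  Just x
--   Just x  >>= f  =  f x
--   Nothing >>= f  =  Nothing
-- Law 1: return x >>= f  ==  Just x >>= f  ==  f x.
-- Law 2: Just x  >>= return  ==  return x  ==  Just x
--        Nothing >>= return  ==  Nothing
--        so mv >>= return == mv in both cases.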
Deriving the ugly version of the monad laws from the nice version
Let's have some fun and derive the ugly version of the monad laws from the nice version. In our derivation we'll need the definition of monadic composition that we analyzed above:
f >=> g = \x -> (f x >>= g)
Law 1:
return >=> f == f
\x -> (return x >>= f) == \x -> f x
return x >>= f == f x    -- QED ("which was to be proved")
Note that \x -> f x is the same as just f.
Law 2:
f >=> return == f
\x -> (f x >>= return) == \x -> f x
f x >>= return == f x
-- let mv = f x:
mv >>= return == mv    -- QED
Law 3:
(f >=> g) >=> h == f >=> (g >=> h)
\x -> ((f >=> g) x >>= h) == \x -> (f x >>= (g >=> h))
(f >=> g) x >>= h == f x >>= (g >=> h)
(\y -> (f y >>= g)) x >>= h == f x >>= (\y -> (g y >>= h))
-- Evaluating (\y -> (f y >>= g)) x gives (f x >>= g):
(f x >>= g) >>= h == f x >>= (\y -> (g y >>= h))
-- Let mv = f x; then:
(mv >>= g) >>= h == mv >>= (\y -> (g y >>= h))
-- Rename g to f and h to g:
(mv >>= f) >>= g == mv >>= (\y -> (f y >>= g))
-- Rename y to x on the right-hand side:
(mv >>= f) >>= g == mv >>= (\x -> (f x >>= g))    -- QED
In the evaluation step (\y -> (f y >>= g)) x, we simply apply the whole function (\y -> ...) to the argument x; y is replaced by x in the function body (indicated by the ellipsis (...)), and the function body is returned as the result. In functional-programming lingo, this operation is called beta reduction. {1: This comes from the branch of mathematics called the lambda calculus, which describes, among other things, beta reduction.} Beta reduction is the fundamental way functions are evaluated. The last step, where y is replaced by x, is valid because the following two functions:
\x -> f x
\y -> f y
are the same thing (the name of the formal argument doesn't matter). In functional-programming lingo, we would say that the two functions are alpha equivalent. The other steps you should be able to follow yourself.
What's the point?
The monad laws can sometimes be used in code to replace a longer expression with a shorter one (for example, instead of return x >>= f you can simply write f x). However, as we'll see in the following articles, the main use of the monad laws is that they allow us to derive the definitions of return and >>= for specific monads.
To finish up this article, I want to show you a neat syntactic form that makes monadic code look much nicer.
Do notation
Recall the readAndPrintLineTwice function defined above:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s
It has one virtue: it's written in one line. The disadvantage: it's not the most readable line in the world. The designers of Haskell noticed that monadic definitions are often hard to read, and came up with some really nice syntactic sugar that makes such definitions more readable.
This syntactic sugar is based on the observation that a huge number of operations in monadic code take one of two forms:
-- Form 1.
-- mv :: m a
-- f  :: a -> m b
mv >>= \x -> f x

-- Form 2.
-- mv  :: m a
-- mv2 :: m b
mv >> mv2
The notation was designed to make these two forms easy to read. It starts with the do keyword, followed by a series of monadic operations. Here is how our two forms are written in do notation:
-- Form 1, do notation.
do v <- mv
   f v

-- Form 2, do notation.
do mv
   mv2
In Form 1, the first line means that we take the monadic value mv and "unpack" it into an ordinary value named v. The second line simply computes f applied to v. The result of the line f v is the result of the whole expression.
In Form 2, the first line "executes" the monadic value ("action") mv. The second line "executes" another monadic value ("action"), mv2. So this is just a notation that sequences mv and mv2, exactly as the >> operator does.
The Haskell compiler converts the convenient do notation into the do-less forms shown above for Form 1 and Form 2. This is purely a syntactic transformation; the meaning of the two notations is identical. Furthermore, both forms can be mixed freely in a single do expression. Example:
-- mv  :: m a
-- v1  :: a
-- f   :: a -> m b
-- v2  :: b
-- mv3 :: m c
do v1 <- mv
   v2 <- f v1
   mv3
   return v2
This is exactly the same as:
mv >>= (\v1 ->
  (f v1 >>= (\v2 ->
    (mv3 >>
      (return v2)))))
Or, without the parentheses:
mv >>= \v1 ->
f v1 >>= \v2 ->
mv3 >> return v2
You can imagine that as monadic expressions grow, the do form stays just as easy to read, while the form without do (also called the "desugared" form) often becomes completely unreadable. That's why do notation exists. Another great thing is that do notation works for all monads, not just any one of them.
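For instance, here is a sketch of do notation in the Maybe monad (safeDivide is a made-up helper; Maybe is covered properly in the next article):

safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide x y = Just (x `div` y)

example :: Maybe Int
example =
  do a <- safeDivide 10 2   -- a is 5
     b <- safeDivide a 0    -- Nothing, so the whole expression is Nothing
     return (a + b)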
In addition, you can mix do notation and desugared notation in a single expression, like this:
do v1 <- mv
   v2 <- f v1
   mv3 >> return v2
Sometimes this is useful, but it can often hurt the readability of the code.
Let's see what our earlier examples look like in do notation:

-- Read a line, then print it.
readAndPrintLine :: IO ()
readAndPrintLine =
  do line <- getLine
     putStrLn line

-- Print two strings, one after the other.
-- Not a function.
do putStrLn "This is string 1."
   putStrLn "This is string 2."

-- Read a line and print it twice.
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice =
  do line <- getLine
     putStrLn line
     putStrLn line
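To make these fragments runnable as a complete program, you could wrap them in a main action, something like this sketch (the name main is required by Haskell; the body just reuses the definitions above):

main :: IO ()
main =
  do putStrLn "This is string 1."
     putStrLn "This is string 2."
     readAndPrintLineTwice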
Here the code is much easier to read, thanks to do notation. Interestingly, it has an added advantage (or disadvantage, depending on your point of view): Haskell code written this way looks imperative! Read from top to bottom, it looks like code in an imperative language in which assignment is written with the arrow <-. For instance, readAndPrintLine can be described as: "call getLine to read a line, which we put into the variable line; then call putStrLn to print that variable." That's definitely not what actually happens (for one thing, line is not a variable), but you can read it that way. For large amounts of code performing many input and output actions, do notation is a very convenient way of writing.
Do notation has other useful features. For example, you can embed let expressions and case expressions in the body of a do expression, which is often convenient. I won't go into the details, because this is routine; any other Haskell tutorial will cover the point.
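For a taste, here is a minimal sketch of a let binding inside do notation (the greet function is made up for illustration; note that no in keyword is needed here):

greet :: IO ()
greet =
  do name <- getLine
     let message = "Hello, " ++ name ++ "!"
     putStrLn message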
Next
In the next article, I'll start talking about specific monads, beginning with Maybe (the monad for computations where an error can occur) and then moving on to the list monad (for computations with multiple results).
Contents
Part 1: The Basics
Part 2: Functions >>= and return
Part 3: Monad Laws
Part 4: Maybe Monad and List Monad