📜 ⬆️ ⬇️

Monads in 15 minutes

Introduction


At the conference YOW! 2013 is one of the developers of the Haskell language, prof. Philip Wadler showed how monads allow pure functional languages ​​to perform essentially imperative operations, such as I / O and exception handling. It is not surprising that the interest of the audience in this topic was generated by the explosive growth of publications about monads on the Internet. Unfortunately, most of these publications use examples written in functional languages, implying that newcomers to functional programming want to learn about monads. But monads are not specific to Haskell or functional languages, and may well be illustrated with examples in imperative programming languages. This is the purpose of this manual.

How does this guide differ from the rest? We will try to “open” the monads in no more than 15 minutes using only intuition and a few basic examples of Python code. We therefore will not begin to theorize and delve into philosophy, talking about burritos , space suits , desks, and endofunctors.

Motivational examples


We consider three problems related to the composition of functions. We will solve them in two ways: the usual imperative and with the help of monads. Then we compare the different approaches.

1. Logging


Suppose we have three unary functions: f1 , f2 and f3 , which accept a number and return it increased by 1, 2, and 3, respectively. Also, each function generates a message that represents a report on the operation performed.
 def f1(x): return (x + 1, str(x) + "+1") def f2(x): return (x + 2, str(x) + "+2") def f3(x): return (x + 3, str(x) + "+3") 

We would like to chain them to handle the parameter x , in other words, we would like to calculate x+1+2+3 . In addition, we need to get a human-readable explanation of what each function has done.
')
You can achieve the desired result in the following way:
 log = "Ops:" res, log1 = f1(x) log += log1 + ";" res, log2 = f2(res) log += log2 + ";" res, log3 = f3(res) log += log3 + ";" print(res, log) 

This solution is not ideal, since it consists of a large number of uniform glue code. If we want to add a new function to our chain, we will have to repeat this binding code. In addition, manipulations with the variables res and log affect the readability of the code, making it difficult to follow the main logic of the program.

Ideally, the program should look like a simple chain of functions, like f3(f2(f1(x))) . Unfortunately, the data types returned by f1 and f2 do not correspond to the types of parameters f2 and f3 . But we can add new functions to the chain:
 def unit(x): return (x, "Ops:") def bind(t, f): res = f(t[0]) return (res[0], t[1] + res[1] + ";") 

Now we can solve the problem as follows:
 print(bind(bind(bind(unit(x), f1), f2), f3)) 

The following diagram shows the computational process that occurs when x=0 . Here, v1 , v2 and v3 are the values ​​derived from the calls unit and bind .



The unit function converts the input parameter x to a tuple of number and string. The bind function calls the function passed to it as a parameter and accumulates the result in the intermediate variable t .

We were able to avoid repetition of the linking code by placing it in the bind function. Now, if we have a function f4 , we simply include it in the chain:
 bind(f4, bind(f3, ... )) 

And we will not need to make any other changes.

2. List of intermediate values


We will also begin this example with simple unary functions.
 def f1(x): return x + 1 def f2(x): return x + 2 def f3(x): return x + 3 

As in the previous example, we need to assemble these functions to calculate x+1+2+3 . We also need to obtain a list of all values ​​obtained as a result of the operation of our functions, that is, x , x+1 , x+1+2 and x+1+2+3 .

Unlike the previous example, our functions are composable, that is, the types of their input parameters coincide with the type of result. Therefore, a simple chain f3(f2(f1(x))) returns the final result. But in this case, we will lose intermediate values.

Solve the problem "in the forehead":
 lst = [x] res = f1(x) lst.append(res) res = f2(res) lst.append(res) res = f3(res) lst.append(res) print(res, lst) 

Unfortunately, this solution also contains a lot of glue code. And if we decide to add f4 , we again have to repeat this code to get the correct list of intermediate values.

Therefore, we add two additional functions, as in the previous example:
 def unit(x): return (x, [x]) def bind(t, f): res = f(t[0]) return (res, t[1] + [res]) 

Now we will rewrite the program as a chain of calls:
 print(bind(bind(bind(unit(x), f1), f2), f3)) 

The following diagram shows the computational process that occurs when x=0 . Again, v1 , v2 and v3 denote the values ​​resulting from the calls unit and bind .



3. Empty values


Let's try to give a more interesting example, this time with classes and objects. Suppose we have an Employee class with two methods:
 class Employee: def get_boss(self): # Return the employee's boss def get_wage(self): # Compute the wage 

Each object of class Employee has a manager (another object of class Employee ) and a salary, which can be accessed through appropriate methods. Both methods can also return None (the employee has no supervisor, salary is unknown).

In this example, we will create a program that shows the salary of the head of this employee. If the manager is not found, or his salary cannot be determined, the program will return None .

Ideally, we need to write something like
 print(john.get_boss().get_wage()) 

But in that case, if any of the methods returns None , our program will end with an error.

An obvious way to handle this situation might look like this:
 result = None if john is not None and john.get_boss() is not None and john.get_boss().get_wage() is not None: result = john.get_boss().get_wage() print(result) 

In this case, we allow unnecessary calls to get_boss and get_wage . If these methods are hard enough (for example, accessing a database), our solution is no good. Therefore, change it:
 result = None if john is not None: boss = john.get_boss() if boss is not None: wage = boss.get_wage() if wage is not None: result = wage print(result) 

This code is optimal in terms of calculations, but is hard to read due to the three nested if . Therefore, we will try to use the same trick as in the previous examples. We define two functions:
 def unit(e): return e def bind(e, f): return None if e is None else f(e) 

And now we can build the entire solution into one line.
 print(bind(bind(unit(john), Employee.get_boss), Employee.get_wage)) 

As you probably already noticed, in this case we didn’t have to write the unit function: it simply returns the input parameter. But we will leave it so that it will be easier for us to summarize our experience later.

Note also that in Python we can use methods as functions, which allowed us to write Employee.get_boss(john) instead of john.get_boss() .

The following diagram shows the calculation process when John does not have a manager, that is, john.get_boss() returns None .



findings


Suppose we want to combine functions of the same type f1 , f2 , , fn . If their input parameters coincide in type with the results, we can use a simple chain of the form fn(… f2(f1(x)) …) . The following diagram shows a generalized calculation process with intermediate results, denoted as v1 , v2 , , vn .



Often this approach is not applicable. Types of input values ​​and function results may vary, as in the first example. Alternatively, the functions can be composable, but we want to insert additional logic between calls, as in Examples 2 and 3 we inserted an aggregation of intermediate values ​​and a null value check, respectively.

1. Imperative decision


In all the examples, we first used the most straightforward approach, which can be displayed in the following diagram:



Before calling f1 we did some initialization. In the first example, we initialized a variable to store the common log, in the second, for a list of intermediate values. After that, we interleaved the function calls with some kind of linking code: we calculated the aggregate values, checked the result to None .

2. Monads


As we saw with examples, imperative decisions have always suffered from verbosity, repetition, and confusing logic. To get a more elegant code, we used a certain design pattern, according to which we created two functions: unit and bind . This pattern is called a monad . The bind function contains the glue code while the unit performs initialization. This allows you to simplify the final decision to one line:
 bind(bind( ... bind(bind(unit(x), f1), f2) ... fn-1), fn) 

The calculation process can be represented by the diagram:



Calling unit(x) generates an initial value of v1 . Then bind(v1, f1) generates a new intermediate value v2 , which is used in the next call to bind(v2, f2) . This process continues until the final result is obtained. By defining different unit and bind within this template, we can combine various functions into a single chain of calculations. Monad libraries ( for example, PyMonad or OSlash, - approx. Transl. ) Usually contain ready-to-use monads (pairs of functions unit and bind ) to implement certain compositions of functions.

To chain functions, the values ​​returned by unit and bind must be of the same type as the bind input parameters. This type is called monadic . In terms of the above diagram, the type of variables v1 , v2 , , vn must be a monadic type.

Finally, consider how to improve the code written using monads. Obviously, repeated bind calls look inelegant. To avoid this, we define another external function:
 def pipeline(e, *functions): for f in functions: e = bind(e, f) return e 

Now instead of
 bind(bind(bind(bind(unit(x), f1), f2), f3), f4) 

we can use the following abbreviation:
 pipeline(unit(x), f1, f2, f3, f4) 


Conclusion


Monads are a simple and powerful design pattern used to compose functions. In declarative programming languages, it helps implement such imperative mechanisms as logging or input / output. In imperative languages
it helps to generalize and shorten the code linking a series of calls of functions of the same type.

This article provides only a superficial, intuitive understanding of monads. You can find out more by accessing the following sources:

  1. Wikipedia
  2. Monads in Python (with nice syntax!)
  3. Monad tutorials timeline

Source: https://habr.com/ru/post/445800/


All Articles