From the translator:
This is a free translation of Gabriel Gonzales ’s Purify code using free monads article on the use of free monads to represent code as a syntax tree, followed by a guided interpretation.
On Habré there are other articles Gabriel - "Cooperative flows from scratch in 33 lines on Haskell" and "What are good free monads . "
To read this article, you need to know what a free monad is and why it is a functor and a monad. You can learn about this in these two translations or in the article to which the author himself refers .
All comments of the translator are in italics.
For all comments related to the translation, contact the PM.Experienced Haskell programmers often advise beginners to make programs as
clean as possible.
A function is called pure if it is deterministic (the return value is uniquely determined by the values of all formal arguments) and has no side effects (that is, it does not change the state of the execution environment). In classical mathematics, λ-calculus and combinatorial logic, all functions are pure. Clean provides many practical benefits:
- you can formally prove some properties of the written code,
- in addition, you can easily review the code and say what it does,
- Finally, you can drive through QuickCheck .
For the demonstration, I will use such a simple
echo
program:
import System.Exit main = do x <- getLine putStrLn x exitSuccess putStrLn "Finished"
In the above program, however, there is one drawback: it mixes business logic and side effects. In a specific case, there is nothing bad in it, I always write simple programs that I can fully remember. However, I hope you will be interested in cool things, which are obtained when the side effects are separated from business logic.
Free monads
For starters, we need to clear our program. We can
always separate unclean areas from any code using
free monads . Free monads allow you to divide any unclean program into a pure representation of its behavior and a minimal unclean interpreter:
import Control.Monad.Free import System.Exit hiding (ExitSuccess) data TeletypeF x = PutStrLn String x | GetLine (String -> x) | ExitSuccess instance Functor TeletypeF where fmap f (PutStrLn str x) = PutStrLn str (fx) fmap f (GetLine k) = GetLine (f . k) fmap f ExitSuccess = ExitSuccess type Teletype = Free TeletypeF putStrLn' :: String -> Teletype () putStrLn' str = liftF $ PutStrLn str () getLine' :: Teletype String getLine' = liftF $ GetLine id exitSuccess' :: Teletype r exitSuccess' = liftF ExitSuccess run :: Teletype r -> IO r run (Pure r) = return r run (Free (PutStrLn str t)) = putStrLn str >> run t run (Free (GetLine f )) = getLine >>= run . f run (Free ExitSuccess ) = exitSuccess echo :: Teletype () echo = do str <- getLine' putStrLn' str exitSuccess' putStrLn' "Finished" main = run echo
The code rewritten in this way collects
all unclean actions in the
run
function. I prefer to call this process “code cleaning” because we clear all the logic of the program to clean code, leaving aside the minimal impure residue only in our
run
interpreter.
')
Proof of
It seems that we did not benefit much from the purification of the code, but paid for this pleasure with many lines of code
(and also with the overhead of memory and time) . However, now we can prove some properties of our code using
equational inference (equational reasoning - output through equivalent transformations) .
For example, anyone reading this must have noticed an obvious bug in the original
echo
program:
import System.Exit main = do x <- getLine putStrLn x exitSuccess putStrLn "Finished"
The last command will never be executed ... or will it be? How to prove it?
In fact, we cannot prove that the last command will never be executed, because it
is not . The author of the
System.Exit
module could easily change the definition of
exitSuccess
to
exitSuccess :: IO () exitSuccess = return ()
This program still passes type checking, but now it also successfully prints
Finished
to the console.
But for our cleaned version, we can prove that any command after
exitSuccess'
will never be executed:
exitSuccess' >> m = exitSuccess'
Let us prove with an equational conclusion, this most significant purification profit. The equational inference implies that we can take any expression and simply substitute the definitions of the component functions. Each step of the following proof is accompanied by a commentary that indicates the specific functional definition used to proceed to the next step.
exitSuccess' >> m -- exitSuccess' = liftF ExitSuccess = liftF ExitSuccess >> m -- m >> m' = m >>= \_ -> m' = liftF ExitSuccess >>= \_ -> m -- liftF f = Free (fmap Pure f) = Free (fmap Pure ExitSuccess) >>= \_ -> m -- fmap f ExitSuccess = ExitSuccess = Free ExitSuccess >>= \_ -> m -- Free m >>= f = Free (fmap (>>= f) m) = Free (fmap (>>= \_ -> m) ExitSuccess) -- fmap f ExitSuccess = ExitSuccess = Free ExitSuccess -- fmap f ExitSuccess = ExitSuccess = Free (fmap Pure ExitSuccess) -- liftF f = Free (fmap Pure f) = liftF ExitSuccess -- exitSuccess' = liftF ExitSuccess = exitSuccess'
Notice how in the last steps we turned over the equalities and moved back from the functional definition
(on the right) to the defined expression
(on the left) . Such a technique is completely admissible, because, due to purity, the equal sign in Haskell does not mean assignment, but it means really equality! This means that, talking about code in terms of equality, we can translate expressions in both directions with respect to the equal sign. Impressive!
For those who have heard about the λ-calculus (from the translator)In fact, equality in Haskell means not quite equality in the traditional sense. In λ-calculus there is a concept of reduction - a term transformation according to a certain rule. Formally speaking, this is just a binary relation on a set of terms. A term can be reduced to several different terms, and this ambiguity determines the multiplicity of reduction strategies (energetic - first the argument is reduced, then the expression; lazy - the whole expression first, then the argument as necessary; and the like). With a fixed reduction strategy, the term is either reduced in a certain way to another term, or not at all. The latter case is called the normal form of a term; we think of it as a completely completed calculation. The reductions themselves are generated by some rules. The best known are β- and η-reductions. In a sense, each functional equality (function definitions) express the same rules. When a calculation occurs, these rules work in one direction. If the rule is closed on itself, an infinite recursion occurs, like inf = inf :: a
. However, in the metatheory we can take this binary relation and close it by transitivity and symmetry, obtaining a binary reduction ratio relation. By the Church – Rosser theorem, the reduction equality M = N is valid if and only if there is a term (not necessarily in normal form) P such that both terms M and N are reduced to a common value P in a finite number of steps. Here’s how since such equalities are discussed in this article.
It is equally important that the above proof is valid, no matter how this free monad is interpreted. We can change the
run
function to any other interpreter, incl. clean and equality is still true
exitSuccess' >> m = exitSuccess'
This means that we can safely use it as a
rule override in the GHC (GHC rewrite rule) to optimize the passage through our program:
{-# RULES "exit" forall m. exitSuccess' >> m = exitSuccess' #-}
... and we can guarantee that the rule will always be safely applied and never broken.
Code Reasoning
I would like to prove that our program always displays the same string that it receives at the input. Unfortunately, this we also can not prove, because it
is not . The author
(the maintainer) of the
putStrLn
function could at any time change his mind and override
putStrLn str = return ()
In fact, we cannot even prove that for the cleaned free monad version this is done. The
run
function uses the same
putStrLn
, so the program is in any case subject to the same bug threat. Nevertheless, we can still prove something about the freest monad, considering it through a pure interpreter:
runPure :: Teletype r -> [String] -> [String] runPure (Pure r) xs = [] runPure (Free (PutStrLn yt)) xs = y:runPure t xs runPure (Free (GetLine k)) [] = [] runPure (Free (GetLine k)) (x:xs) = runPure (kx) xs runPure (Free ExitSuccess ) xs = []
Now we can prove
runPure echo = take 1
using the
take
function from the
Prelude
package of the Haskell98 standard. I leave this as an exercise for the reader, since this post is already quite long.
Pay attention to another important point. Exploring
echo
through the prism of clean code, we find a small detail that we did not initially see: the user can enter nothing! In this case, our
(last) interpreter should return an empty list, as well as the function
take _ [] = []
. Equational withdrawal makes us rigorous when a simple statement “our program always produces the same string it receives” is not enough. The more you work with such equalities, the more you improve the skill of reasoning about the properties of your code and find errors in your initial assumptions.
Equally important, equivalent transformations allow you to see your program in a completely new light. We saw that our program became notorious
take 1
after passing through
runPure
, which means that we can borrow our intuition about the
take
function to understand our program and detect such insidious minor bugs.
After we selected the pure part of the code with the
Free
monad and proved its consistency, it becomes a reliable kernel and all we have to do is compile the interpreter from now on. Thus, while we cannot fully prove the correctness of the entire program, we could prove the correctness of everything, with the exception of the interpreter. Equally important is the fact that we reduced the interpreter to an absolutely minimally prone form for attacks, which allows us to keep it in our head and support it manually.
Testing
We cannot prove anything about the code immersed in the
IO
monad. And how could this be done? One could say somehow: “if you compile the code, run the program and enter some single line, the same line will be returned back”, but this is not really an equation, therefore there is nothing strict in such a phrase. However, we can write it as a test for our program.
Unfortunately, unclean code tests do not scale to large and complex programs and, in test-dependent PLs, writing such tests occupies a significant fraction of the time of the entire programming cycle.
Fortunately, we can still easily test clean code with the
QuickCheck library , which
iterates over clean statements and checks them for incorrectness in a fully automatic mode.
For example, suppose you were too lazy to prove
runPure echo = take 1
. We can instead ask QuickCheck to test this proposition for us:
>>> import Test.QuickCheck >>> quickCheck (\xs -> runPure echo xs == take 1 xs) +++ OK, passed 100 tests.
Cool! You can test your code much more actively if it is as clean as possible.
Finally
Equational output works only for clean code, so clean parts of the code serve as a valid kernel, which you can:
- prove soundness , integrity and correctness (prove soundness) ,
- output the properties of his behavior (reason about behavior) ,
- actively test (aggressively test) .
That is why we always fight for the share of clean code in our code and minimize unclean parts.
Fortunately, the free monad
Free
guarantees us an easy achievement of the highest level of cleanliness as possible, and the least amount of unclean code. That is why all experienced Haskelist programmers should be fluent in free monads.
Thanks
Thanks
KolodeznyDiver for help with translation.