Explaining Haskell I/O without monads

This article explains how to do input and output in Haskell without trying to give any insight into monads in general. We start with the simplest example and gradually move on to more complex ones. You can read the article to the end or stop after any section: each new section equips you for a new kind of task. We assume familiarity with the basics of Haskell, roughly chapters 1 through 6 of Graham Hutton's book "Programming in Haskell". [Translator's note: these are the chapters “Introduction”, “First steps”, “Types and classes”, “Defining functions”, “List comprehensions”, “Recursive functions”.]

Main functions

In this tutorial we will use four standard I/O functions:

readFile :: FilePath -> IO String - reads a file
writeFile :: FilePath -> String -> IO () - writes a string to a file
getArgs :: IO [String] - returns the command-line arguments (from the System.Environment module)
putStrLn :: String -> IO () - prints a string to the console, followed by a newline

Simple input/output

The simplest useful form of I/O: read a file, do something with its contents, and then write the result to a file.

 main :: IO ()
 main = do
    src <- readFile "file.in"
    writeFile "file.out" (operate src)

 operate :: String -> String
 operate = ...  -- your function goes here

This program reads file.in, applies the operate function to its contents, and writes the result to file.out. The main function contains all the I/O operations, and the operate function is pure. When writing operate, you do not need to understand any I/O details. For my first two years of programming in Haskell, I used only this model, and it was quite enough.
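As a concrete illustration of this pattern, here is a minimal sketch in which operate upper-cases the whole file. The choice of toUpper is only an assumption for the sake of the example; any function of type String -> String fits. The file file.in must exist in the current directory when the program runs.

```haskell
import Data.Char (toUpper)

-- Pure part: an example transformation (assumed for illustration).
operate :: String -> String
operate = map toUpper

-- I/O part: read the input, transform it, write the output.
main :: IO ()
main = do
    src <- readFile "file.in"
    writeFile "file.out" (operate src)
```

Because operate is pure, it can be tried out in GHCi on ordinary strings without touching any files.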

Action list

If the pattern described in the previous section is not sufficient for your tasks, the next step is to use a list of actions. The main function can be written like this:

 main :: IO ()
 main = do
     x1 <- expr1
     x2 <- expr2
     ...
     xN <- exprN
     return ()

First comes the do keyword, then a sequence of statements xI <- exprI , and everything ends with return () . In each statement, the left side of the arrow is a pattern (most often just a variable) of some type t , and the right side is an expression of type IO t . Variables bound by a pattern can be used in subsequent statements. If you want to use an expression whose type is not of the form IO t , you need to write xI <- return (exprI) . The function return :: a -> IO a takes any value and wraps it in the IO type.

As a simple example, we can write a program that takes command-line arguments, reads the file specified by the first argument, works with its contents, and then writes to the file specified by the second argument:

 import System.Environment (getArgs)

 main :: IO ()
 main = do
     [arg1, arg2] <- getArgs
     src <- readFile arg1
     res <- return (operate src)
     _ <- writeFile arg2 res
     return ()

operate is still a pure function. The first line after do extracts the command-line arguments using pattern matching. The second line reads the file whose name is given by the first argument. The third line uses return to bring the pure value operate src into the action list. The fourth line writes the result to a file; writeFile produces no useful result, so we ignore it by writing _ <- .
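To experiment with this program without a real command line, one option is withArgs from System.Environment, which runs an action with a substitute argument list. The sketch below is only an illustration: the helper demo, the file names demo.in and demo.out, and the upper-casing operate are all assumptions. Note that the pattern [arg1, arg2] matches exactly two arguments; with any other number the program stops with a pattern-match failure.

```haskell
import Data.Char (toUpper)
import System.Environment (getArgs, withArgs)

-- Placeholder transformation (assumed for illustration).
operate :: String -> String
operate = map toUpper

main :: IO ()
main = do
    [arg1, arg2] <- getArgs
    src <- readFile arg1
    res <- return (operate src)
    _ <- writeFile arg2 res
    return ()

-- Hypothetical helper: runs main as if it had been started with two arguments,
-- then returns the contents of the output file.
demo :: IO String
demo = do
    _ <- writeFile "demo.in" "hello\n"
    _ <- withArgs ["demo.in", "demo.out"] main
    out <- readFile "demo.out"
    return out
```

In GHCi, running demo creates both files and returns the transformed text.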

Simplifying I/O

This action list pattern is very rigid, and people usually simplify the code with the following three rules:

Instead of _ <- x you can simply write x .
If the last statement before return () has no binding arrow ( <- ) and its expression has type IO () , the final return () line can be deleted.
x <- return y can be replaced by let x = y (provided you do not reuse variable names).

Using these rules, we can rewrite our example:

 import System.Environment (getArgs)

 main :: IO ()
 main = do
     [arg1, arg2] <- getArgs
     src <- readFile arg1
     let res = operate src
     writeFile arg2 res
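The caveat about reusing variable names in the third rule deserves a small illustration. In x <- return y the right-hand side sees the earlier bindings, whereas let is recursive, so the two forms differ when the same name appears on both sides. The function bumped below is a hypothetical example, not part of the program above.

```haskell
bumped :: IO Int
bumped = do
    n <- return 1
    n <- return (n + 1)   -- the right-hand side sees the earlier n, so the new n is 2
    -- Writing  let n = n + 1  here instead would define n in terms of itself,
    -- and evaluating it would never yield a result.
    return n

main :: IO ()
main = do
    v <- bumped
    print v   -- prints 2
```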

Nested I/O

So far only the main function has had an IO type, but we can create new functions of this type to avoid repeating code. For example, we can write a helper function that prints nicely formatted headings:

 title :: String -> IO ()
 title str = do
     putStrLn str
     putStrLn (replicate (length str) '-')
     putStrLn ""

We can use this function several times inside main:

 main :: IO ()
 main = do
     title "Hello"
     title "Goodbye"

Returning values in IO

Until now, all the functions we wrote had type IO () , which lets us perform input/output actions but not produce interesting results. To return a value x , we write return x as the last line of the do block. Unlike return in imperative languages, this return must be on the last line.

 import System.Environment (getArgs)

 readArgs :: IO (String, String)
 readArgs = do
     xs <- getArgs
     let x1 = if length xs > 0 then xs !! 0 else "file.in"
     let x2 = if length xs > 1 then xs !! 1 else "file.out"
     return (x1, x2)

This function returns the first two command line arguments, or defaults if there are fewer than two arguments on the command line. Now we can use it in our program:

 main :: IO ()
 main = do
     (arg1, arg2) <- readArgs
     src <- readFile arg1
     let res = operate src
     writeFile arg2 res

Now, if fewer than two arguments are given, the program will not crash but will use the default file names.
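The defaulting logic itself involves no I/O, so it can also be moved into a pure function and written with pattern matching. The following is a sketch under that assumption; the name pickArgs is hypothetical, and the final putStrLn merely stands in for the real work.

```haskell
import System.Environment (getArgs)

-- Pure helper (hypothetical name): choose file names, falling back to defaults.
pickArgs :: [String] -> (String, String)
pickArgs (x1:x2:_) = (x1, x2)
pickArgs [x1]      = (x1, "file.out")
pickArgs []        = ("file.in", "file.out")

readArgs :: IO (String, String)
readArgs = do
    xs <- getArgs
    return (pickArgs xs)

main :: IO ()
main = do
    (arg1, arg2) <- readArgs
    putStrLn ("Reading " ++ arg1 ++ ", writing " ++ arg2)
```

With this split, pickArgs can be checked on plain lists, and readArgs shrinks to fetching the arguments and returning the result.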

Choosing I/O actions

So far, we have only seen a static list of I/O statements that are executed in order. With if we can choose which actions to perform. For example, if the user has not entered any arguments, we can report this:

 import System.Environment (getArgs)

 main :: IO ()
 main = do
     xs <- getArgs
     if null xs then do
         putStrLn "You entered no arguments"
      else do
         putStrLn ("You entered " ++ show xs)

To choose between actions, make an if the last statement in the do block and continue with do in each of its branches. The only subtle point is that else must be indented at least one space more than if . This was widely regarded as a flaw in the definition of Haskell 98; Haskell 2010 relaxed the rule, so current compilers also accept else aligned with if , but the extra space still works everywhere.
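Branches can also return values, as long as both branches have the same type. The sketch below illustrates this with a hypothetical helper, chooseName, which reports when it falls back to a default name; the default file.in is borrowed from the earlier examples.

```haskell
import System.Environment (getArgs)

-- Hypothetical helper: both branches have type IO String.
chooseName :: [String] -> IO String
chooseName xs = do
    if null xs then do
        putStrLn "No arguments, using file.in"
        return "file.in"
     else do
        return (head xs)

main :: IO ()
main = do
    xs <- getArgs
    name <- chooseName xs
    putStrLn ("Input file: " ++ name)
```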

Respite

If you started reading without knowing anything about I/O in Haskell and have got this far, I advise you to take a break (have some tea and cake; you deserve it). The functionality described above is everything that imperative languages offer, and it is a useful starting point. Just as functional programming provides much more powerful ways of working with functions by treating them as values, it also lets us treat I/O actions as values, which is what we will do in the rest of the article.

Working with IO values

Until now, every statement was executed immediately, but we can also create variables of type IO. Using the title function defined above, we can write:

 main :: IO ()
 main = do
     let x = title "Welcome"
     x
     x
     x

Instead of performing the action via <- , we put the IO value itself into the variable x . x has type IO () , so we can now write x on a line of its own to perform the action stored in it. By writing x three times, we perform the action three times.
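To make the difference between defining an action and performing it observable, the following sketch counts how often a stored action runs. It relies on IORef from Data.IORef, a mutable reference that this article does not otherwise cover, so treat it as an assumption made purely for illustration; countRuns and bump are hypothetical names.

```haskell
import Data.IORef (newIORef, modifyIORef, readIORef)

-- Returns the counter value before and after running the stored action twice.
countRuns :: IO (Int, Int)
countRuns = do
    counter <- newIORef 0
    let bump = modifyIORef counter (+ 1)  -- an IO value; nothing runs yet
    before <- readIORef counter           -- still 0: defining bump did not run it
    bump
    bump
    after <- readIORef counter            -- 2: the action was performed twice
    return (before, after)

main :: IO ()
main = do
    result <- countRuns
    print result   -- prints (0,2)
```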

Passing actions as arguments

We can also pass IO values as arguments to functions. In the previous example, we performed the action title "Welcome" three times, but how could we execute it fifty times? We can write a function that takes an action and a number, and performs this action the appropriate number of times:

 replicateM_ :: Int -> IO () -> IO ()
 replicateM_ n act = do
     if n == 0 then do
         return ()
      else do
         act
         replicateM_ (n-1) act

Here we used action selection to decide when to stop and recursion to keep going. Now we can rewrite the previous example like this:

 main :: IO ()
 main = do
     let x = title "Welcome"
     replicateM_ 3 x

Of course, the for statement in imperative languages lets you do the same thing as replicateM_ , but Haskell's flexibility lets you define new control structures yourself, which is a very powerful tool. The replicateM_ function defined in Control.Monad is similar to ours but more general, so you can use it instead of our version.
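One detail worth noting: the hand-written version tests n == 0, so a negative count would never reach the base case. The library replicateM_ performs no repetitions for non-positive counts. A minimal sketch of a hand-written variant with the same behaviour follows; the name repeatAction is hypothetical, chosen to avoid clashing with Control.Monad.

```haskell
-- Performs act n times; counts of zero or less perform nothing.
repeatAction :: Int -> IO () -> IO ()
repeatAction n act =
    if n <= 0 then
        return ()
     else do
        act
        repeatAction (n - 1) act

main :: IO ()
main = repeatAction 3 (putStrLn "Welcome")
```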

IO in data structures

We have seen how IO values are passed as an argument, so it’s not surprising that we can put them in data structures, such as lists and tuples. The sequence_ function takes a list of actions and executes them in turn:

 sequence_ :: [IO ()] -> IO ()
 sequence_ xs = do
     if null xs then do
         return ()
      else do
         head xs
         sequence_ (tail xs)

If the list is empty, sequence_ finishes with return () . If the list has elements, sequence_ takes the first action with head xs and executes it, then calls itself on the rest of the list, tail xs . Like replicateM_ , sequence_ is already available in Control.Monad in a more general form. Now you can easily rewrite replicateM_ using sequence_ :

 replicateM_ :: Int -> IO () -> IO ()
 replicateM_ n act = sequence_ (replicate n act)
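To try these definitions in a file of your own, keep in mind that the Prelude already exports a function named sequence_, so the standard one has to be hidden before a local definition can be used. The sketch below shows one way to do that; the actions in main are arbitrary examples.

```haskell
import Prelude hiding (sequence_)

-- Our own version, as defined in the article.
sequence_ :: [IO ()] -> IO ()
sequence_ xs = do
    if null xs then do
        return ()
     else do
        head xs
        sequence_ (tail xs)

replicateM_ :: Int -> IO () -> IO ()
replicateM_ n act = sequence_ (replicate n act)

main :: IO ()
main = do
    sequence_ [putStrLn "one", putStrLn "two"]
    replicateM_ 2 (putStrLn "again")
```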

Pattern Matching

In Haskell, pattern matching is much more natural than null/head/tail. If a do block contains exactly one statement, the word do can be removed. In the definition of sequence_ , this can be done after the equals sign and after then .

 sequence_ :: [IO ()] -> IO ()
 sequence_ xs =
     if null xs then
         return ()
      else do
         head xs
         sequence_ (tail xs)

Now we can replace the if with pattern matching, just as in any similar situation, without worrying about IO:

 sequence_ :: [IO ()] -> IO ()
 sequence_ [] = return ()
 sequence_ (x:xs) = do
     x
     sequence_ xs

Final example

As a final example, imagine that we want to perform some operations with each file specified on the command line. Using what we have learned, we can write:

 import System.Environment (getArgs)

 main :: IO ()
 main = do
     xs <- getArgs
     sequence_ (map operateFile xs)

 operateFile :: FilePath -> IO ()
 operateFile x = do
     src <- readFile x
     writeFile (x ++ ".out") (operate src)

 operate :: String -> String
 operate = ...
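The combination sequence_ (map f xs) is common enough that the Prelude offers it as mapM_ f xs. Here is a sketch of the same program using mapM_, with an upper-casing operate filled in purely as a placeholder assumption.

```haskell
import Data.Char (toUpper)
import System.Environment (getArgs)

-- Placeholder transformation (assumed for illustration).
operate :: String -> String
operate = map toUpper

operateFile :: FilePath -> IO ()
operateFile x = do
    src <- readFile x
    writeFile (x ++ ".out") (operate src)

main :: IO ()
main = do
    xs <- getArgs
    mapM_ operateFile xs   -- same effect as sequence_ (map operateFile xs)
```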

Designing I/O in a program

A Haskell program usually consists of an outer shell of actions that calls pure functions. In the previous example, main and operateFile are part of the shell, while operate and all the functions it uses are pure. As a general design principle, try to make the action layer as thin as possible. The shell should do the necessary input and output briefly, and the main work should be left to the pure part. Explicit I/O is necessary in Haskell, but it should be kept to a minimum: pure Haskell is much prettier.
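As a small sketch of this principle, the program below numbers the lines of a file. All of the real work sits in the pure function numberLines (a hypothetical example), and the shell only reads and writes; the file names follow the earlier examples.

```haskell
-- Pure part: prefix every line with its line number.
numberLines :: String -> String
numberLines = unlines . zipWith label [1 :: Int ..] . lines
  where
    label n line = show n ++ ": " ++ line

-- Thin shell: just the reading and writing.
main :: IO ()
main = do
    src <- readFile "file.in"
    writeFile "file.out" (numberLines src)
```

For instance, numberLines "a\nb\n" evaluates to "1: a\n2: b\n", which can be checked in GHCi without any files.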

What's next

Now you are ready to do any I/O that your program needs. To consolidate your skills, I advise you to do something from the following list:

Write a lot of Haskell code.
Read chapters 8 and 9 of “Programming in Haskell”. Expect to spend about 6 hours thinking about sections 8.1 to 8.4 (I recommend going to a hospital with a minor injury).
Read Monads as containers, an excellent introduction to monads.
Look at the documentation on the monad laws, and find where I used them in this article.
Read the documentation for all the functions in Control.Monad, try to implement them, and then use them when writing programs.
Implement and use the state monad.

Source: https://habr.com/ru/post/80396/
