A REST server for a simple blog in Haskell

Some time ago I got thoroughly tired of dynamically typed languages and decided to learn something brutally static. Haskell appealed to me with the beauty of its code and its uncompromising separation of pure functions from side effects. I gulped down a few Haskell books and decided it was time to write something.

And here I was disappointed: I could not write anything beyond hello world. That is, I had some idea how to write a console utility like find, but my very first encounter with IO destroyed all my illusions. There are plenty of libraries for Haskell, but almost none of them are documented, and examples of solving typical problems are also scarce.

The symptoms are clear and the diagnosis is simple: lack of practice. With Haskell this is especially painful, because the language is extremely unusual. Even knowing Clojure quite well did not help me at all, since Clojure focuses more on functions, while Haskell focuses on their types.


I think many newcomers to Haskell suffer from this lack of practice. Writing something with no interface at all is not much fun, but building a desktop or web application is quite difficult for a novice Haskeller. In this article I offer a simple example of a web application server in Haskell, aimed at those who want to practice the language but do not know where to start.

For the most impatient: the source is here.

I should say up front: this is not yet another Yesod tutorial. That framework dictates its own ideas about how web applications should be built, and I do not agree with all of them. Instead, the foundation will be Scotty, a small library that offers a nice syntax for describing routes on top of the Warp web server.

Task

Develop a web application server for a simple blog. The following routes will be available:

GET /articles - list of articles.
GET /articles/:id - a single article.
POST /admin/articles - create an article.
PUT /admin/articles - update an article.
DELETE /admin/articles/:id - delete an article.

All routes that begin with "/admin" require user authentication. For a stateless service, Basic authentication is very convenient, since each request carries the username and password.
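To make the point about every request carrying the credentials concrete, here is a minimal sketch of how a client could build the Authorization header for Basic authentication. The credentials admin/secret and the function name basicAuthHeader are hypothetical, and the base64-bytestring package is an assumption: it is not in this project's dependency list.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Base64 as B64 -- assumption: base64-bytestring package
import qualified Data.ByteString.Char8 as B

-- Build the value of the Authorization header:
-- "Basic " followed by base64("user:password").
-- The server keeps no session; it re-checks this header on every request.
basicAuthHeader :: B.ByteString -> B.ByteString -> B.ByteString
basicAuthHeader user password =
  B.append "Basic " (B64.encode (B.concat [user, ":", password]))

main :: IO ()
main = B.putStrLn (basicAuthHeader "admin" "secret")
-- prints: Basic YWRtaW46c2VjcmV0
```

Since the credentials are only base64-encoded and not encrypted, Basic authentication is normally used over HTTPS.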

What is needed?

Some basic Haskell knowledge: a general understanding of monads and functors, program structure, I/O, and so on.
The Cabal utility: the ability to use sandboxes, add libraries, and compile and run a project.
MySQL and the most basic knowledge of it.

Architecture

I propose to use the following libraries to implement the architecture.

Web server - Warp.
Router - Scotty.
Application configuration - configurator.
Database access - mysql and mysql-simple.
Database connection pool - resource-pool.
Client interaction - REST using JSON; library - aeson.
Basic authentication - wai-extra, since the application will be stateless.

Let's break our application into modules.

Main.hs will contain the code for running the application, the router and the application configuration.
Db.hs - everything related to access to the database.
View.hs - data presentation.
Domain.hs - types and functions for working with the domain.
Auth.hs - functions for authentication.

Getting started

Let's create a simple cabal project for our application.

 mkdir hblog
 cd hblog
 cabal init

You will be asked a few questions here: choose Executable as the project type, Main.hs as the main file, and src as the source directory. These are the libraries that need to be added to build-depends in the hblog.cabal file:

 base >= 4.6 && < 4.7
 , scotty >= 0.9.1
 , bytestring >= 0.9 && < 0.11
 , text >= 0.11 && < 2.0
 , mysql >= 0.1.1.8
 , mysql-simple >= 0.2.2.5
 , aeson >= 0.6 && < 0.9
 , HTTP >= 4000.2.19
 , transformers >= 0.4.3.0
 , wai >= 3.0.2.3
 , wai-middleware-static >= 0.7.0.1
 , wai-extra >= 3.0.7
 , resource-pool >= 0.2.3.2
 , configurator >= 0.3.0.0
 , MissingH >= 1.3.0.1

Now, to avoid version hell among the libraries and their dependencies, let's create a sandbox.

 cabal sandbox init
 cabal install --dependencies-only

Remember to create the src/Main.hs file.

Let's see how a minimal web application on Scotty works. The documentation and examples for this micro-framework are very good, so everything becomes clear at first glance. And if you have experience with Sinatra, Compojure or Scalatra, consider yourself lucky, because that experience carries over completely.

Here is the minimal src/Main.hs:

 {-# LANGUAGE OverloadedStrings #-}
 import Web.Scotty
 import Data.Monoid (mconcat)

 main = scotty 3000 $ do
   get "/:word" $ do
     beam <- param "word"
     html $ mconcat ["<h1>Scotty, ", beam, " me up!</h1>"]

The first line of the code may puzzle a beginner: what are these overloaded strings? I'll explain.

Like many others, I started learning Haskell from the books "Learn You a Haskell for Great Good!" and "Real World Haskell", and I immediately ran into big trouble with text handling. The best description of working with text in Haskell that I found is in Chapter 10 of the book Beginning Haskell.

In short, three basic string types are used in practice:

String - a list of characters. This type is built into the language.
Text - a type for Unicode text (ASCII included). It lives in the text library and comes in two flavors: strict and lazy.
ByteString - a sequence of bytes, used to serialize strings into a byte stream. It is provided by the bytestring library, also in strict and lazy versions.

Let's return to the OverloadedStrings pragma. With several string types around, the source code fills up with calls like T.pack "Hello", where the "Hello" literal must be converted to Text, or B.pack "Hello", where the literal must be converted to a ByteString. The OverloadedStrings extension removes this syntactic noise: it lets a string literal take whichever string type the context requires.
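A minimal sketch of the effect: with the extension enabled, the same literal can be used as a String, a strict Text or a strict ByteString, and the type annotation alone decides which one it becomes. The names greetingString, greetingText and greetingBytes are made up for the illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B
import qualified Data.Text as T

-- The same literal, three different types; no T.pack or B.pack needed.
greetingString :: String
greetingString = "Hello"

greetingText :: T.Text
greetingText = "Hello"

greetingBytes :: B.ByteString
greetingBytes = "Hello"

main :: IO ()
main = do
  -- The overloaded literals equal the explicitly packed values.
  print (T.pack greetingString == greetingText)  -- True
  print (B.pack greetingString == greetingBytes) -- True
```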

Main.hs file

Main function:

 main :: IO ()
 main = do
   -- Read application.conf and extract the database settings from it
   loadedConf <- C.load [C.Required "application.conf"]
   dbConf <- makeDbConfig loadedConf
   case dbConf of
     Nothing -> putStrLn "No database configuration found, terminating..."
     Just conf -> do
       -- Connection pool: 1 stripe, unused connections live 5 seconds,
       -- at most 10 connections per stripe
       pool <- createPool (newConn conf) close 1 5 10
       -- Start Scotty
       scotty 3000 $ do
         -- Serve static files from the "static" directory
         middleware $ staticPolicy (noDots >-> addBase "static")
         -- Request logging; logStdout is the alternative to logStdoutDev
         middleware $ logStdoutDev
         -- Basic authentication for the protected routes
         middleware $ basicAuth (verifyCredentials pool)
           "Haskell Blog Realm" { authIsProtected = protectedResources }

         get "/articles" $ do
           articles <- liftIO $ listArticles pool
           articlesList articles

         -- Read the :id route parameter and look the article up in the database
         get "/articles/:id" $ do
           id <- param "id" :: ActionM TL.Text
           maybeArticle <- liftIO $ findArticle pool id
           viewArticle maybeArticle

         -- Parse an Article from the request body and store it in the database
         post "/admin/articles" $ do
           article <- getArticleParam
           insertArticle pool article
           createdArticle article

         put "/admin/articles" $ do
           article <- getArticleParam
           updateArticle pool article
           updatedArticle article

         delete "/admin/articles/:id" $ do
           id <- param "id" :: ActionM TL.Text
           deleteArticle pool id
           deletedArticle id

To configure the application we use the configurator package. The configuration is stored in the application.conf file; here are its contents:

 database {
   name = "hblog"
   user = "hblog"
   password = "hblog"
 }

For the connection pool we use the resource-pool library. Opening a database connection is expensive, so rather than creating one for every request it is better to reuse existing connections. The type of the createPool function is:

 createPool :: IO a -> (a -> IO ()) -> Int -> NominalDiffTime -> Int -> IO (Pool a)
 createPool create destroy numStripes idleTime maxResources

Here, create and destroy are the functions that open and close a database connection, numStripes is the number of separate sub-pools (stripes), idleTime is the lifetime of an unused connection in seconds, and maxResources is the maximum number of connections per sub-pool.
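A minimal sketch of how these parameters are used, with a fake String "connection" instead of a real MySQL one so that it runs without a database. It assumes the createPool signature shown above (resource-pool 0.2.x; as far as I know, newer releases keep createPool only as a deprecated compatibility function). The create and destroy actions are hypothetical stand-ins for newConn and close.

```haskell
module Main where

import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.Pool (createPool, withResource)

main :: IO ()
main = do
  -- Count how many times the pool actually opens a "connection".
  opened <- newIORef (0 :: Int)
  let create = modifyIORef' opened (+ 1) >> return "fake connection"
      destroy _ = return ()
  -- 1 stripe, idle resources kept for 5 seconds, at most 10 per stripe
  pool <- createPool create destroy 1 5 10
  -- Borrow a resource twice; it is handed back to the pool after each use.
  len1 <- withResource pool (\conn -> return (length conn))
  len2 <- withResource pool (\conn -> return (length conn))
  print (len1 == len2)
  -- The second call is expected to reuse the first connection,
  -- so this should print 1 rather than 2.
  readIORef opened >>= print
```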

To open a connection, use the function newConn (from Db.hs).

 data DbConfig = DbConfig
   { dbName :: String
   , dbUser :: String
   , dbPassword :: String
   } deriving (Show, Generic)

 newConn :: DbConfig -> IO Connection
 newConn conf = connect defaultConnectInfo
   { connectUser = dbUser conf
   , connectPassword = dbPassword conf
   , connectDatabase = dbName conf
   }

Well, DbConfig itself is created like this:

 makeDbConfig :: C.Config -> IO (Maybe Db.DbConfig)
 makeDbConfig conf = do
   name <- C.lookup conf "database.name" :: IO (Maybe String)
   user <- C.lookup conf "database.user" :: IO (Maybe String)
   password <- C.lookup conf "database.password" :: IO (Maybe String)
   return $ DbConfig <$> name <*> user <*> password

The input is the Data.Configurator.Config that we read and parsed from application.conf, and the output is a Maybe DbConfig wrapped in IO.

This notation may look a bit cryptic to beginners, so I will try to explain what is going on.

The expression C.lookup conf "database.name" has type IO (Maybe String). The value can be extracted from IO like this:

 name <- C.lookup conf "database.name" :: IO (Maybe String)

Accordingly, name, user and password all have type Maybe String.

The DbConfig data constructor type is:

 DbConfig :: String -> String -> String -> DbConfig

This function takes three strings and returns a DbConfig.

The type of the function (<$>) is:

 (<$>) :: Functor f => (a -> b) -> f a -> f b

That is, it takes an ordinary function and a functor, and returns a functor with the function applied to its value. In short, it is a regular map.

The expression DbConfig <$> name takes the string out of name (whose type is Maybe String), supplies it as the first argument of the DbConfig constructor, and returns the partially applied DbConfig wrapped in Maybe:

 DbConfig <$> name :: Maybe (String -> String -> DbConfig)

Note that the function now takes one string fewer.

The type of (<*>) is similar to that of <$>:

 (<*>) :: Applicative f => f (a -> b) -> f a -> f b

It takes a functor whose value is a function and another functor, applies the function from the first to the value in the second, and returns a new functor.

Thus, the expression DbConfig <$> name <*> user has the type:

 DbConfig <$> name <*> user :: Maybe (String -> DbConfig)

The last String parameter remains, which we fill with the password:

 DbConfig <$> name <*> user <*> password :: Maybe DbConfig
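The whole chain can be tried without any configuration file. The sketch below uses only base; mkConfig is a hypothetical helper that mirrors the last line of makeDbConfig. It also shows the part that is easy to miss: if any of the three values is Nothing, the result is Nothing.

```haskell
module Main where

import Control.Applicative ((<$>), (<*>))

data DbConfig = DbConfig
  { dbName :: String
  , dbUser :: String
  , dbPassword :: String
  } deriving (Show, Eq)

-- Apply the three-argument constructor inside Maybe, one argument at a time.
mkConfig :: Maybe String -> Maybe String -> Maybe String -> Maybe DbConfig
mkConfig name user password = DbConfig <$> name <*> user <*> password

main :: IO ()
main = do
  -- All three settings present: we get a complete DbConfig.
  print (mkConfig (Just "hblog") (Just "hblog") (Just "secret"))
  -- One setting missing: the whole result is Nothing.
  print (mkConfig (Just "hblog") Nothing (Just "secret"))
```

This is why main only has to handle two cases, Nothing and Just conf, instead of checking each setting separately.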

Authentication

One complex construct in the main function remains: the basicAuth middleware. The type of the basicAuth function is:

 basicAuth :: CheckCreds -> AuthSettings -> Middleware

The first parameter is a function that checks the user's credentials against the database; the second determines, among other things, which routes require authentication. Their types are:

 type CheckCreds = ByteString -> ByteString -> IO Bool

 data AuthSettings = AuthSettings
   { authRealm :: !ByteString
   , authOnNoAuth :: !(ByteString -> Application)
   , authIsProtected :: !(Request -> IO Bool)
   }

The AuthSettings data type is fairly involved; if you want to dig deeper, see the wai-extra source. We are only interested in one field here, authIsProtected: a function that decides, given a Request, whether authentication is required. Here is its implementation for our blog:

 protectedResources :: Request -> IO Bool
 protectedResources request = do
   let path = pathInfo request
   return $ protect path
   where
     protect (p : _) = p == "admin"
     protect _ = False

The pathInfo function has the following type:

 pathInfo :: Request -> [Text]

It takes a Request and returns the list of strings obtained by splitting the request path on the "/" delimiter.

Thus, if the request path starts with "/admin", protectedResources returns True (in IO), and authentication is required.

The verifyCredentials function, which checks the username and password, belongs to the database layer, so it is described below.
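The matching logic can be tried in isolation as a pure function over the path segments that pathInfo would return. This is only a sketch: isProtected is a hypothetical name, and the segment lists are hand-written examples rather than real requests.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)

-- A path is protected when its first segment is "admin".
isProtected :: [Text] -> Bool
isProtected (p : _) = p == "admin"
isProtected _ = False

main :: IO ()
main = do
  print (isProtected ["admin", "articles"]) -- True:  /admin/articles
  print (isProtected ["articles", "42"])    -- False: /articles/42
  print (isProtected [])                    -- False: /
```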

Database interaction

Utility functions for fetching data from the database through the connection pool:

 fetchSimple :: QueryResults r => Pool M.Connection -> Query -> IO [r]
 fetchSimple pool sql = withResource pool retrieve
   where retrieve conn = query_ conn sql

 fetch :: (QueryResults r, QueryParams q) => Pool M.Connection -> q -> Query -> IO [r]
 fetch pool args sql = withResource pool retrieve
   where retrieve conn = query conn sql args

The fetchSimple function is for queries without parameters, and fetch is for queries with parameters. Data can be modified with the execSql function:

 execSql :: QueryParams q => Pool M.Connection -> q -> Query -> IO Int64
 execSql pool args sql = withResource pool ins
   where ins conn = execute conn sql args

If you need to use a transaction, here is the execSqlT function:

 execSqlT :: QueryParams q => Pool M.Connection -> q -> Query -> IO Int64
 execSqlT pool args sql = withResource pool ins
   where ins conn = withTransaction conn $ execute conn sql args

Using fetch you can, for example, look up a user's password hash in the database by login:

 findUserByLogin :: Pool Connection -> String -> IO (Maybe String)
 findUserByLogin pool login = do
   res <- liftIO $ fetch pool (Only login) "SELECT * FROM user WHERE login=?" :: IO [(Integer, String, String)]
   return $ password res
   where
     password [(_, _, pwd)] = Just pwd
     password _ = Nothing

It is used in the Auth.hs module:

 verifyCredentials :: Pool Connection -> B.ByteString -> B.ByteString -> IO Bool
 verifyCredentials pool user password = do
   pwd <- findUserByLogin pool (BC.unpack user)
   return $ comparePasswords pwd (BC.unpack password)
   where
     comparePasswords Nothing _ = False
     comparePasswords (Just p) password = p == (md5s $ Str password)

As you can see, if a password hash is found in the database, it is compared with the MD5 hash of the password from the request.

The database stores not only users but also the articles that the blog must be able to create, edit and display. In the Domain.hs file we define the Article data type with the fields id, title and bodyText:

 data Article = Article Integer Text Text deriving (Show)

Now you can define CRUD functions in the database for this type:

 listArticles :: Pool Connection -> IO [Article]
 listArticles pool = do
   res <- fetchSimple pool "SELECT * FROM article ORDER BY id DESC" :: IO [(Integer, TL.Text, TL.Text)]
   return $ map (\(id, title, bodyText) -> Article id title bodyText) res

 findArticle :: Pool Connection -> TL.Text -> IO (Maybe Article)
 findArticle pool id = do
   res <- fetch pool (Only id) "SELECT * FROM article WHERE id=?" :: IO [(Integer, TL.Text, TL.Text)]
   return $ oneArticle res
   where
     oneArticle ((id, title, bodyText) : _) = Just $ Article id title bodyText
     oneArticle _ = Nothing

 insertArticle :: Pool Connection -> Maybe Article -> ActionT TL.Text IO ()
 insertArticle pool Nothing = return ()
 insertArticle pool (Just (Article id title bodyText)) = do
   liftIO $ execSqlT pool [title, bodyText]
     "INSERT INTO article(title, bodyText) VALUES(?,?)"
   return ()

 updateArticle :: Pool Connection -> Maybe Article -> ActionT TL.Text IO ()
 updateArticle pool Nothing = return ()
 updateArticle pool (Just (Article id title bodyText)) = do
   liftIO $ execSqlT pool [title, bodyText, (TL.decodeUtf8 $ BL.pack $ show id)]
     "UPDATE article SET title=?, bodyText=? WHERE id=?"
   return ()

 deleteArticle :: Pool Connection -> TL.Text -> ActionT TL.Text IO ()
 deleteArticle pool id = do
   liftIO $ execSqlT pool [id] "DELETE FROM article WHERE id=?"
   return ()

The most important functions here are insertArticle and updateArticle. They take a Maybe Article and insert or update the corresponding row in the database. But where does this Maybe Article come from?

Quite simply: the client must send the Article encoded as JSON in the body of a PUT or POST request. Here are the instances for decoding an Article from JSON and encoding it to JSON:

 instance FromJSON Article where
   parseJSON (Object v) = Article <$>
     v .:? "id" .!= 0 <*>
     v .: "title" <*>
     v .: "bodyText"
   -- Any JSON value other than an object is a parse failure
   -- (mzero comes from Control.Monad)
   parseJSON _ = mzero

 instance ToJSON Article where
   toJSON (Article id title bodyText) =
     object ["id" .= id, "title" .= title, "bodyText" .= bodyText]

We use the aeson library to handle JSON.

As you can see, the id field is optional when decoding: if it is missing from the JSON, the default value 0 is substituted. A request that creates an Article carries no id, since the database generates it itself; an update request does include the id.
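The optional id can be checked with a small standalone program. This is a sketch, not the project's Domain.hs: it redefines Article locally, uses aeson's withObject instead of matching on Object directly, and the JSON bodies are made-up examples.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Applicative ((<$>), (<*>))
import Data.Aeson (FromJSON (..), decode, withObject, (.!=), (.:), (.:?))
import Data.ByteString.Lazy.Char8 () -- string literals as lazy ByteString
import Data.Text.Lazy (Text)

data Article = Article Integer Text Text
  deriving (Show, Eq)

instance FromJSON Article where
  parseJSON = withObject "Article" $ \v ->
    Article <$> v .:? "id" .!= 0 -- a missing id falls back to 0
            <*> v .: "title"     -- title and bodyText are required
            <*> v .: "bodyText"

main :: IO ()
main = do
  -- Create request: no id, so the default 0 is used.
  print (decode "{\"title\":\"Hello\",\"bodyText\":\"First post\"}" :: Maybe Article)
  -- Update request: the id is taken from the JSON.
  print (decode "{\"id\":7,\"title\":\"Hello\",\"bodyText\":\"Edited\"}" :: Maybe Article)
  -- A required field is missing: decoding fails with Nothing.
  print (decode "{\"id\":7}" :: Maybe Article)
```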

Data presentation

Let's go back to Main.hs and see how we obtain the request parameters. A route parameter can be read with the param function:

 param :: Parsable a => TL.Text -> ActionM a

And the request body can be obtained by the body function:

 body :: ActionM Data.ByteString.Lazy.Internal.ByteString

Here is a function that reads the request body, parses it and returns a Maybe Article:

 getArticleParam :: ActionT TL.Text IO (Maybe Article)
 getArticleParam = do
   b <- body
   return $ (decode b :: Maybe Article)

The last thing left is to return data to the client. For this we define the following functions in the Views.hs file:

 articlesList :: [Article] -> ActionM ()
 articlesList articles = json articles

 viewArticle :: Maybe Article -> ActionM ()
 viewArticle Nothing = json ()
 viewArticle (Just article) = json article

 createdArticle :: Maybe Article -> ActionM ()
 createdArticle article = json ()

 updatedArticle :: Maybe Article -> ActionM ()
 updatedArticle article = json ()

 deletedArticle :: TL.Text -> ActionM ()
 deletedArticle id = json ()

Server performance

For the tests, I used a Samsung 700Z laptop with 8GB of memory and a quad-core Intel Core i7.

1000 consecutive PUT requests to create an article entry.
Average response time: 40 milliseconds, approximately 25 requests per second.

100 threads with 100 PUT requests each.
Average response time: 1248 milliseconds, approximately 80 parallel requests per second.

100 threads of 1000 GET requests returning 10 article entries.
Average response time: 165 milliseconds, approximately 600 requests per second.

Just to have something to compare against, I implemented exactly the same server in Java 7 with Spring 4 on the Tomcat 7 web server and got the following numbers.

1000 consecutive PUT requests to create an article entry.
Average response time: 51 milliseconds, approximately 19-20 requests per second.

100 threads with 100 PUT requests each.
Average response time: 104 milliseconds, approximately 960 parallel requests per second.

100 threads of 1000 GET requests returning 10 article entries.
Average response time: 26 milliseconds, approximately 3800 requests per second.
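The throughput figures in both sets of results are consistent with a simple rule of thumb: requests per second is roughly the number of concurrent client threads divided by the average response time. The sketch below only re-derives that arithmetic from the reported averages. It assumes this is how the per-second figures were obtained, which the measurements themselves do not state, and the function name throughput is hypothetical.

```haskell
module Main where

-- Approximate requests per second for a number of concurrent client
-- threads and an average response time in milliseconds.
throughput :: Double -> Double -> Double
throughput threads avgLatencyMs = threads * 1000 / avgLatencyMs

main :: IO ()
main = do
  -- Haskell server
  print (round (throughput 1 40) :: Int)     -- 25   (reported: 25)
  print (round (throughput 100 1248) :: Int) -- 80   (reported: about 80)
  print (round (throughput 100 165) :: Int)  -- 606  (reported: about 600)
  -- Java server
  print (round (throughput 1 51) :: Int)     -- 20   (reported: 19-20)
  print (round (throughput 100 104) :: Int)  -- 962  (reported: about 960)
  print (round (throughput 100 26) :: Int)   -- 3846 (reported: about 3800)
```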

Conclusions

If you lack practice in Haskell and want to try writing web applications in it, this article gives you an example of a simple server with CRUD operations for a single entity, Article. The application is implemented as a JSON REST service and requires Basic authentication on protected routes. MySQL is used for data storage, and a connection pool improves performance. Since the application keeps no session state, it is very easy to scale horizontally; besides, a stateless server is a good fit for a microservice architecture.

Using Haskell for a JSON REST server resulted in short and beautiful source code that is also easy to maintain: refactoring, changes and additions do not require much effort, because the compiler checks the correctness of every change. The downside of Haskell here is the lower performance of the resulting web service compared with the equivalent written in Java.

PS

Following advice from the comments, I ran additional tests. Changing the number of threads down to N = 8 inclusive does not affect performance; reducing N further lowers performance, because my laptop has 8 logical cores.

Another interesting observation: if saving the record to the database is disabled, the average response time of the Haskell service drops to just 6 milliseconds (!), while for the equivalent Java service it is 80 ms. That is, the bottleneck in this project is the interaction with the database; with it turned off, Haskell is about 13 times faster than the same functionality in Java. Memory consumption is also several times lower: approximately 80 MB versus 400 MB.

Source: https://habr.com/ru/post/257491/
