📜 ⬆️ ⬇️

Haskell WebSocket Server

Once I had nothing to do, I decided to write a WebSocket server, and after writing, I thought that it might be interesting to someone, what can laziness, functional cleanliness and other lambdas help here.

After reading how the server works in general, I sat down to write. The protocol, by the way, is very simple. The client sends random keys, the server in response confirms the connection, sending md5 from the concatenation of these keys. And then they send each other either binary or textual data, by and large nothing different.

Handshake


Open the draft and see the description of the handshake format:
field = 1*name-char colon [ space ] *any-char cr lf
colon = %x003A ; U+003A COLON (:)
space = %x0020 ; U+0020 SPACE
cr = %x000D ; U+000D CARRIAGE RETURN (CR)
lf = %x000A ; U+000A LINE FEED (LF)
name-char = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
; a Unicode character other than U+000A LINE FEED (LF), U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
any-char = %x0000-0009 / %x000B-000C / %x000E-10FFFF
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)

')
Well, so write it down:
field = ( many1 nameChar <& colon <& spaces ) <&> ( many anyChar <& cr <& lf ) where <br>
spaces = ignore ( many space ) [ () ] <br>
colon = lit ':' char<br>
space = lit ' ' char<br>
cr = lit '\r' char<br>
lf = lit '\n' char<br>
unicodeChar = optIf ( <= '\x10FFFF' ) char<br>
nameChar = optIf ( `notElem` ": \r\n" ) unicodeChar<br>
anyChar = optIf ( `notElem` "\r\n" ) unicodeChar<br>


Let me explain what is happening on the example of the first line.
field = ( many1 nameChar <& colon <& spaces ) <&> ( many anyChar <& cr <& lf ) where <br>
spaces = ignore ( many space ) [ () ] <br>


many1 describes a value that occurs 1 or more times, many - 0 or more. Operators &> and <& consistently connect two rules, while indicating that we are interested in the value of only one of them. In this case, the values ​​that will pass according to the rules of colon and spaces do not interest us. The <&> operator allows you to get both values ​​as a tuple.
The lit function sets a hard value to be met, and optIf imposes a restriction.

The message itself consists of a capital line, fields and data of a certain length, coming after the fields.
This is written no more difficult:
message = ( toMessage , fromMessage ) `wrap` ( leadingLine <&> many field ) where <br>
toMessage ( ll , fs ) = Message { <br>
messageLeadingLine = ll , <br>
messageFields = fs } <br>
fromMessage ( Message { messageLeadingLine = ll , messageFields = fs } ) = ( ll , fs ) <br>
<br>
body len = cr &> lf &> times len unicodeChar<br>
<br>
leadingLine = many anyChar <& cr <& lf<br>


With leadingLine and body everything is simple, but the wrap function appears in the message definition. The fact is that the rule a <&> b defines the rule for a tuple, and we need some type of our own. Therefore, we provide two functions for converting from and to a tuple.

All right, we learned to sort the abstract message with fields, now it is possible to look also in the direction of Opening (from the client) and Response (the answer of the server).
Opening must contain certain fields (some are optional), so we will wrap the message rule in optIf ; and also contain a body length of 8 bytes.
opening = ( toOpening , fromOpening ) `wrap` ( optIf hasFields message <&> body 8 ) where <br>

Functions toOpening , fromOpening I will not give.
With Response everything is exactly the same.

Frames


Suppose, with a handshake figured out, now is to take up the message.
In the same section of the draft, you can see the description of the frame format:
frames = *frame
frame = text-frame / binary-frame
text-frame = (%x00-7F) *(%x00-FE) %xFF
binary-frame = (%x80-FF) length < as many bytes as given by the length >
length = *(%x80-FF) (%x00-7F)

We will rewrite behind that only exception that we leave the same closing-frame :
frames = ( takeWhile ( not . isClosing ) , takeWhile ( not . isClosing ) ) `wrap` many frame<br>
frame = optIf isText textFrame <|> optIf isBinary binaryFrame <|> optIf isClosing closingFrame<br>

Operator <|> is an alternative. First applies the left rule, if unsuccessful, the right one.

Frames themselves:
textFrame = ( TextFrame , \ ( TextFrame s ) -> s ) `wrap` ( textFlag &> many frameChar <& frameFF ) where <br>
textFlag = ignore ( optIf ( <= 0x7F ) word8 ) 0x00 <br>
binaryFrame = ( BinaryFrame , \ ( BinaryFrame s ) -> s ) `wrap` ( binaryFlag &> byteSourceLength frameLength ) where <br>
binaryFlag = ignore ( optIf ( liftM2 ( && ) ( > 0x7F ) ( /= 0xFF ) ) word8 ) 0xF0 <br>
closingFrame = check ( 0xFF , 0x00 ) ( word8 <&> word8 ) ClosingFrame <br>

The ignore function ignores the associated value, and when writing, substitutes the value specified by the second argument. Those. when reading a textFrame we consider text frames as all frames whose flag is not more than 0x7F, however, when serializing a message, we always set 0.
byteSourceLength loads / saves a cloud of bytes, prefetching it with the number of these bytes, which [number] will be loaded / saved using the transmitted rule ( frameLength ).
The length in WebSocket has a variable size in bytes. The sign of the last byte is the unspecified high bit.
frameLength = ( \ ( hs , l ) -> toLength ( hs ++ [ l ] ) , ( init &&& last ) . fromLength ) `wrap` ( many highWord <&> lowWord ) where <br>

I will omit the definitions toLength , fromLength , highWord and lowWord .

Server


Now you can try to write something like a server.
start port onAccept = do <br>
sock <- S . socket S . AF_INET S . Stream S . defaultProtocol<br>
S . bindSocket sock $ S . SockAddrInet port S . iNADDR_ANY<br>
S . listen sock S . maxListenQueue<br>
let <br>
-- . ( ), <br>
-- "" . <br>
canDie e = if fromException e == Just ThreadKilled then throwIO ThreadKilled else return () <br>
-- . <br>
th <- fork $ forever $ canDie `handle` acceptClient sock onAccept<br>
return $ Server th<br>


Connection Waiting Function:
acceptClient socket onAccept = ignore $ accept socket onReceived where <br>

accept accepts the connection and passes the entire input stream to the onReceived function as a lazy string.

onReceived sock income = do <br>
-- . , anything ( ), <br>
-- , opening. <br>
( o , tailData ) <- letFail $ decode ( opening <&> anything ) income<br>
-- . <br>
r <- letFail ( responseTo o >>= mapException show . encode response ) <br>
-- . <br>
send sock r<br>
let con = connection ( openingChannel o ) ( openingHost o ) ( openingOrigin o ) ( openingProtocol o ) sock<br>
let <br>
-- . callback. <br>
onConnect ClosingFrame = close con `finally` acceptOnClose handlers con<br>
-- . <br>
onConnect f = acceptOnMessage handlers con f<br>
-- callback "". <br>
fork $ acceptOnOpen handlers con<br>
<br>
-- - , . <br>
switch ( const $ return () ) ( mapM_ onConnect ) $ decode frames tailData<br>


Working with lazy lists is convenient for understanding: there is a list of messages, we call the corresponding callback for everyone; but there is one nuance.

For example, we want to present all user input as a lazy ByteString .
If we write it like this:
input <- fix $ \ loop -> unsafeInterleaveIO $ liftM2 ( : ) getLine loop<br>
let byteString = pack $ map charToByte input<br>

Then trying to print a lazy ByteString , we can be very surprised at the lack of effect. The point is elementary in the strictness of the pack function; it needs the entire line at once.
In this case, it would be more correct to get a lazy list of all user inputs, and then use the fromChunks function. Then immediately as you type, our ByteString no longer be an empty promise, but will honestly contain part of the entire input.

Conclusion


Why did I write all this? Well, I hope someone has an additional interest in Haskell has awakened, or diminished, skepticism about the futility of functional priblud.

Source: https://habr.com/ru/post/108890/


All Articles