
Part 2/3. An ideal VM compiler for ICFPC 2009, in Haskell, with popularizing comments

The beginning is here (Part 1).

Now let's work out how to decode the entire file, not just a single instruction.

readMyFile = withBinaryFile "bin4.obf" ReadMode $ \h -> do
    len <- hFileSize h
    buf <- mallocBytes $ fromInteger len
    hGetBuf h buf $ fromInteger len
    return (len, buf)
This is an imperative piece, since it deals with files. withBinaryFile opens the file, runs the supplied "user" function, passing it the handle, then closes the file and returns whatever the user function returned. Here, after the $ sign, we wrote that user function with one parameter h (for "handle"). It reads the file size, allocates a buffer, reads into the buffer, and returns the buffer together with its length in bytes. Note that the user function has no name here and starts like this:
\h -> function_body
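As an aside, a hedged alternative sketch: the same read can be done without manual buffer management using Data.ByteString (the name readObf and this whole variant are my illustration, not the contest code):

```haskell
import qualified Data.ByteString as B

-- Hypothetical alternative to readMyFile: read the whole .obf file
-- as a ByteString and return its length together with the bytes.
readObf :: FilePath -> IO (Int, B.ByteString)
readObf path = do
    bytes <- B.readFile path     -- opens, reads and closes the file for us
    return (B.length bytes, bytes)
```

The pointer-based version above is closer to the C-style spec; ByteString merely trades the mallocBytes/hGetBuf bookkeeping for a managed buffer.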

Next, the function that extracts an instruction and a data word from the buffer. It receives the current pointer into the buffer (already advanced by the right number of bytes) and the instruction index. From the specification we remember the even/odd alternation: in odd frames the opcode starts at byte 0 and the data word (a floating-point value) at byte 4; in even frames the data word starts at byte 0 and the opcode at byte 8.

instruction :: Ptr a -> Int -> IO (Word32, Double)
instruction ptr i = do
    let (iaddr, daddr) = if i `mod` 2 /= 0 then (0, 4) else (8, 0) -- odd/even frame layout
    instr <- peek (plusPtr ptr iaddr) :: IO Word32
    dta <- peek (plusPtr ptr daddr) :: IO Double
    return (instr, dta)
Note that the "not equal" operator reads mathematically: /=.
Now let's "loop" over the whole buffer and return a list of (instruction, datum) pairs:
ast = mapM (\i -> instruction (plusPtr buf $ i * 12) i) [0 .. nframes - 1]
A lot happens here. First, the map function and its relatives (in particular mapM) work as follows: you give them a "user" function that transforms one list element, plus the list itself, and map applies that user function to every element, producing a new list of the function's results. The loop lives somewhere inside map (we won't go into details).
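To make the map/mapM distinction concrete, here is a minimal sketch (the names doubleAll and tenTimes are mine, purely for illustration):

```haskell
-- map applies a pure function to every element:
doubleAll :: [Int] -> [Int]
doubleAll = map (\h -> h * 2)

-- mapM does the same with a monadic action, collecting the results;
-- here the IO action is a trivial stand-in for instruction's peek.
tenTimes :: [Int] -> IO [Int]
tenTimes = mapM (\i -> return (i * 10))
```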

let q = map (\h -> h * 2) [1,2,3,4,5]

returns [2,4,6,8,10], a classic example. In our task the input is the list of indices (from zero to the end), and the output is the list of results of calling the instruction function on each index. It can be hard to grasp at first, but it is worth the effort: Python and Ruby have been moving in this direction too, along with a whole cohort of other languages.

ast = mapM (\i -> instruction (plusPtr buf $ i * 12) i) [0 .. nframes - 1]

Returning to our mapM: the function passed to it calls instruction, feeding it 0, 1, and so on as the second argument; as the first argument it passes the buffer pointer shifted 12 bytes further on each iteration (one 4-byte opcode + one 8-byte data word = 12 bytes). plusPtr builds a new pointer offset from the given one by the given number of bytes.

That's the whole loop. As a result we have a list of "flat" opcodes (flat because they are translated one-for-one from the specification):

[(Add 0 77 66, 0.0), (Sub 1 0 102, 0.0), ...]

Which means: add cells 77 and 66 and put the result in cell 0; then subtract the contents of cell 102 from the contents of cell 0 and put the result in cell 1; and so on. That is our code. As for the data: cell 0 holds 0.0, cell 1 holds 0.0, and so on.
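These flat opcodes imply an algebraic data type. The author never shows its declaration in this part, but from the pattern matches further below (consumesData, producesData, geneval) it must look roughly like this; this is a hedged reconstruction, with the field order in the comments and the Noop constructor being my guesses:

```haskell
-- Reconstructed flat-opcode type; constructor names match the
-- pattern matches used later in the article, fields are Int addresses.
data Op
    = Add Int Int Int          -- dest, src1, src2
    | Sub Int Int Int
    | Mul Int Int Int
    | Div Int Int Int
    | Output Int Int           -- port, src
    | Input Int Int            -- dest, port
    | If Int Int Int Int Int   -- cmp-op, cond-src, dest, src1, src2
    | Sqrt Int Int             -- dest, src
    | Copy Int Int             -- dest, src
    | Noop
    deriving (Show, Eq)
```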

At this point you could stop and generate non-ideal code in any language, even assembly (after all, the operations in the task specification are close to it). To do that you would add a to-string translation (into the target language) for each flat operation and emit the code in a loop. But that would be a literal execution, and therefore slow and tedious.

Why slow? Because there would be memory writes for results that are used only once, results that would really be better kept in registers or even on the stack (the C/Haskell/Java compiler, or whatever we feed our AST to, will handle that; for now "whatever" is Haskell). But if we write to a structure field on every sneeze, no compiler will lift us out of memory into registers; every one of those stays a memory write.

Therefore, let's analyze what categories our "memory" cells fall into.

1) constants: read-only, never written
2) real memory: read before being written, then written, then read again later
3) temporary variables: written once, then read several times
4) one-time variables: written once, then read once. These become parenthesized (inlined) expressions.

To understand who reads and who writes each memory cell, we describe the behavior of every instruction in the next section: some instructions only read memory (output from memory to a port), others only write it (input from a port to memory), and most do both (arithmetic, conditionals).

First we describe what each instruction reads (consumes):

consumesData (Add _ r1 r2) = [r1,r2]
consumesData (Sub _ r1 r2) = [r1,r2]
consumesData (Mul _ r1 r2) = [r1,r2]
consumesData (Div _ r1 r2) = [r1,r2]
consumesData (Output r1 r2) = [r2]
consumesData (If _ condr1 _ r1 r2) = [condr1,r1,r2]
consumesData (Sqrt _ r2) = [r2]
consumesData (Copy _ r2) = [r2]
consumesData _ = []
The underscore in the last case means "everything else", while in the earlier cases it means we don't care what stands in that argument position. As you can see, this is pattern matching again.

Similarly, we describe the behavior of each operation with regard to what it writes (produces):
producesData (Add addr _ _) = [addr]
producesData (Sub addr _ _) = [addr]
producesData (Mul addr _ _) = [addr]
producesData (Div addr _ _) = [addr]
producesData (If _ _ addr _ _) = [addr]
producesData (Input addr _) = [addr]
producesData (Copy addr _) = [addr]
producesData (Sqrt addr _) = [addr]
producesData _ = []
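Here is a self-contained miniature showing how these two functions work together (MiniOp and its two constructors are mine, a cut-down stand-in for the article's Op type):

```haskell
-- A two-constructor stand-in for the article's Op type.
data MiniOp = MAdd Int Int Int   -- dest, src1, src2
            | MOut Int Int       -- port, src
            deriving Show

consumes :: MiniOp -> [Int]
consumes (MAdd _ r1 r2) = [r1, r2]   -- arithmetic reads both sources
consumes (MOut _ r)     = [r]        -- output reads the cell it sends

produces :: MiniOp -> [Int]
produces (MAdd d _ _) = [d]          -- arithmetic writes its destination
produces _            = []           -- output writes no memory cell
```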
Next, what each instruction does with the ports, namely which ports it reads:
readsPort (Input _ port) = [port]
readsPort _ = []
And the counterpart: which ports it writes:
writesPort (Output r1 _) = [r1]
writesPort _ = []
Next, a shorthand (an author's quirk):
cmap = concatMap
What does concatMap do? The same thing as map, followed by concat. concat "flattens a list by one level". A small example:

concat ["hello", "africa"] = "helloafrica"
concat [[1,2,3,4],[5,6,7,8]] = [1,2,3,4,5,6,7,8]
concat [[]] = []

map (\i -> [1 .. i]) [1 .. 5] = [[1],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]
concatMap (\i -> [1 .. i]) [1 .. 5] = [1,1,2,1,2,3,1,2,3,4,1,2,3,4,5]


Now the scariest and biggest piece:
produceCode ast dta =
    let
    inports = cmap readsPort ast :: [Int]
    outports = cmap writesPort ast :: [Int]
Here we apply readsPort / writesPort to each opcode and concatenate (concat) all the lists into one. Thus we obtain the overall lists of input and output ports.
    -- consumes, produces, outputs-to-port [(memory/port ref, op address)] lookup tables
    consumes = cmap (\(d,a) -> consumesData d `zip` [a,a ..]) (ast `zip` [0 ..]) :: [(Int,Int)]
    produces = cmap (\(d,a) -> producesData d `zip` [a,a ..]) (ast `zip` [0 ..])
    outputsPort = cmap (\(d,a) -> writesPort d `zip` [a,a ..]) (ast `zip` [0 ..])
Here we built several lookup tables that can answer questions like these:

At what address is the instruction that writes to a given memory address? (A rather silly question, since by the definition of the problem that instruction sits exactly at address N, but I only realized this later.)
At what addresses are the instructions that read a given cell?
At what address is the instruction that writes to a given port?

In Haskell a primitive lookup table is a list of (key, value) pairs. Now we need functions that find all values for a given key and return them as a list. Here they are:

    -- op address that reads/writes given mem/port
    reads m = map snd $ filter ((m ==) . fst) consumes :: [Int]
    writes m = map snd $ filter ((m ==) . fst) produces :: [Int]
    outputs m = map snd $ filter ((m ==) . fst) outputsPort :: [Int]
They simply return the list of addresses of the instructions that read the given memory cell, write to it, or write to the given port. How does

reads m = map snd $ filter ((m ==) . fst) consumes

work? Literally: from the consumes table (computed earlier), keep everything whose first element equals the given m, then take the second elements. How does that happen?

The filter function takes a "user" function ((m ==) . fst) and a list. Each element of the list is fed to the user function, which returns True or False; depending on the result the element is or is not included in the output. What remains is a smaller list. That list becomes the second argument of map (recall the earlier discussion of the $ operator). map then applies snd to each element and collects the results.
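Here is the same filter-then-map pipeline on a toy lookup table (the table contents are invented, matching the flat program from the example above):

```haskell
-- (memory ref, op address) pairs, like the consumes table in the article:
-- op 0 reads cells 77 and 66; op 1 reads cells 0 and 102.
consumesTbl :: [(Int, Int)]
consumesTbl = [(77, 0), (66, 0), (0, 1), (102, 1)]

-- All op addresses that read cell m.
readersOf :: Int -> [Int]
readersOf m = map snd $ filter ((m ==) . fst) consumesTbl
```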

A few loose ends remain. The snd / fst functions return the second or first element of a pair, respectively; that is, they take a pair apart, or in plain terms bite off the second or first element:

fst (1,2) = 1
snd (1,2) = 2
fst ("Hello", 2+2) = "Hello"
snd ("Hello", 2+2) = 4

And now one of the interesting things in Haskell: function composition. Consider this example:

fx = sin (cos (sqrt x))   -- with nested parentheses, or equivalently:
fx = sin $ cos $ sqrt x

Now flex the brain and notice that x always passes through the same grinder of functions: first sqrt, then cos, then sin. Surely we could plot fx; and if we can plot it, then this sequence of three function applications corresponds to one single function giving the same result, which mathematicians simply never bothered to name! Forgive me, people with formal thinking.

Now suppose we have a Haskell function that plots any function passed to it, for example:

plot :: (Double -> Double) -> Double -> Double -> IO ()   -- IO () is like void: plot takes a function from Double to Double and two Doubles (the plotting range), and returns nothing
plot sin 0 6.28   -- plot sin from 0 to 6.28
plot cos 0 3.14   -- likewise for cos, on a different range

Here we pass the sin or cos function as such; its arguments will be supplied during plotting by whoever builds the plot. But how do we pass a whole grinder of functions, the one that never got a name, although we know how it is assembled?

We urgently need a way to describe such a meat grinder. Hence:

fun1 = sin                 -- just one function
fun2 = sin . cos . sqrt    -- the whole grinder, built with the composition dot
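This grinder is perfectly runnable; a quick check that the composed version and the nested-parentheses version agree (grind and grindNested are my names for them):

```haskell
-- Composition: (f . g) x == f (g x), chained as deep as you like.
grind :: Double -> Double
grind = sin . cos . sqrt

grindNested :: Double -> Double
grindNested x = sin (cos (sqrt x))
```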

Now about partial application. Suppose we have a function of two arguments, for example exponentiation (pow here is made up for illustration; real Haskell spells it (**)):

pow 2 3 = 8   -- two to the third power

From it we define pow2:

pow2 = pow 2

It turns out we supplied only one of the two required arguments, and nobody stopped or scolded us! So what is the type of pow2? Very simple: it wants one more Double argument, takes it, and calls pow with that argument; the two is already baked in!

pow2 4 = 16

It turns out the notation for Haskell types is not pulled out of thin air either. Watch the hands:

pow :: Double -> Double -> Double   -- takes a Double and a Double, returns a Double
pow2 :: Double -> Double            -- takes a Double, returns a Double

and since pow2 was obtained by supplying one argument to pow, we conclude that feeding one argument to pow produces a new function!

pow :: Double -> (Double -> Double)   -- the same type, parenthesized differently

So parentheses may be added on the right without changing the meaning. It also follows that:

pow 2 3 = (pow 2) 3; that is, we first produce a function and then hand it the missing argument (apply the function to the three).
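Since Haskell has no standard pow, here is the same trick with the real (**) operator, a hedged restatement of the example (powerOfTwo is my name):

```haskell
-- (**) :: Double -> Double -> Double; fixing the base with an operator
-- section gives a one-argument function, exactly like pow2 = pow 2.
powerOfTwo :: Double -> Double
powerOfTwo = (2 **)
```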

Now, back to our muttons.
reads m = map snd $ filter ((m ==) . fst) consumes
Inside the filter sits the function (m ==) . fst. Here we see both the dot (composition) and partial application: (m ==). This (m ==) is a function that takes one argument, places it to the right of the equality sign, and voila! It returns the result of comparing its argument with m. If m is 5, then (m ==) 5 returns True.

Now read everything from right to left: we have a meat grinder; whatever enters it is first fed to fst (which, as we know, extracts the first element of a pair), and then that value is fed to (m ==), which compares it with m and returns a Boolean. So a pair (a, b) goes into the grinder, and out comes a Boolean: whether its first element equals m.

Now we must collect all references to cells that are written or read, to have something to start from:

    -- all memory references (including temporaries etc - from plain program)
    alldata = nub $ sort $ cmap consumesData ast ++ (map recast $ cmap producesData ast) :: [Int]
Here (reading right to left) cmap producesData ast returns all written cells as a flat list of their addresses; the same happens with consumesData. The two lists are simply glued together (++), sorted, and stripped of duplicates (nub). In principle nub does not require a sorted list, but I didn't know that at the time:

nub [1,2,1] = [1,2]
nub [2,1,2] = [2,1]

    -- constants
    constants = filter (\m -> 0 == (length $ writes m)) alldata :: [Int]
Here the following happened: we selected from alldata all addresses to which nobody ever writes. Since they are only ever read, they are certainly constants. That gives us the first list.

    -- all persistent (optimized memory)
    persistents = filter (\m -> (head $ reads m) <= (head $ writes m)) (alldata \\ constants) :: [Int]
And here we selected all the variables that must be preserved between iterations. We took alldata, subtracted the constants (the \\ operation is list difference), and compared who reads first and who writes first. If a cell is read before the point where it is written, we assume something is already in there, left over from last time! This gives us the list of cells that really are the memory of our black box, readable and writable across iterations.

    -- temporaries which are reused parts of expressions
    lets = filter (\m -> 1 < (length $ reads m)) (alldata \\ constants) \\ persistents :: [Int]
Now, out of what remains (we keep subtracting everything already classified from alldata), we find the cells that are read more than once. Since they are not persistent (we excluded those), they must be temporary variables: computed first, then used several times. Temporary variables live within a single iteration and are not needed outside it.

    -- expressions to inline (variables that are used 1 time)
    onerefs = filter (\m -> (length $ reads m) == 1 && (length $ writes m) == 1)
              ((alldata \\ constants) \\ persistents) :: [Int]
And here we computed the variables that are written once and then read once. For these we allocate nothing at all: they turn into parenthesized sub-expressions. The authors of the black box needed this class of variables only because their language is so low-level; in our high-level output they will live in processor registers, and the compiler will take care of that without our involvement.
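The four filters can be checked on a toy read/write profile. Here readsOf/writesOf are hard-wired stand-ins for the article's reads/writes lookups, and the profile is invented: cell 0 is never written, cell 1 is read (at op 0) before it is written (at op 3), cell 2 is written once and read twice, cell 3 is written once and read once.

```haskell
import Data.List ((\\))

-- Which op addresses read / write each cell (toy data).
readsOf, writesOf :: Int -> [Int]
readsOf  c = maybe [] id (lookup c [(0,[5]), (1,[0]), (2,[2,4]), (3,[2])])
writesOf c = maybe [] id (lookup c [(0,[]),  (1,[3]), (2,[1]),   (3,[1])])

alldata :: [Int]
alldata = [0, 1, 2, 3]

-- The same four filters as in the article, over the toy profile.
constants, persistents, lets, onerefs :: [Int]
constants   = filter (\m -> null (writesOf m)) alldata
persistents = filter (\m -> head (readsOf m) <= head (writesOf m)) (alldata \\ constants)
lets        = filter (\m -> 1 < length (readsOf m)) ((alldata \\ constants) \\ persistents)
onerefs     = filter (\m -> length (readsOf m) == 1 && length (writesOf m) == 1)
                     ((alldata \\ constants) \\ persistents)
```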

Now we must go from flat operations (which merely list the addresses read and written in the course of the operation) to tree opcodes, the ones with the Exp suffix: they contain references to their own kind and thus form a tree. For this we describe a function that, given an address, generates the whole Op tree. It has to be recursive: if we give it the address of a constant, there is little to do; but if we give it an address that receives the sum of two other cells, we must generate an Op that adds two other Ops, and those in turn may be constants, or suddenly multiplications...

    -- generates reference to the expression identified by given address,
    -- this excludes address where to store it, only value is obtained
    ref :: DestAddr -> Op
    ref a
        | elem a constants = Const (dta !! a)
        | elem a onerefs = geneval a
        | elem a lets = ReadVarExp a
        | elem a persistents = ReadExp a
        | otherwise = trace "oops1" $ undefined
Here we have the ref function with guards imposed on its input parameter. The syntax of such conditions: a vertical bar, then a condition, then an equals sign followed by the function body used when the condition holds. It works like a big if ... elseif ... elseif ... else ... endif. The final else is written as trace "oops1" $ undefined: print a message (trace) and crash the program (undefined).
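Guards in miniature (the classify function is mine, just to show the syntax):

```haskell
-- Guard syntax: bar, condition, equals sign, body; tried top to bottom,
-- with otherwise as the catch-all else.
classify :: Int -> String
classify n
    | n < 0     = "negative"
    | n == 0    = "zero"
    | otherwise = "positive"
```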

Look: if the input address belongs to constants, we generate an access to a constant (just the Op that denotes a constant). Which constant is defined by the parameter of the Const constructor: simply the value of the memory at the given address. dta is the list containing all memory cells in order, and dta !! a accesses the a-th element of that list.

If the input address belongs to onerefs (the cell is written once, then read once), we just form a tree node from two other Ops (a parenthesized operation). This calls geneval (described below), which turns a simple Op ("add two memory cells, put the result in a third") into a compound one ("add two other Ops, return the result").

Similarly, for reading temporary variables (which are written once, read a lot), and for reading input ports.

The geneval function itself is described below; it uses pattern matching. Given an address, we pick the flat Op out of the ast list and convert it into a tree. Curiously, this function calls ref, described above, so the two functions are mutually recursive.

    -- turns plain code in tree code, converting memory refs to ops via "ref"
    geneval a = e $ ast !! a
    e (Add addr r1 r2) = AddExp (ref r1) (ref r2)
    e (Sub addr r1 r2) = SubExp (ref r1) (ref r2)
    e (Mul addr r1 r2) = MulExp (ref r1) (ref r2)
    e (Div addr r1 r2) = DivExp (ref r1) (ref r2)
    e (If cmdop cmdr addr r1 r2) = IfExp cmdop (ref cmdr) (ref r1) (ref r2)
    e (Input addr r1) = ReadPortExp r1
    e (Sqrt addr r1) = SqrtExp (ref r1)
    e (Copy addr r1) = ref r1
    e x = trace (show x) $ undefined
The Copy op simply means taking the value found at the source address. The Add/Sub/Mul/Div commands translate literally from flat to tree form. Input turns into a port read. Reads of temporary variables and of constants were already converted above, inside ref.
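The article never shows ast2haskell itself, so here is a hypothetical miniature of what such a tree-to-string renderer could look like (the Exp type here is cut down to four constructors, and render is my stand-in name; the real function surely differs):

```haskell
-- A cut-down expression tree and a renderer in the spirit of ast2haskell.
data Exp = Const Double          -- literal value
         | ReadExp Int           -- persistent cell, rendered as "(mNN x)"
         | AddExp Exp Exp
         | MulExp Exp Exp

render :: String -> Exp -> String
render _  (Const d)    = show d
render st (ReadExp a)  = "(m" ++ show a ++ " " ++ st ++ ")"
render st (AddExp a b) = "(" ++ render st a ++ "+" ++ render st b ++ ")"
render st (MulExp a b) = "(" ++ render st a ++ "*" ++ render st b ++ ")"
```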

Next we write the generation itself:
    retval = "module Vm where \n data VMState = VMState { " ++
        (intercalate "," $
            map (("m" ++) . show) persistents ++
            map (("o" ++) . show) outports
        ) ++ "::Double } \n " ++

This generates the part describing the state structure itself. Here (++) is obviously the concatenation operator for lists (strings); intercalate glues several lists together, interleaving them with the given separator (read: it inserts the separating commas). Into the structure we put all persistent cells with the prefix m (memory) and all output ports with the prefix o. At the end we give them the type Double, not forgetting the newline.

You could give them not the type Double but the type !Double; then the structure would have strict rather than lazy fields. Measurement shows (see the previous post) that when all output ports are evaluated the speed difference is more than 20-fold, with lazy being slower, of course. Whether to insert the strictness annotation (the exclamation mark) is left to the reader. Why did I say "all output ports"? Because with lazy evaluation you can take this structure after the N-th iteration and ask for just one required port. A large mountain of pending work then unwinds, and after a pause the computed value arrives (for example, when we write

putStrLn $ "o10=" ++ (show $ o10 vmlast)


then the following happens: "o10=" is printed to the console, then comes a pause, then the value appears. If the same putStrLn is executed again, there is no delay, because the value is already computed. If we then ask for o11, there is a delay again, but probably a smaller one: most likely o10 and o11 share some common part of the computation, which was already evaluated for o10 and will not be recomputed for o11.
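The lazy-versus-strict field difference can be observed directly: forcing a constructor with a strict field forces the field, while a lazy field stays an unevaluated thunk. LazyBox/StrictBox and blowsUp are illustrative names of my own:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A lazy field is stored as a thunk; a strict (!) field is forced
-- at the moment the constructor is applied.
data LazyBox   = LazyBox   { lval :: Double }
data StrictBox = StrictBox { sval :: !Double }

-- True if forcing the given action throws an exception.
blowsUp :: IO a -> IO Bool
blowsUp act = do
    r <- try (act >> return ()) :: IO (Either SomeException ())
    return (either (const True) (const False) r)
```

With an undefined field, evaluate (LazyBox undefined) succeeds (the thunk is never touched), while evaluate (StrictBox undefined) crashes immediately.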

        "initState = VMState {" ++
        (intercalate "," $
            map (\a -> "m" ++ (show a) ++ "=" ++ (show $ dta !! (recast a))) persistents ++
            map ((++ "=0") . ("o" ++) . show) outports
        ) ++ "} \n " ++
Here we generated the function creating the initial state of the black box. All memory cells are initialized with the values found at the corresponding places in the binary (dta). All output ports are initialized to zero. Haskell requires structures to be initialized in full.

        "nextState x (" ++ (intercalate "," $ map (("i" ++) . show) inports) ++ ")= \n let \n " ++
        cmap (\a -> " t" ++ show a ++ "=(" ++ (ast2haskell "x" $ geneval a) ++ ") :: Double \n ") lets ++
        " in x { \n " ++
        intercalate ", \n " (
            map (\a -> " m" ++ show a ++ "=" ++ (ast2haskell "x" $ geneval a)) persistents ++
            map (\a -> " o" ++ show a ++ "=" ++ (ast2haskell "x" $ geneval $ head $ outputs a)) outports)
        ++ "} \n "
Here the bulkiest part is generated. First a tuple of input parameters is formed, looking something like (i10,i11,i60000); then comes a let into which all the temporary (non-persistent) variables go, prefixed with t. Then the new structure is formed, with an expression written for each of its members: x {m20 = expr1, m25 = expr2, ..., o10 = exprN}. The expressions initializing the t's and m's refer to tNN, iNN and mNN x; the last one means the value of that memory cell from the previous iteration.

    in retval   -- return the generated text


The ending is in Part 3.

Source: https://habr.com/ru/post/70179/

