Part 3/3. Ideal VM compiler for ICFPC 2009, on Haskell, with popularizing comments

Ending. Previous parts: 1 and 2

What is left? We missed a place where two adjacent opcodes turn into one. What did we have? According to the specification, the following operations took place:

flag = m20 > 0
if (flag) m222 = m3 else m222 = m4

')
Exploring binary files for how these constructs are used, we realized that they walk in a pair of Cmp, then Phi. Surely, because it was compiled by the organizers from some of their original formulas and with some compiler. Well, if so, then we glue them a little together in one operation:

-- convert 2-op conditional operator to 1-op removePhi [] = [] removePhi ((Cmpz cond condr1) : (Phi addr r1 r2) : xs) = Noop addr : If cond condr1 addr r1 r2 : removePhi xs removePhi (x : xs) = x : removePhi xs 

Here the list is handled as follows: the input of the function is a list of opcodes, the output is a list in which this pair [Cmp, Phi] is replaced by [Noop, If]. Noop is inserted in order to leave the addresses of all commands unchanged (i.e. so that their number and position does not change, because their position in memory is one of their operands).

Here, we define that for an empty list and do nothing. For the list starting with our two required teams, we replace them with two new teams, and then process the rest of the list. And if the list starts with everything else (which corresponds to the third line, the rest is given the name "x"), then we leave this rest unchanged, and process the rest of the list "xs".

Now the main part of the program. Haskell Entry Point - main function

main = do (len,buf) <- readMyFile 

This is the file we read (see above)

let nframes = fromInteger $ len `div` 12 -- . -- opcode/data memory in tuples ids <- mapM ( \ i -> instruction (plusPtr buf $ i * 12 ) i) [ 0 .. nframes - 1 ]

Here, every i (from 0 to nframes-1) was put on the result - a pair (instructon, data), which was obtained by adding the corresponding offset (i * 12) to the pointer (buf) and interpreting the command at this address (call instruction with where it lies with the instruction number, for even-odd sampling).

-- code AST let code = removePhi $ map disasm $ zip (map fst ids) [ O.. ] 

Now we (read to the right), took from the list of pairs (ids) all (map) first (fst) elements (only instructions), made pairs (zip) with numbers from zero to as many as necessary ([0 ..] ), then for all these pairs (instruction, address) missed (map) via disasm, got a list of Op-s, and then removed Phi from them, making If instead of it (using removePhi).

-- print code putStrLn $ produceCode code (map snd ids) 

Further, having code and data (via map snd ids - take the second halves of the pairs "(instruction, data)", they called produceCode with two parameters and printed the result (putStrLn).

free buf

And this line just does not need comments.

Conclusion

This “compiler” is written at the level of a novice haskelist, of course. But even at this level, I became noticeably different from Java, where I write every day at work.

First of all, the absence of any “utility functions” and classes required in java for the implementation of all sorts of patterns there that would surely be used by any faithful javist.

Secondly, the distribution of time. I spent a large chunk of time researching work with Ptr - because I chose this method of converting 8 bytes to Double, and I didn’t do it before. Then, about two-thirds of the time was spent writing exactly the decoding of the file, the creation of the AST and its basic serialization into Haskell code (linear). And only a third of the time I actually spent on the algorithm for converting flat opcodes into a tree-like AST, and the algorithm turned out by itself.

In short, I experienced a feeling of deep satisfaction. Sorry, there was no time for the IFCPC.

Source: https://habr.com/ru/post/70185/

All Articles

Part 3/3. Ideal VM compiler for ICFPC 2009, on Haskell, with popularizing comments

More articles: