readMyFile = withBinaryFile "bin4.obf" ReadMode $ \h -> do<br>
    len <- hFileSize h<br>
    buf <- mallocBytes $ fromInteger len<br>
    hGetBuf h buf $ fromInteger len<br>
    return (len, buf)<br>
This piece is imperative, because it mostly deals with files. withBinaryFile opens the file, calls the specified “user-defined” function with the file handle, closes the file, and returns whatever the user-defined function returned. Here, after the $ sign, we wrote a “user-defined” function with one parameter h (for handle). This function gets the file size, allocates a buffer, reads the file into the buffer, and returns the length (in bytes) together with the buffer. Note that the “user-defined” function has no name here and is written like this:
\h -> function_body
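To make the idea of a nameless “user function” concrete, here is a minimal sketch. withFive is a made-up stand-in for withBinaryFile (an assumption for illustration, not a library function): it hands a value to whatever function it is given and returns that function's result.

```haskell
-- Hypothetical stand-in for withBinaryFile: instead of opening a file and
-- passing a handle, it simply passes the number 5 to the user function.
withFive :: (Int -> r) -> r
withFive userFunction = userFunction 5

-- A named user function...
addOne :: Int -> Int
addOne h = h + 1

viaNamed :: Int
viaNamed = withFive addOne

-- ...and the same thing written inline as a lambda, with no name at all.
viaLambda :: Int
viaLambda = withFive $ \h -> h + 1
```

Both viaNamed and viaLambda evaluate to the same number; the lambda merely saves inventing a name for a function that is used once.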
instruction :: Ptr a -> Int -> IO (Word32,Double)<br>
instruction ptr i = do<br>
    let (iaddr, daddr) = if i `mod` 2 /= 0 then (0, 4) else (8, 0) -- instruction/data offsets<br>
    instr <- peek (plusPtr ptr iaddr) :: IO Word32<br>
    dta <- peek (plusPtr ptr daddr) :: IO Double<br>
    return (instr, dta)<br>
Note that the "not equal" operator is written the mathematical way: "/=".
ast = mapM (\i -> instruction (plusPtr buf $ i * 12) i) [0 .. nframes - 1]<br>
A lot happens here. First, map and its relatives (in particular, mapM) work as follows: they are given a “user function” that converts one list item, together with the list itself; map applies this user function to each element of the list and builds a new list from the results. The loop is hidden somewhere inside map (we will not go into details).
let q = map (\h -> h * 2) [1,2,3,4,5]
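A small sketch of the difference between map and mapM, under the assumption that Maybe is used as the monad (the article itself uses IO). map applies a pure function, while mapM applies a function whose result lives in a monad and collects the results inside that monad. The name halveAll is made up for this illustration.

```haskell
-- map with a pure user function: every element is doubled.
doubled :: [Int]
doubled = map (\h -> h * 2) [1, 2, 3, 4, 5]

-- mapM with a user function that may fail (hypothetical example).
-- If any single element fails, the whole result is Nothing.
halveAll :: [Int] -> Maybe [Int]
halveAll = mapM (\x -> if even x then Just (x `div` 2) else Nothing)
```

With IO in place of Maybe the shape is the same: each call performs an action, and mapM returns one action that yields the list of all results.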
ast = mapM (\i -> instruction (plusPtr buf $ i * 12) i) [0 .. nframes - 1]<br>
[(Add 0 77 66, 0.0), (Sub 1 0 102, 0.0), ...]
consumesData (Add _ r1 r2) = [r1,r2]<br>
consumesData (Sub _ r1 r2) = [r1,r2]<br>
consumesData (Mul _ r1 r2) = [r1,r2]<br>
consumesData (Div _ r1 r2) = [r1,r2]<br>
consumesData (Output r1 r2) = [r2]<br>
consumesData (If _ condr1 _ r1 r2) = [condr1,r1,r2]<br>
consumesData (Sqrt _ r2) = [r2]<br>
consumesData (Copy _ r2) = [r2]<br>
consumesData _ = []<br>
The underscore in the last case means “everything else”, and in the earlier cases it means that we are not interested in the constructor argument at that position. As you can see, pattern matching is used here again.
producesData (Add addr _ _) = [addr]<br>
producesData (Sub addr _ _) = [addr]<br>
producesData (Mul addr _ _) = [addr]<br>
producesData (Div addr _ _ ) = [addr]<br>
producesData (If _ _ addr _ _) = [addr]<br>
producesData (Input addr _) = [addr]<br>
producesData (Copy addr _) = [addr]<br>
producesData (Sqrt addr _ ) = [addr]<br>
producesData _ = []<br>
Next we describe what each instruction does with respect to the ports: which ports it reads.
readsPort (Input _ port) = [port]<br>
readsPort _ = []<br>
And for writing, which ports it writes:
writesPort (Output r1 _) = [r1]<br>
writesPort _ = []<br>
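The four helpers above can be tried on a cut-down instruction type. The Op type below is a hypothetical subset written only for this sketch (the field meanings are inferred from how the helpers use them, and Noop is an invented constructor standing in for “everything else”).

```haskell
-- Hypothetical, cut-down instruction type: just enough constructors to
-- exercise the four helpers. The type used in the article has more.
data Op
  = Add Int Int Int   -- destination, source 1, source 2
  | Copy Int Int      -- destination, source
  | Input Int Int     -- destination, port
  | Output Int Int    -- port, source
  | Noop
  deriving (Show, Eq)

consumesData, producesData, readsPort, writesPort :: Op -> [Int]

consumesData (Add _ r1 r2) = [r1, r2]
consumesData (Copy _ r2)   = [r2]
consumesData (Output _ r2) = [r2]
consumesData _             = []

producesData (Add addr _ _) = [addr]
producesData (Copy addr _)  = [addr]
producesData (Input addr _) = [addr]
producesData _              = []

readsPort (Input _ port) = [port]
readsPort _              = []

writesPort (Output port _) = [port]
writesPort _               = []

-- A made-up four-instruction program.
sample :: [Op]
sample = [Input 0 16000, Add 1 0 0, Output 10 1, Noop]
```

Evaluating map consumesData sample in GHCi gives one list per instruction; the empty lists come from the catch-all underscore case.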
Next, a shorter name (the author's whim):
cmap = concatMap<br>
What does concatMap do? It does the same thing as map and then applies concat to the result. concat "flattens a list by one level". A small example:
concat [ "hello" , "africa" ] = "helloafrica" <br>
concat [[ 1 , 2 , 3 , 4 ],[ 5 , 6 , 7 , 8 ]] = [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ]<br>
concat [[]] = []<br>
<br>
map ( \ i -> [ 1 .. i]) [ 1 .. 5 ] = [[ 1 ],[ 1 , 2 ],[ 1 , 2 , 3 ],[ 1 , 2 , 3 , 4 ],[ 1 , 2 , 3 , 4 , 5 ]]<br>
concatMap ( \ i -> [ 1 .. i]) [ 1 .. 5 ] = [ 1 , 1 , 2 , 1 , 2 , 3 , 1 , 2 , 3 , 4 , 1 , 2 , 3 , 4 , 5 ]<br>
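One consequence worth seeing in isolation: when the user function returns an empty list for some elements, concatMap silently drops them, so it maps and filters in one pass. This is how helpers such as readsPort, which return either a one-element list or [], are meant to be used. The function halfOfEven is invented for this sketch.

```haskell
-- Hypothetical user function: yields one pair for an even number and
-- nothing at all for an odd one.
halfOfEven :: Int -> [(Int, Int)]
halfOfEven n = if even n then [(n, n `div` 2)] else []

-- map keeps one (possibly empty) list per input element...
nested :: [[(Int, Int)]]
nested = map halfOfEven [1 .. 6]

-- ...while concatMap flattens them, so the empty lists disappear.
flat :: [(Int, Int)]
flat = concatMap halfOfEven [1 .. 6]
```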
produceCode ast dta = <br>
let <br>
inports = cmap readsPort ast :: [Int]<br>
outports = cmap writesPort ast :: [Int]<br>
Here we apply readsPort / writesPort to each opcode and concatenate all the resulting lists into one. This gives us the overall lists of input and output ports.
-- consumes, produces, outputs to port: [(memory/port ref, op address)] lookup tables<br>
consumes = cmap (\(d,a) -> consumesData d `zip` [a,a ..]) (ast `zip` [0 ..]) :: [(Int,Int)]<br>
produces = cmap (\(d,a) -> producesData d `zip` [a,a ..]) (ast `zip` [0 ..])<br>
outputsPort = cmap (\(d,a) -> writesPort d `zip` [a,a ..]) (ast `zip` [0 ..])<br>
Here we built several lookup tables that can answer the following questions:
-- op addresses that read/write a given mem/port<br>
reads m = map snd $ filter ((m == ) . fst) consumes :: [Int]<br>
writes m = map snd $ filter ((m == ) . fst) (produces) :: [Int]<br>
outputs m = map snd $ filter ((m == ) . fst) outputsPort :: [Int]<br>
They simply return the list of addresses of the operations that read the specified memory location, write it, or write the specified port. Here is how it works:
reads m = map snd $ filter ((m ==) . fst) consumes<br>
This literally means: from the consumes table (computed earlier), keep the pairs whose first element equals the given m, then take their second elements. How does it work? fst and snd return the first and the second element of a pair:
fst (1,2) = 1
snd (1,2) = 2
fst ("Hello", 2+2) = "Hello"
snd ("Hello", 2+2) = 4
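With fst and snd in hand, the lookup can be spelled out using only a lambda and ordinary parentheses. This is a sketch on a made-up table; the function is called readsOf rather than reads because a top-level reads would clash with the Prelude function of the same name.

```haskell
-- Hypothetical lookup table of (memory address, op address) pairs.
consumes :: [(Int, Int)]
consumes = [(5, 0), (7, 0), (5, 1), (9, 2)]

-- Same meaning as the article's reads: keep the pairs whose first
-- element is m, then return their second elements.
readsOf :: Int -> [Int]
readsOf m = map snd (filter (\pair -> fst pair == m) consumes)
```

Address 5 appears twice in the table, so readsOf 5 returns two op addresses, while an address that is absent from the table gives an empty list.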
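Next, the $ sign. It applies the function on its left to everything on its right, which saves parentheses. Instead of f x = sin (cos (sqrt (x))) one can write:
f x = sin $ cos $ sqrt x
The dot is function composition. Suppose we have a plotting function:
plot :: (Double -> Double) -> Double -> Double -> IO () (IO () is like void; that is, plot takes a function from Double to Double and two more Doubles, and returns void)
plot sin 0 6.28 -- plot sin from 0 to 6.28
plot cos 0 3.14
fun1 = sin -- a single function
fun2 = sin . cos . sqrt -- a composition, which is itself a function from Double to Double
Finally, partial application. Suppose there is a power function:
pow 2 3 = 8
Then we can define pow2:
pow2 = pow 2
pow2 4 = 16
It turns out that the notation for Haskell types is not arbitrary either. Watch closely:
pow :: Double -> Double -> Double -- takes a Double and a Double, returns a Double
pow2 :: Double -> Double -- takes a Double, returns a Double
pow :: Double -> (Double -> Double) -- takes a Double, returns a function from Double to Double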
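The three tools just described ($, composition and partial application) can be tried together in one small file. This is a sketch: pow is defined here through the standard ** operator purely for illustration, and the names withParens, withDollar, withDot, isFive and firstIsFive are invented.

```haskell
-- Hypothetical power function used only for this illustration.
pow :: Double -> Double -> Double
pow base ex = base ** ex

-- Partial application: fix the first argument, get a one-argument function.
pow2 :: Double -> Double
pow2 = pow 2

-- The same computation written three ways.
withParens, withDollar, withDot :: Double -> Double
withParens x = sin (cos (sqrt x))
withDollar x = sin $ cos $ sqrt x
withDot      = sin . cos . sqrt

-- An operator section is partial application of an operator.
isFive :: Int -> Bool
isFive = (5 ==)

-- Composition with a section, the same shape as the predicate (m ==) . fst.
firstIsFive :: (Int, Int) -> Bool
firstIsFive = (5 ==) . fst
```

Note that withDot has no argument on the left-hand side at all: the composed function is simply given a name.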
reads m = map snd $ filter ((m == ) . fst) consumes<br>
The argument to filter is a function: (m ==) . fst. Here we see both the dot (composition) and partial application: (m ==). This (m ==) is a function that takes one argument and puts it to the right of the equality sign, and voila: it returns the result of comparing its argument with m. If m is 5, then ((m ==) 5) returns True.
-- all memory references (including temporaries etc. from the plain program)<br>
alldata = nub $ sort $ cmap consumesData ast ++ (map recast $ cmap producesData ast) :: [Int]<br>
Here (reading from right to left) "cmap producesData ast" returns a flat list of the addresses of all cells that are written. The same happens with consumesData. Both lists are then simply glued together (++), sorted, and stripped of duplicates (nub). Strictly speaking, nub does not need a sorted list, but I did not know that at the time:
nub [1,2,1] = [1,2]
nub [2,1,2] = [2,1]
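The effect of sorting before nub is easy to check on a small list. The sketch below also shows an alternative for sorted input, collapsing runs of equal neighbours with group, which is not what the article uses. The alternative can matter on long lists, since Data.List's nub compares every element against all those already kept.

```haskell
import Data.List (group, nub, sort)

sample :: [Int]
sample = [3, 1, 3, 2, 1]

-- nub alone keeps the first occurrence of each element, in original order.
nubOnly :: [Int]
nubOnly = nub sample

-- Sorting first gives the same set of elements in ascending order.
sortedNub :: [Int]
sortedNub = nub $ sort sample

-- Alternative for sorted input: keep one element from each run of equals.
viaGroup :: [Int]
viaGroup = concatMap (take 1) $ group $ sort sample
```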
-- constants <br>
constants = filter (\m -> 0 == (length $ writes m)) alldata :: [Int]<br>
Here is what happened: from alldata we selected all addresses that nobody ever writes to. Since they are only ever read, they must be constants. So we made a list of them.
-- all persistent (optimized memory)<br>
persistents = filter ( \ m -> (head $ reads m) <= (head $ writes m)) (alldata \\ constants) :: [Int]<br>
And here we selected all the variables that must be preserved between iterations. To do this, we took alldata, subtracted the constants from it (the \\ operator is list difference), and compared who reads first and who writes first. If a cell is read before the place where it is written, then something must already be there: it is left over from the previous iteration. In this way we obtained the list of cells that are the actual memory of our black box, available for both reading and writing.
-- temporaries which are reused parts of expressions<br>
lets = filter ( \ m -> 1 < (length $ reads m)) (alldata \\ constants) \\ persistents :: [Int]<br>
Now, from what remains (everything we have already classified is subtracted from alldata), we pick the cells that are read more than once. Since these are not persistent data (we have excluded those), they must be temporary variables that are computed once and then used several times. Temporary variables are used within a single iteration and are not needed outside it.
-- expressions to inline (variables that are used 1 time)<br>
onerefs = filter ( \ m -> (length $ reads m) == 1 && (length $ writes m) == 1 ) <br>
((alldata \\ constants) \\ persistents) :: [Int]<br>
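The four filters above (constants, persistents, lets, onerefs) can be tried on a toy program. Everything in this sketch is invented for illustration: the program, the addresses, and the names readsOf and writesOf (renamed because a top-level reads would clash with the Prelude). The persistents test is also written without head, an intentional deviation so that an empty list gives False instead of a crash.

```haskell
import Data.List (nub, sort, (\\))

-- Hypothetical toy program: op number N is the N-th entry, written as
-- (destination address, source addresses).
toyOps :: [(Int, [Int])]
toyOps =
  [ (10, [1, 2])    -- op 0
  , (11, [10, 1])   -- op 1
  , (12, [10, 11])  -- op 2
  , (2,  [12])      -- op 3
  ]

-- Lookup tables of (memory address, op address), as in the article.
consumes, produces :: [(Int, Int)]
consumes = concatMap (\((_, srcs), a) -> srcs `zip` [a, a ..]) (toyOps `zip` [0 ..])
produces = map (\((dst, _), a) -> (dst, a)) (toyOps `zip` [0 ..])

readsOf, writesOf :: Int -> [Int]
readsOf m  = map snd $ filter ((m ==) . fst) consumes
writesOf m = map snd $ filter ((m ==) . fst) produces

alldata, constants, persistents, lets, onerefs :: [Int]
alldata   = nub $ sort $ map fst consumes ++ map fst produces
constants = filter (null . writesOf) alldata

-- Read no later than first written, so a value must survive from the
-- previous iteration.
persistents = filter readFirst (alldata \\ constants)
  where
    readFirst m = case (readsOf m, writesOf m) of
      (r : _, w : _) -> r <= w
      _              -> False

lets    = filter (\m -> length (readsOf m) > 1) (alldata \\ constants) \\ persistents
onerefs = filter (\m -> length (readsOf m) == 1 && length (writesOf m) == 1)
                 ((alldata \\ constants) \\ persistents)
```

In this toy program address 1 is never written (a constant), address 2 is read by op 0 before op 3 writes it (persistent), address 10 is read by two ops (a let), and addresses 11 and 12 are each written once and read once (inlined).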
onerefs collects the variables that are written once and then read once. We will not create anything for them at all: they will turn into parenthesized expressions. The authors of the black box needed this class of variables because their language is so low-level; in our high-level result they will live in processor registers, and the compiler will take care of that without our involvement.
-- generates a reference to the expression identified by the given address,<br>
-- this excludes address where to store it, only value is obtained <br>
ref :: DestAddr -> Op<br>
ref a<br>
    | elem a constants = Const (dta !! a)<br>
    | elem a onerefs = geneval a<br>
    | elem a lets = ReadVarExp a<br>
    | elem a persistents = ReadExp a<br>
    | otherwise = trace "oops1" $ undefined<br>
Here we have the ref function with conditions (guards) imposed on its input parameter. The syntax is a vertical bar, then the condition, then an equals sign, followed by the function body to use if the condition is true. It is like one big if ... elseif ... elseif ... else ... endif. The final else is written as trace "oops1" $ undefined: it prints a message (trace) and stops the program (undefined).
-- turns plain code into tree code, converting memory refs to ops via "ref"<br>
geneval a = e $ ast !! a<br>
e (Add addr r1 r2) = AddExp (ref r1) (ref r2)<br>
e (Sub addr r1 r2) = SubExp (ref r1) (ref r2)<br>
e (Mul addr r1 r2) = MulExp (ref r1) (ref r2)<br>
e (Div addr r1 r2) = DivExp (ref r1) (ref r2)<br>
e (If cmdop cmdr addr r1 r2) = IfExp cmdop (ref cmdr) (ref r1) (ref r2)<br>
e (Input addr r1) = ReadPortExp r1<br>
e (Sqrt addr r1) = SqrtExp (ref r1)<br>
e (Copy addr r1) = ref r1<br>
e x = trace (show x) $ undefined<br>
The Copy operation simply means taking the result from the source address. The Add / Sub / Mul / Div commands are converted literally from flat form to tree form. Input is converted to a read from the port. Reads from temporary variables or from constants are converted in ref above.
retval = "module Vm where \n data VMState = VMState { " ++ <br>
(intercalate "," $ <br>
map (( "m" ++ ) . show) persistents ++ <br>
map (( "o" ++ ) . show) outports<br>
) ++ "::Double } \n " ++ <br>
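This generates the declaration of a structure that holds all persistent memory cells and output ports. In the generated program its fields are read by name, for example:
putStrLn $ "o10=" ++ (show $ o10 vmlast)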
"initState = VMState {" ++ <br>
(intercalate "," $ <br>
map ( \ a -> "m" ++ (show a) ++ "=" ++ (show $ dta !! (recast a))) persistents ++ <br>
map (( ++ "=0" ) . ( "o" ++ ) . show) outports<br>
) ++ "} \n " ++ <br>
This is where the code for the initial state of the black box is generated. All memory cells are initialized with the values found at the corresponding places in the binary (dta). All output ports are initialized with zeros. Haskell expects all fields of a structure to be initialized.
"nextState x (" ++ (intercalate "," $ map (( "i" ++ ) . show) inports) ++ ")= \n let \n " ++ <br>
cmap ( \ a -> " t" ++ show a ++ "=(" ++ (ast2haskell "x" $ geneval a) ++ ") :: Double \n " ) lets ++ <br>
" in x { \n " ++ <br>
intercalate ", \n " (<br>
map ( \ a -> " m" ++ show a ++ "=" ++ (ast2haskell "x" $ geneval a)) persistents ++ <br>
map ( \ a -> " o" ++ show a ++ "=" ++ (ast2haskell "x" $ geneval $ head $ outputs a)) outports )<br>
++ "} \n " ;<br>
This generates the bulkiest part. First, a tuple of input parameters is formed, looking like (i10, i11, i60000). Then comes a “let” that holds all temporary (non-persistent) variables, with the prefix “t”. Then a new structure is built, with an expression for each of its members: x {m20 = expr1, m25 = expr2, ..., o10 = exprN}. The expressions that initialize t and m refer to “tNN”, “iNN” and “mNN x”; the last of these means the memory value from the previous iteration.
in retval
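For orientation, here is a hand-written sketch of roughly what the generated module could look like for a tiny program with one memory cell (m20), one temporary (t7), two input ports (i10, i11) and one output port (o10). All names, values and expressions are invented for illustration. The deriving clause is an addition so that states can be printed and compared, and the "module Vm where" header is left out so the snippet can be loaded on its own.

```haskell
-- Hypothetical generator output for a tiny program.
data VMState = VMState { m20, o10 :: Double } deriving (Show, Eq)

-- Memory takes its initial value from the binary; output ports start at 0.
initState :: VMState
initState = VMState { m20 = 1.5, o10 = 0 }

-- One iteration: temporaries go into the let, then every field of the
-- state gets its new expression. "m20 x" reads the previous iteration.
nextState :: VMState -> (Double, Double) -> VMState
nextState x (i10, i11) =
  let
    t7 = (m20 x + i10) :: Double
  in x { m20 = t7 * i11
       , o10 = t7 }
```

Running several iterations is then a fold over the inputs, for example foldl nextState initState [(0.5, 2), (1, 1)].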
Source: https://habr.com/ru/post/70179/