📜 ⬆️ ⬇️

Creating a system of formal verification from scratch. Part 1: symbolic virtual machine in PHP and Python

Formal verification is the verification of one program or algorithm with the help of another.

This is one of the most powerful methods that allows you to find in the program all the vulnerabilities or to prove that they are not.

A more detailed description of the formal verification can be seen in the example of solving the problem of the Wolf, Goat, and cabbage in my previous article.
')
In this article, I turn from the formal verification of tasks to programs, and describe
how can you convert them to formal rules systems automatically.

For this, I wrote my analogue of a virtual machine, on symbolic principles.

She parses the program code and translates it into the system of equations (SMT), which can already be solved programmatically.

Since the information on symbolic calculations is presented on the Internet rather fragmentary,
I will briefly describe what it is.

Symbolic calculations are a way to simultaneously run a program on a wide range of data and are the main tool for formal verification of programs.

For example, we can specify the input conditions where the first argument can take any positive values, the second negative, the third - zero, and the output argument, for example, 42.

Symbolic calculations for one run will give us an answer, is it possible to get the desired result and an example of a set of such input parameters. Or proof that there are no such parameters.

Moreover, we can set the input arguments in general as all possible, and choose only the output, for example the administrator password.

In this case, we will find all the vulnerabilities of the program, or we will obtain evidence that the admin password is secure.

You may notice that the classical execution of a program with specific inputs is a special case of a symbolic one.

Therefore, my character VM can work in the emulation mode of a standard virtual machine.

In the comments to the previous article, one can find a fair criticism of formal verification with a discussion of its weak points.

The main problems are as follows:

  1. Combinatorial explosion, since formal verification ultimately rests on P = NP
  2. Call processing to the file system, networks and other external storages is more difficult to verify
  3. Bugs in the specification, when the customer or the programmer has conceived one thing, but he did not accurately describe it in the TK.

As a result, the program will be verified and conform to the specification, but will not do what the creators expected from it.

Since in this article I consider mainly the application of formal verification in practice, I will not beat my forehead against the wall yet, and choose such a system where the algorithmic complexity and the number of external calls are minimal.

Since smart contracts are best suited to these requirements, the choice fell on RIDE contracts from the Waves platform: they are not Turing-complete, and their maximum complexity is artificially limited.

But we will consider them exclusively from the technical side.

In addition, for any contracts, formal verification will be especially needed: it is usually impossible to correct a contract error after its launch.
And the price of such mistakes is very high, since quite large sums of money are often stored on smart contracts.

My symbolic virtual machine is written in PHP and Python, and uses Z3Prover from Microsoft Research to solve the resulting SMT formulas.

Its core has a powerful multi-transactional search, which
allows you to find solutions or vulnerabilities, even if it requires a lot of transactions.
Even Mythril , one of the most powerful character frameworks for searching for Ethereum vulnerabilities, added this feature only a few months ago.

But it is worth noting that the contracts of the ether is more complicated and have Turing completeness.

PHP translates the source code of the RIDE smart contract into a python script, in which the program is presented as a Z3 SMT compatible system of contract state and conditions for their transitions:



Now I will describe what is happening inside, in more detail.

But first, a few words about the language of smart contracts RIDE.

It is a functional and expression-based programming language, lazy in its concept.
RIDE runs in isolation inside the blockchain, and can extract and record information from the vault linked to the user's wallet.

You can attach a RIDE contract to each wallet, and the result of execution will be only TRUE or FALSE.

TRUE means that the smart contract allows the transaction, and FALSE denies it.
A simple example: a script can prohibit a transfer, if the wallet has a balance less than 100.

As an example, I will take the same Wolf, Goat, and Cabbage, but already presented in the form of a smart contract.

The user will not be able to withdraw money from the wallet on which the contract is deployed, until he sends everyone to the other bank.

#      let contract = tx.sender let human= extract(getInteger(contract,"human")) let wolf= extract(getInteger(contract,"wolf")) let goat= extract(getInteger(contract,"goat")) let cabbage= extract(getInteger(contract,"cabbage")) #   -,      4 . #           . match tx { case t:DataTransaction => #       let newHuman= extract(getInteger(t.data,"human")) let newWolf= extract(getInteger(t.data,"wolf")) let newGoat= extract(getInteger(t.data,"goat")) let newCabbage= extract(getInteger(t.data,"cabbage")) #0 ,     ,  1    let humanSide= human == 0 || human == 1 let wolfSide= wolf == 0 || wolf == 1 let goatSide= goat == 0 || goat == 1 let cabbageSide= cabbage == 0 || cabbage == 1 let side= humanSide && wolfSide && goatSide && cabbageSide #    ,        . let safeAlone= newGoat != newWolf && newGoat != newCabbage let safe= safeAlone || newGoat == newHuman let humanTravel= human != newHuman #     ,  -   . let t1= humanTravel && newWolf == wolf + 1 && newGoat == goat && newCabbage == cabbage let t2= humanTravel && newWolf == wolf && newGoat == goat + 1 && newCabbage == cabbage let t3= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage + 1 let t4= humanTravel && newWolf == wolf - 1 && newGoat == goat && newCabbage == cabbage let t5= humanTravel && newWolf == wolf && newGoat == goat - 1 && newCabbage == cabbage let t6= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage - 1 let t7= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage let objectTravel = t1 || t2 || t3 || t4 || t5 || t6 || t7 #        . #     1  0,    #  ,        #  -    side && safe && humanTravel && objectTravel case s:TransferTransaction => #             human == 1 && wolf == 1 && goat == 1 && cabbage == 1 #     case _ => false } 

PHP first of all extracts all variables from the smart contract in the form of their keys and the corresponding logical expression variable.

 cabbage: extract ( getInteger ( contract , "cabbage" ) ) goat: extract ( getInteger ( contract , "goat" ) ) human: extract ( getInteger ( contract , "human" ) ) wolf: extract ( getInteger ( contract , "wolf" ) ) fState: human== 1 && wolf== 1 && goat== 1 && cabbage== 1 fState: wolf: goat: cabbage: cabbageSide: cabbage== 0 || cabbage== 1 human: extract ( getInteger ( contract , "human" ) ) newGoat: extract ( getInteger ( t.data , "goat" ) ) newHuman: extract ( getInteger ( t.data , "human" ) ) goatSide: goat== 0 || goat== 1 humanSide: human== 0 || human== 1 t7: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage t3: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage + 1 t6: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage - 1 t2: humanTravel && newWolf== wolf && newGoat== goat + 1 && newCabbage== cabbage t5: humanTravel && newWolf== wolf && newGoat== goat - 1 && newCabbage== cabbage t1: humanTravel && newWolf== wolf + 1 && newGoat== goat && newCabbage== cabbage t4: humanTravel && newWolf== wolf - 1 && newGoat== goat && newCabbage== cabbage safeAlone: newGoat != newWolf && newGoat != newCabbage wolfSide: wolf== 0 || wolf== 1 humanTravel: human != newHuman side: humanSide && wolfSide && goatSide && cabbageSide safe: safeAlone || newGoat== newHuman objectTravel: t1 || t2 || t3 || t4 || t5 || t6 || t7 

Then PHP converts them into a compatible with the Z3Prover SMT system description on python.
The data is wrapped in a loop, where storage variables get index i, transaction variables index i + 1, and variables with expressions define the rules for transition from the previous state to the next.

This is the very heart of our virtual machine, which provides a multi-transaction search engine.

 fState: And( And( And( human[Steps] == 1 , wolf[Steps] == 1 ) , goat[Steps] == 1 ) , cabbage[Steps] == 1 ) final: fState[Steps] fState: wolf: goat: cabbage: cabbageSide: Or( cabbage[i] == 0 , cabbage[i] == 1 ) goatSide: Or( goat[i] == 0 , goat[i] == 1 ) humanSide: Or( human[i] == 0 , human[i] == 1 ) t7: And( And( And( humanTravel[i] , wolf == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage == cabbage[i] ) t3: And( And( And( humanTravel[i] , wolf == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage == cabbage[i] + 1 ) t6: And( And( And( humanTravel[i] , wolf == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage == cabbage[i] - 1 ) t2: And( And( And( humanTravel[i] , wolf == wolf[i] ) , goat[i+1] == goat[i] + 1 ) , cabbage == cabbage[i] ) t5: And( And( And( humanTravel[i] , wolf == wolf[i] ) , goat[i+1] == goat[i] - 1 ) , cabbage == cabbage[i] ) t1: And( And( And( humanTravel[i] , wolf == wolf[i] + 1 ) , goat[i+1] == goat[i] ) , cabbage == cabbage[i] ) t4: And( And( And( humanTravel[i] , wolf == wolf[i] - 1 ) , goat[i+1] == goat[i] ) , cabbage == cabbage[i] ) safeAlone: And( goat[i+1] != wolf , goat[i+1] != cabbage ) wolfSide: Or( wolf[i] == 0 , wolf[i] == 1 ) humanTravel: human[i] != human[i+1] side: And( And( And( humanSide[i] , wolfSide[i] ) , goatSide[i] ) , cabbageSide[i] ) safe: Or( safeAlone[i] , goat[i+1] == human[i+1] ) objectTravel: Or( Or( Or( Or( Or( Or( t1[i] , t2[i] ) , t3[i] ) , t4[i] ) , t5[i] ) , t6[i] ) , t7[i] ) data: And( And( And( side[i] , safe[i] ) , humanTravel[i] ) , objectTravel[i] ) 

Conditions are sorted and inserted into a script template intended for describing an SMT system in python.

Blank template
 import json from z3 import * s = Solver() Steps=7 Num= Steps+1 $code$ #template, only start rest s.add(data + start) #template s.add(final) ind = 0 f = open("/var/www/html/all/bin/python/log.txt", "a") while s.check() == sat: ind = ind +1 print ind m = s.model() print m print "traversing model..." #for d in m.decls(): #print "%s = %s" % (d.name(), m[d]) f.write(str(m)) f.write("\n\n") exit() #s.add(Or(goat[0] != s.model()[data[0]] )) # prevent next model from using the same assignment as a previous model print "Total solution number: " print ind f.close() 


For the last state from the entire chain, the rules that are specified in the transfer transaction section apply.

So, Z3Prover will search for exactly such sets of states, which in the end will allow to withdraw funds from the contract.

As a result, we automatically get a fully functional SMT model of our contract.
You may notice that it is very similar to the model from my previous article, which I wrote up manually.

Filled template
 import json from z3 import * s = Solver() Steps=7 Num= Steps+1 human = [ Int('human_%i' % (i + 1)) for i in range(Num) ] wolf = [ Int('wolf_%i' % (i + 1)) for i in range(Num) ] goat = [ Int('goat_%i' % (i + 1)) for i in range(Num) ] cabbage = [ Int('cabbage_%i' % (i + 1)) for i in range(Num) ] nothing= [ And( human[i] == human[i+1], wolf[i] == wolf[i+1], goat[i] == goat[i+1], cabbage[i] == cabbage[i+1] ) for i in range(Num-1) ] start= [ human[0] == 1, wolf[0] == 0, goat[0] == 1, cabbage[0] == 0 ] safeAlone= [ And( goat[i+1] != wolf[i+1] , goat[i+1] != cabbage[i+1] ) for i in range(Num-1) ] safe= [ Or( safeAlone[i] , goat[i+1] == human[i+1] ) for i in range(Num-1) ] humanTravel= [ human[i] != human[i+1] for i in range(Num-1) ] cabbageSide= [ Or( cabbage[i] == 0 , cabbage[i] == 1 ) for i in range(Num-1) ] goatSide= [ Or( goat[i] == 0 , goat[i] == 1 ) for i in range(Num-1) ] humanSide= [ Or( human[i] == 0 , human[i] == 1 ) for i in range(Num-1) ] t7= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage[i+1] == cabbage[i] ) for i in range(Num-1) ] t3= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage[i+1] == cabbage[i] + 1 ) for i in range(Num-1) ] t6= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] ) , goat[i+1] == goat[i] ) , cabbage[i+1] == cabbage[i] - 1 ) for i in range(Num-1) ] t2= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] ) , goat[i+1] == goat[i] + 1 ) , cabbage[i+1] == cabbage[i] ) for i in range(Num-1) ] t5= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] ) , goat[i+1] == goat[i] - 1 ) , cabbage[i+1] == cabbage[i] ) for i in range(Num-1) ] t1= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] + 1 ) , goat[i+1] == goat[i] ) , cabbage[i+1] == cabbage[i] ) for i in range(Num-1) ] t4= [ And( And( And( humanTravel[i] , wolf[i+1] == wolf[i] - 1 ) , goat[i+1] == goat[i] ) , cabbage[i+1] == cabbage[i] ) for i in range(Num-1) ] wolfSide= [ Or( wolf[i] == 0 , wolf[i] == 1 ) for i in range(Num-1) ] side= [ And( And( And( humanSide[i] , wolfSide[i] ) , goatSide[i] ) , cabbageSide[i] ) for i in range(Num-1) ] objectTravel= [ Or( Or( Or( Or( Or( Or( t1[i] , t2[i] ) , t3[i] ) , t4[i] ) , t5[i] ) , t6[i] ) , t7[i] ) for i in range(Num-1) ] data= [ Or( And( And( And( side[i] , safe[i] ) , humanTravel[i] ) , objectTravel[i] ) , nothing[i]) for i in range(Num-1) ] fState= And( And( And( human[Steps] == 1 , wolf[Steps] == 1 ) , goat[Steps] == 1 ) , cabbage[Steps] == 1 ) final= fState #template, only start rest s.add(data + start) #template s.add(final) ind = 0 f = open("/var/www/html/all/bin/python/log.txt", "a") while s.check() == sat: ind = ind +1 print ind m = s.model() print m print "traversing model..." #for d in m.decls(): #print "%s = %s" % (d.name(), m[d]) f.write(str(m)) f.write("\n\n") exit() #s.add(Or(goat[0] != s.model()[data[0]] )) # prevent next model from using the same assignment as a previous model print "Total solution number: " print ind f.close() 


After launch, Z3Prover solves a smart contract and brings us a chain of transactions that will allow you to withdraw funds:

 Winning transaction chain found: Data transaction: human= 0, wolf= 0, goat= 1, cabbage= 0 Data transaction: human= 1, wolf= 0, goat= 1, cabbage= 1 Data transaction: human= 0, wolf= 0, goat= 0, cabbage= 1 Data transaction: human= 1, wolf= 1, goat= 0, cabbage= 1 Data transaction: human= 0, wolf= 1, goat= 0, cabbage= 1 Data transaction: human= 1, wolf= 1, goat= 1, cabbage= 1 Data transaction: human= 1, wolf= 1, goat= 1, cabbage= 1 Transfer transaction 

In addition to the contract of crossing, you can experiment with your own contracts or try this simple example, which can be solved in 2 transactions.

 let contract = tx.sender let a= extract(getInteger(contract,"a")) let b= extract(getInteger(contract,"b")) let c= extract(getInteger(contract,"c")) let d= extract(getInteger(contract,"d")) match tx { case t:DataTransaction => let na= extract(getInteger(t.data,"a")) let nb= extract(getInteger(t.data,"b")) let nc= extract(getInteger(t.data,"c")) let nd= extract(getInteger(t.data,"d")) nd == 0 || a == 100 - 5 case s:TransferTransaction => ( a + b - c ) * d == 12 case _ => true } 

Since this is the very first version, the syntax is very limited and bugs can occur.
In the following articles, I plan to highlight the further development of the VM, and show how you can create formally verified smart contracts with its help, and not just solve them.

Symbolic virtual machine is available at http://2.59.42.98/hyperbox/
Sources are available on github: http://github.com/scp1001/hyperbox
All VM logic is contained in 2 files, hyperbox.php and hyperbox2.php

Source: https://habr.com/ru/post/450016/


All Articles