Attention! S in Ethereum stands for Security. Part 4. Tools

We present the fourth part of the cycle devoted to typical vulnerabilities, attacks and problem areas, which are inherent in smart contracts in the Solidity language and on the Ethereum platform as a whole. Here we will talk about what tools for analyzing the security of smart contracts exist, and why they are (not) needed.

In the first part, we discussed the front-running attack, various random number generation algorithms and network resiliency with the Proof-of-Authority consensus. The second talked about Integer overflow, ABI encoding / decoding, Uninitialized storage pointer, Type Confusion and how to make a backdoor. In the third part, we touched upon several distinctive features of Solidity and explored some logical vulnerabilities found in contracts. In this part we bring to your attention a review of existing tools for analyzing smart contracts.

All the tools we have divided into several categories that allow you to do:

Next we touch each of them. Some of the tools work only with the source code of the smart contract, others without it. And at once we will make a reservation that it will be a question only of those tools which are known to us, is in open access, and with which we worked. If you have experience with any other tools on this subject, then you are welcome in the comments :)

Debugging

Currently, there is only one (or the only convenient) possibility to debug a smart contract with the source code - Remix IDE . And if you need to understand bytecode, Radare2 or MythrilTraceExplorer can help you .

Remix IDE

If you are a developer, most often you will use Remix for debugging (most likely, you do so). It is very convenient to track the state of local variables (stack), the state of memory, storage, consumed gas and much more.


Figure 1. Remix IDE interface

If the contents of the "Storage completely loaded" tab scare you, we advise you to deal with the storage device. Of the minuses can be noted perhaps that the periodic bugs in the JavaScript VM and the fact that if you use assembler inserts, then line-by-line debugging you can hardly hope.

Radare2

If you do not like Remix, but adore the console, then try Radare2. With it, you can disassemble and debug contracts. For debugging, traces are used (recorded transactions occurred). The rpc radar connects to the node ( ganache-cli is convenient) and tightens the data necessary to repeat the transaction.

In graphics mode, you can open tiles with the contents of the stack and memory:


Figure 2. Radare 2 Interface

You can set breakpoints and write to memory, but we did not find a way to look into storage or modify it.

The Positive Technologies guys have examples of using Radare to disassemble and debug contracts. Since the radar is not the most intuitive tool, we advise you to take a look and get acquainted with its commands.

Automatic vulnerability scan

Before we have already been a comparison of the three tools - Mythrill, Manticore and Oyente. According to the results of this analysis, Mythrill is in the lead, and Oyente is the worst. During the tests, the authors "fed" the tools with contract source codes, so the results do not reflect how well they cope with raw bytecode, but mostly coincide with our impressions.

Mythril

Mythril has broad functionality, including for analyzing both the source code and the bytecode of smart contracts. It is noteworthy that it can also work on-chain, that is, online to connect to the node, download the contract and analyze it. He is also a disassembler and can build good graphs (but be careful, he often does it incorrectly, so better try Ethersplay or IDA-EVM). Mythril can also generate a state for a tool called MythrilTraceExplorer, which allows you to debug a smart contract bytecode with character variables:


Figure 3. MythrilTraceExplorer interface

A full list of Mythil and MythrilTraceExplorer options can be found in the documentation , and you can find the secret flag that isn't in it here .

Take the eth_tx_order_dependence_minimal contract from Consensys tests

Source eth_tx_order_dependence_minimal

 contract Benchmark { address public owner; bool public claimed; uint public reward; function Benchmark() public { owner = msg.sender; } function setReward() public payable { require (!claimed); require(msg.sender == owner); owner.transfer(reward); reward = msg.value; } function claimReward(uint256 submission) { require (!claimed); require(submission < 10); msg.sender.transfer(reward); claimed = true; } }

and see how Mythril handles with source codes:

 user% myth -x eth_tx_order_dependence_minimal.sol ==== Ether send ==== Type: Warning Contract: Benchmark Function name: claimReward(uint256) PC address: 693 In the function 'claimReward(uint256)' a non-zero amount of Ether is sent to msg.sender. Call value is storage_1. There is a check on storage index 7. This storage slot can be written to by calling the function 'claimReward(uint256)'. --------------------

and now without them:

 user% myth -x -c 60806040523...5f9dc0029 The analysis was completed successfully. No issues were detected.

As you can see, the raw Mythril bytecode could not be analyzed.

Oyente

A tool with its own character execution engine for detecting vulnerabilities in smart contracts, written in Python. Not so long ago, developers have finally added output information about the input data, which will lead to the triggering of the problem. The tool produces more or less adequate results only when analyzing contracts with source codes. If you set it on baytkod, then it will cover only a small part of it and therefore, most likely, it will not find bugs.

By analogy with Mythril, we put the experience on Oyente and see how well it looks for vulnerabilities with and without source codes:

 user% oyente -s eth_tx_order_dependence_minimal.sol INFO:root:contract eth_tx_order_dependence_minimal.sol:Benchmark: INFO:oyente.symExec: ============ Results =========== INFO:oyente.symExec: EVM Code Coverage: 98.3% INFO:oyente.symExec: Integer Underflow: False INFO:oyente.symExec: Integer Overflow: False INFO:oyente.symExec: Parity Multisig Bug 2: False INFO:oyente.symExec: Callstack Depth Attack Vulnerability: False INFO:oyente.symExec: Transaction-Ordering Dependence (TOD): True INFO:oyente.symExec: Timestamp Dependency: False INFO:oyente.symExec: Re-Entrancy Vulnerability: False INFO:oyente.symExec:Flow1 eth_tx_order_dependence_minimal.sol:14:9: Warning: Transaction-Ordering Dependency. owner.transfer(reward) Flow2 eth_tx_order_dependence_minimal.sol:22:9: Warning: Transaction-Ordering Dependency. msg.sender.transfer(reward) INFO:oyente.symExec: ====== Analysis Completed ======

Now let's set the bytecode of this contract:

 user% oyente -b -s eth_tx_order_dependence_minimal.bin INFO:oyente.symExec: ============ Results =========== INFO:oyente.symExec: EVM Code Coverage: 18.0% INFO:oyente.symExec: Callstack Depth Attack Vulnerability: False INFO:oyente.symExec: Transaction-Ordering Dependence (TOD): False INFO:oyente.symExec: Timestamp Dependency: False INFO:oyente.symExec: Re-Entrancy Vulnerability: False INFO:oyente.symExec: ====== Analysis Completed ======

As you can see, the coverage has greatly decreased (from 98 to 18 percent) and the tool does not find problems with the contract.

Developing Oyente is not easy due to (well, very) bad code. For example, parameters from a module to a module are passed through global variables. Most likely because no more than one and a half people work on the project.

Just in case, it is worth keeping Oyente in your toolbox, but it is unknown whether it will develop and improve.

Porosity

Perhaps, the most raspiarenny decompiler. Its main feature is that it does not work.

To demonstrate this feature, let's decompile the toy contract, which for the purity of the experiment will be compiled with the relatively old version of solc'a 0.4.12 of July 1, 2017. Note that the last time Porosity was updated in January 2018, so you can expect that the tool will cope with our contract.

Original:

 contract sample { uint y; function foobar (uint x) { if (x == 0) { y = 0x1337; } else { y = 0xb33f; } } }

Unique decompiler output:

 .\porosity.exe --code $binRuntime --decompile function func_14ba3f12 { if (!msg.value) { } if (arg_4 != 0x0) { store[var_xrpMC] = 0xB33F; } store[var_zdGc8] = 0x1337; return; return; }

When you try to transfer abi, in order to make the tool easier to work with, Porosity will crash.

We could not find the use of Porosity, all attempts to use ended with either receiving meaningless output, or the fall of the tool.

Manticore

A tool for symbolic program execution. Some time ago, EVM support was added to it and is being developed now (and if a permanent developer appears, then things will go even faster). There are a lot of chips in it, but they are not tested and most often do not work. The project also has a rather poor documentation, but we hope that the community will soon correct this shortcoming.

Before we look at examples of using Manticore, it’s worth explaining what symbolic execution is, and the slide from the presentation of Michael Hicks’s "Symbolic Execution for finding bugs" by:


Figure 4. Symbolic Execution for finding bugs. Michael Hicks.

Symbolic execution allows you to understand which input you need to submit to a function / program in order to be at a given point. Roughly speaking, during the symbolic execution, all the restrictions (conditions) on the way of execution to a given point are collected, a formula is made up of these restrictions, and by solving this, we will get the input we need.

For the demonstration, we will solve the "The Lock" problem of the EtherHack contest held at the Positive Hack Days 8 conference. The participants were not given the source code, but only the address of the contract. To solve the task, it was necessary to call the contract with the correct pin-code (its check is implemented in the fallback). The solution involved reverse engineering to understand the pin-code verification algorithm.

Let's automate the solution to this problem. However, we still have to take a look at the disassembled contract code in order to find out what will happen (or not happen) if the participant sends a valid PIN code. Namely: it is worth noting that there is only one sstore instruction in the program, and we can safely assume that it will be executed if the casket key is selected correctly.

Unfortunately, honestly solving the problem does not give bugs in Manticore (we sent bug reports to developers, we hope that errors will be fixed soon), so we have to get out - we rewrote the contract in the source code for Solidity and replaced it with a succession code by multiplications (due to Manticore's bugs, at the current moment it cannot emulate exponentiation with a symbolic base) Of course, solving a task during a competition in such a perverted way is meaningless, but do not forget that we are simply demonstrating a tool! And the presence of the source code does not simplify the work, because the symbolic execution does not work with it, but with bytecode.

Enough excuses and blocking, let's take a look at the script:

the lock solver

 from manticore.ethereum import ManticoreEVM from manticore.core.smtlib import Operators from struct import pack m = ManticoreEVM() m.verbosity(0) contract_source_code = ''' contract lock { uint public unlocked; function unlock(bytes4 cPincode) payable { uint digitPowers = 0; uint iPincode = 0; for (uint i = 0; i < 4; i++) { if (cPincode[i] >= 0x30 && cPincode[i] <= 0x39) { // manticore can't handle pow with symbolic input at the moment uint digit = (uint(cPincode[i]) - 0x30); digit *= digit; digit *= digit; digitPowers += digit; iPincode += (uint(cPincode[i]) - 0x30) * 10**(3-i); } else { revert(); } } if (uint(iPincode) == 0 || uint(iPincode) == 1) { revert(); } if (digitPowers == uint(iPincode)) { unlocked = 0x31337; } } } ''' user_account = m.create_account(balance=10**18) m.world.set_balance(user_account, 10**18) contract_account = m.solidity_create_contract(contract_source_code, owner=user_account) print "[+] Created a contract account 0x%x" % contract_account print '[+] Sending transaction with symbolic input' symbolic_data = m.make_symbolic_buffer(36) m.transaction(caller=user_account, address=contract_account, data=symbolic_data, value=10**18) pincodes = None for state in m.all_states: world = state.platform for where, what in world.get_storage_items(int(contract_account)): _input = Operators.CONCAT(8*4, *world.human_transactions[-1].data[4:8]) # getting 4 bytes after function signature print '[+] Storage write 0x%x' % state.solve_one(what) pincodes = state.solve_n(_input, 10) print '[+] Found %i pincodes:' % len(pincodes) for pincode in sorted(pincodes): print ' "%s"' % pack('>I', pincode) break if pincodes is not None: break

Running it, we get the following conclusion:

 [+] Created a contract account 0x1bfa530d5d685155e98cd7d9dd23f7b6a801cfef [+] Sending transaction with symbolic input [+] Storage write 0x31337 [+] Found 3 pincodes: "1634" "8208" "9474"

We will understand what is happening. First, take a look at the contract code. The unlock function accepts four bytes as input, in which it expects to see a textual representation of a four-digit decimal number. Only those pin codes will fit, the sum of the digits in the fourth power of which is equal to the pin code itself, excluding zero and one. Really,

1^{4} + 6^{4} + 3^{4} + 4^{4} = $ 163

$1 ^ 4 + 6 ^ 4 + 3 ^ 4 + 4 ^ 4 = $ 163$

If the pin code meets these conditions, the number 0x31337 will be written to the unlocked variable located in storage.

Now let's deal with the script. What happens in the first part is easy to understand: a user account is created, then a contract. A transaction is started with symbolic input. This action will activate symbolic execution, which will take a couple of minutes. Upon its completion, we go through all the final states, checking whether something was recorded in the storage, and if it was, find out what input data the transaction was performed with. We extract from them the pin code, four bytes after the signature of the called function. If we solved the problem and did not know how many digits in the pin code the contract expects, then we would get not four, but all thirty two bytes (the default size of the memory cell in EVM). Then on the screen would see similar numbers:

 0x3136333434343434343434343434343434343434343434343434343434343434 0x3832303800000000000000000000000000000000000000000000000000000000

Using the results of the analysis of Manticore, we can not only solve toy puzzles, but also look for toy bugs in toy contracts, simply asking such naive questions like: "What do you need to send to the contract so that all the airings that contained the contract are on our address?"

Let's look at an example with an unprotected wallet from the EthCC 2018 workshop:

UnprotectedWallet

 from manticore.ethereum import ManticoreEVM from manticore.core.smtlib import solver m = ManticoreEVM() # initiate the blockchain source_code = ''' contract UnprotectedWallet{ address public owner; modifier onlyowner { require(msg.sender==owner); _; } function UnprotectedWallet() public { owner = msg.sender; } // this function should be protected function changeOwner(address _newOwner) public { owner = _newOwner; } function deposit() payable public { } function withdraw() onlyowner public { msg.sender.transfer(this.balance); } } ''' # Generate the accounts. Creator has 10 ethers; attacker 0 creator_account = m.create_account(balance=10*10**18) attacker_account = m.create_account(balance=0) contract_account = m.solidity_create_contract(source_code, owner=creator_account) print "Creator account: 0x%x" % creator_account print "Attacker account: 0x%x" % attacker_account # Deposit 1 ether, from the creator contract_account.deposit(caller=creator_account, value=10**18) # Two raw transactions from the attacker symbolic_data = m.make_symbolic_buffer(320) m.transaction(caller=attacker_account, address=contract_account, data=symbolic_data, value=0) symbolic_data = m.make_symbolic_buffer(320) m.transaction(caller=attacker_account, address=contract_account, data=symbolic_data, value=0) for state in m.running_states: # Check if the attacker can ends with some ether balance = state.platform.get_balance(attacker_account) state.constrain(balance > 1) if solver.check(state.constraints): print "Attacker can steal the ether! see %s"%m.workspace m.generate_testcase(state, 'WalletHack')

Obviously, the developers forgot to award changeOwner with the attribute onlyOwner. Because of this, an attacker in two transactions can get funds. As usual, all the fun is at the end of the script. Enumerating the state, we ask the question: "In which of them did the attacker have coins?" And Manticore is able to answer it. We are looking for answers in the mcore_XXX folder in the WalletHack_00000000.tx file:


Figure 5. Result of Manticore.

If Manticore is not abandoned, it will grow into a good tool for flexible analysis of smart contracts, but now it is not yet ready to meet with the real world and real tasks.

Echidna (phasing)

Echidna is an experimental tool designed for fuzzing smart contracts. However, this is not the fuzzy you might think about. In order for Echidna to have the functions of a smart contract, you need to prepare your contract in a special way.

 //  contract EchidnaTest { function sensitiveFunc(uint num, bool stop) { if (num > 5 && !stop) { selfdestruct(msg.sender); } } }

after

 contract EchidnaTest { uint num; bool stop; function num_setter(uint _num) public { num = _num; } function stop_setter(bool _stop) public { stop = _stop; } function echidna_sensitiveFunc() public returns (bool) { if (num > 5 && stop) { selfdestruct(msg.sender); } return true; } }

In other words, you have to "unscrew" the function you want to map. Echidna will do different call sequences for num_setter and stop_setter , and then echidna_sensitiveFunc . Also note that Echidna is expecting (but obviously will not tell you) that your function will return true when everything goes well. Thus, the “wrong” behavior you can mark by returning false , revert() , selfdestruct(0) , etc.

 p4lex@ubuntu:~/tools/echidna$ echidna-test echid.sol ━━━ echid.sol ━━━ ✗ "echidna_sensitiveFunc" failed after 23 tests and 51 shrinks. │ Call sequence: num_setter(6); │ stop_setter(false); ✗ 1 failed.

It is also worth noting that you cannot specify some input data that Echidna will change, so for complex tasks the efficiency of such fuzzing leaves much to be desired. But you probably noticed that you will have to rewrite every function that needs to be interpolated. For the sake of fairness, it is worth noting that if you can write in Haskell, you can still set the input value generators so that they give a specific domain from a set of possible values (for example, you know that num should be more than five).

In general, fuzzing can give its results. If a vulnerability is manifested only after a long chain of actions, symbolic execution will require a lot of CPU and memory costs, unlike fuzzing (which, however, will detect this vulnerability only with a certain probability). More information about the approach used in Echidna can be found in here .

SmartCheck

A highly experimental, closed source tool that reentrancy is seen everywhere. From the pleasant: you can simply copy your contract into the sending form or specify a link to the githab. And since static code analysis is used, your contract is not even obliged to compile. It looks like this:


Figure 6. SmartCheck Interface

In this example, a balance reset occurs after the transfer of Ether, which is the signature for SmartCheck. However, the use of the send and transfer functions in itself prevents reentrancy (we analyzed it in the third part), so we see false positive in the screenshot above.

Reverse engineering

Since at the moment there are no working decompilers in open access, then in order to understand what logic a particular contract implements, you will have to revise. In general, the hope that the decompiler will be there. The first attempts were made at Porosity, and most likely the solution works, but, unfortunately, the guys from RET2 have a closed one. In the meantime, let's take a look at what we can do by being left alone with bytecode.

EVMdis

The oldest and most convenient disassembler. It cannot boast beautiful interfaces, but it has features that are not found in other tools: firstly, it minimizes instructions, instead of a sequence showing one more readable expression, and secondly, for each base unit, EVMdis shows the input stack into a block, including symbolized variables. This greatly speeds up the process of understanding the logic of the code, because without this data, the stack state would have to be restored from the beginning of the function, because EVM is a stack machine.

For example:

 # Stack: [] 0x33 PUSH(0xFFFFFFFF & CALLDATALOAD(0x0) / 0x100000000000000000000000000000000000000000000000000000000) 0x34 DUP1 0x3E JUMPI(:label5, 0x2F54BF6E == POP()) ... :label36 # Stack: [@0x420 @0x408 0x0 @0x2A1 @0x282 :label22 @0x33] 0x43F POP() 0x440 POP() 0x443 PUSH(MLOAD(0x40)) ...

By 0x33, the value from calldata is obtained and put on the stack. Since this value is unknown, it is symbolized. Further, we can observe it in the stack of the label36 base unit among other symbolic and specific values (numbers and transition addresses).

In our opinion, this is the most convenient tool for developing back contracts.

Ethersplay

A plugin for Binary Ninja that adds support for the EVM architecture. Intermediate representations (LLIL, and MLIL) ninjies will not work, because EVM is a stack machine, so you will have to read bare assembler instructions. From the pleasant: the plugin has a dictionary with pairs { : } , so you can immediately identify many known functions (for example, implementing different interfaces of the ERC standards):


Figure 7. Binary Ninjia Interface with Ethersplay Plugin

Also in the plugin there is a feature that shows the state of the stack for each instruction:


Figure 8. Stack Status shown by Ethersplay

But it replaces all character variables with the word Unknown , whereas EVMdis instead writes the address where this character variable appeared on the stack.

Ethersplay did not take root in our arsenal, but perhaps it will be developed to the stage when it becomes a useful tool.

Ida-evm

Processor module for IDA Pro, allowing it to disassemble contracts. It also has a dictionary with function prototypes, as in Ethersplay.


Figure 9. IDA interface with IDA-EVM processor module

However, this is not enough for convenient backward development of contracts, because the state of the stack of the entire program will have to be worked through. In our opinion, this tool does not deserve attention, because most likely it will not receive development, and in its current form it is useless.

Testing

In this section, I would like to talk about what really tends to make a secure contract out of your contract, namely about testing. As practice shows, in a smart contract, which is at least 95% covered by tests, the probability of a serious vulnerability is small. In the process of writing tests, most likely, a lot of necessary checks will be added, but you shouldn’t be deceived, there will be things that can only be prompted by experienced security auditors or a developer. For example, find 0day vulnerabilities in the subject area that the contract implements, or some other logical errors, or suggest how to make certain contract mechanisms (DApp) more efficient.

Truffle

The most convenient way is to conduct testing "on the spot" with the help of Truffle (most smart contracts are being developed with it). In the test folder (appears after truffle init ) you can write tests and run them with the command truffle test . However, hardly anyone can tell better about testing with truffle than official documentation . Also, do not forget to test coverage with the solidity-coverage tool.

web3.js

If for some reason (and there may be a lot of them) you are not satisfied with Truffle, you can write tests using the web3 libraries (they are for different languages). The most convenient way is to use web3.js, since the written test code can be easily transferred to a future DApp as part of logic. The only caveat is use web3.js version ^ 1.0.0. The future is hers, and with version 0.14.0, which will be your first line in Google, it is not compatible. Consider an example:

source code

 pragma solidity ^0.4.23; contract Testable { address public human; uint private counter; constructor() { human = tx.origin; } event CallmeLog(); function callme(uint times) public { counter++; if (times > 1) { callme(--times); } else { emit CallmeLog(); } } }

Before proceeding to the tests themselves, you will have to write a function that will upload a fresh contract to the network, since it only needs to be done once :)

tests

 const solc = require('solc'); const fs = require('fs'); const Web3 = require('web3'); const web3 = new Web3(new Web3.providers.HttpProvider('http://127.0.0.1:8545')); const code = fs.readFileSync('./testable.sol'); async function main() { // when use unlocked account from node let defaultAccs = await web3.eth.getAccounts(); web3.eth.defaultAccount = defaultAccs[0]; // this is for known private key // web3.eth.accounts.wallet.add(privKey.key); let Testable = await deploy(); human_test(Testable); callme_tx_status_test(Testable, 3); callme_event_test(Testable, 3); // let's got new instance Testable = await deploy(); callme_storage_test(Testable, 5); } async function deploy() { let output = solc.compile(code.toString(), 1); let TestableABI = output.contracts[Object.keys(output.contracts)[0]].interface; let TestableBytecode = output.contracts[Object.keys(output.contracts)[0]].bytecode; let Testable = new web3.eth.Contract(JSON.parse(TestableABI),{ from: web3.eth.defaultAccount }); let gasCount = await Testable.deploy({ data: TestableBytecode }).estimateGas(); Testable = await Testable.deploy({ data: TestableBytecode }).send({ gas: gasCount }); return Testable; } async function human_test(instance) { // let's test that contract has owner (human) // use "call" for public variables and constant methods let humanAddress = await instance.methods.human().call(); if (humanAddress !== web3.eth.defaultAccount) { throw "✕ human address is wrong" } else { console.log(" human_test passed"); } } ...

, ( ) . web3_utilz/Testing .

, , , , , , CTF.

(reverse engineering), , . , , - ( cryptoKittes ). , reverse engineering. , , , .

, , p4lex Igor1024 .

Source: https://habr.com/ru/post/353676/

All Articles