Predicting Random Numbers in Ethereum Smart Contracts

Ethereum has gained immense popularity as a platform for the primary placement of coins (ICO). However, it is not only used for ERC20 tokens. Roulettes, lotteries and card games - all this can be implemented on the Ethereum blockchain. Like any implementation, the Ethereum blockchain defies fake; it is decentralized and transparent. Ethereum allows the execution of turing-full programs, which are usually written in the programming language Solidity. According to the founders of the platform, this turns the system into a “global supercomputer.” These characteristics are useful in gambling applications, where user trust is especially important.

Ethereum blockchain is deterministic and therefore presents certain difficulties when writing a pseudo-random number generator (PRNG) - an integral part of any gambling application. We decided to investigate smart contracts in order to assess the safety of the PRNG for Solidity and to highlight the characteristic design errors that lead to the appearance of vulnerabilities and the ability to predict the future state of PRNG.

Our research was conducted in several stages:
')

On etherscan.io and GitHub information on 3649 smart contracts is collected.
Contracts were imported into the free Elasticsearch search engine.
Using the Kibana web interface for functional search and filtering, 72 unique PRNG implementations were found.
After a manual evaluation, 43 smart contracts are deemed vulnerable.

Vulnerable applications

The analysis revealed four categories of vulnerable PRNG:

PRNG using block variables as a source of entropy.
PRNG based on the hash of some previous block.
PRNG based on the hash of the previous block in combination with the supposedly secret initial number (seed).
PRNG affected front-running vulnerabilities.

Look at the examples of vulnerable code in each category.

PRNG using variable groups

Here are some block variables that are mistaken for the source of entropy:

block.coinbase is the address of the miner who found the current block.
block.difficulty is a relative indicator of the difficulty in mining the current block.
block.gaslimit - maximum gas consumption for transactions in the block.
block.number is the height of the current block.
block.timestamp is the timestamp when a block is found.

First of all, miners can manipulate all block variables, so for this reason alone they cannot be used as a source of entropy. More importantly, block variables are obviously identical within a block. So if an intruder’s contract accesses the victim’s contract through an internal message, then the same PRNG in both contracts will produce the same result.

Example 1 ( 0x80ddae5251047d6ceb29765f38fed1c0013004b7 ):

 // Won if block number is even // (note: this is a terrible source of randomness, please don't use this with real money) bool won = (block.number % 2) == 0;

Example 2 ( 0xa11e4ed59dc94e69612f3111942626ed513cb172 ):

 // Compute some *almost random* value for selecting winner from current transaction. var random = uint(sha3(block.timestamp)) % 2;

Example 3 ( 0xcC88937F325d1C6B97da0AFDbb4cA542EFA70870 ):

 address seed1 = contestants[uint(block.coinbase) % totalTickets].addr; address seed2 = contestants[uint(msg.sender) % totalTickets].addr; uint seed3 = block.difficulty; bytes32 randHash = keccak256(seed1, seed2, seed3); uint winningNumber = uint(randHash) % totalTickets; address winningAddress = contestants[winningNumber].addr;

GPSN on block hash

Each block in the Ethereum chain has a verification hash. Ethereum Virtual Machine (EVM) allows you to get these hashes using the block.blockhash() function. The function receives as input a numeric argument indicating the block number. In the course of this study, we found that the results of the execution of the block.blockhash() function are often incorrectly used in block.blockhash() implementations.

There are three main varieties of such vulnerable PRNGs:

block.blockhash(block.number) , hash of the current block.
block.blockhash(block.number - 1) , hash of the last block.
block.blockhash() , a hash of a block that is at least 256 blocks away from the current one.

Let's look at each of these cases.

block.blockhash (block.number)

The state variable block.number allows block.number to find out the height of the current block. When a miner takes a transaction that executes the contract code, the block.number variable of the future block with this transaction is known, so the contract can reliably get its value. However, at the time a transaction is executed in EVM, the hash of the block being created is not yet known for obvious reasons, and EVM will always return zero.

Some contracts incorrectly interpret the expression block.blockhash(block.number) . In these contracts, the hash of the current block is considered known at run time and is used as a source of entropy.

Example 1 ( 0xa65d59708838581520511d98fb8b5d1f76a96cad ):

 function deal(address player, uint8 cardNumber) internal returns (uint8) { uint b = block.number; uint timestamp = block.timestamp; return uint8(uint256(keccak256(block.blockhash(b), player, cardNumber, timestamp)) % 52); }

Example 2 ( https://github.com/axiomzen/eth-random/issues/3 ):

 function random(uint64 upper) public returns (uint64 randomNumber) { _seed = uint64(sha3(sha3(block.blockhash(block.number), _seed), now)); return _seed % upper; }

block.blockhash (block.number-1)

In some contracts, another version of the CREF is used based on the block hash: the hash of the previous block rather than the current one is taken there. Needless to say, such an approach is also unacceptable: an attacker can create an exploit contract with the same PRNG code in order to trigger a target contract through an internal message. Both contracts will have the same "random" numbers.

Example 1 ( 0xF767fCA8e65d03fE16D4e38810f5E5376c3372A8 ):

 //Generate random number between 0 & max uint256 constant private FACTOR = 1157920892373161954235709850086879078532699846656405640394575840079131296399; function rand(uint max) constant private returns (uint256 result){ uint256 factor = FACTOR * 100 / max; uint256 lastBlockNumber = block.number - 1; uint256 hashVal = uint256(block.blockhash(lastBlockNumber)); return uint256((uint256(hashVal) / factor)) % max; }

Future block hash

A better idea is to use the hash of some future block. You can implement this script as follows:

The player bets, and the office stores a block.number transaction.
When the contract is recalled, the player asks the office to announce the winning number.
The office retrieves the stored block.number from the storage and receives its hash, which is then used to generate a pseudo-random number.

This approach only works if one important requirement is met. Solidity documentation warns about the limit of block hashes that EVM can store:

For reasons of scalability, hashes are not available for all blocks. You can access hashes of only the last 256 blocks, and all other values will be zero.

Therefore, if there is no second call within 256 blocks with a hash check, then a pseudo-random number can be predicted in advance — the hash will be zero.

The most famous case of exploitation of this vulnerability is hacking the SmartBillions lottery . The contract did not check the age of the block.number , due to which 400 ETH went to an unknown player who waited 256 blocks before disclosing a predictable winning number.

Secret hash block hash

To increase entropy, some contracts use an additional seed number (seed), which is considered secret. One of the examples is the lottery Slotthereum. Here is the corresponding code:

 bytes32 _a = block.blockhash(block.number - pointer); for (uint i = 31; i >= 1; i--) { if ((uint8(_a[i]) >= 48) && (uint8(_a[i]) <= 57)) { return uint8(_a[i]) - 48; } }

The variable pointer is declared secret, that is, other contracts cannot access it. After each game, this variable is assigned a winning number from 1 to 9, and then it is used to offset the block.number when receiving a block hash.

The blockchain, which is transparent by nature, should not be used to store secrets in clear text. Although secret variables are protected from other contracts, you can get the contents of the off-line contract repository. For example, in the popular Ethereum client web3, there is an API method web3.eth.getStorageAt() that allows you to get storage records at specified indices.

Given this fact, it becomes a trivial task to extract the value of a secret variable from the contract repository and use it as an argument in the exploit code:

 function attack(address a, uint8 n) payable { Slotthereum target = Slotthereum(a); pointer = n; uint8 win = getNumber(getBlockHash(pointer)); target.placeBet.value(msg.value)(win, win); }

Transaction Ahead

To get the maximum reward, miners choose transactions to create a new block based on the cumulative gas (fuel) that each transaction spends. The order of transactions in the block is determined by the price of gas. The first will be the transaction with the maximum price of gas. So, by changing the price of gas, it is possible to achieve that the necessary transaction is executed before all the others in the current block. This can be a security problem - commonly called front-running, when the execution of a contract depends on its position in the block.

Consider the following example. The lottery uses an external oracle to obtain pseudo-random numbers, which are used to select the winner among the players who have bet in the current round. These numbers are sent in clear text. An attacker could watch a pool of pending transactions and wait for a number from an oracle. As soon as the transaction from the oracle appears in the transaction pool, the attacker makes a bet with a higher price of gas. The attacker's attack came last in the current round, but thanks to the highest price of gas, it will actually be executed before the oracle transaction, which will bring the player victory. Such a task was performed by the participants of the hacker contest ZeroNights ICO .

Another example of a contract that is subject to early transaction vulnerability is a game called Last is me! Every time a player buys a ticket, he takes the last place and the countdown of the timer begins. If no one buys a ticket for a certain number of blocks, then the last “ranked” player wins the jackpot. When the round is nearing completion, the attacker can observe the pool of transactions of other participants and assign the jackpot, setting a higher price for gas.

Safer PRNG

There are several approaches for implementing safer PRNG in the Ethereum blockchain:

External oracles
Signidice
The commit-disclosure scheme

External oracles: Oraclize

Oraclize is a service for distributed applications that establish a bridge between the blockchain and the external environment (the Internet). When using Oraclize, smart contracts can request data from the API on the web, such as currency rates, weather forecasts, stock quotes. One of the most famous use cases is the ability of Oraclize to work as PRNG. Some of the contracts that were analyzed during our work used Oraclize to get random numbers from random.org through the URL connector. This diagram is shown in Fig. one.

Fig. 1. Oraclize work pattern

The main disadvantage of this approach is centralization. Can we believe that the Oraclize daemon does not interfere with the results? Can we trust random.org and the entire infrastructure underlying this service? Although Oraclize verifies the results through the TLSNotary auditing service, it can only be used outside the chain of blocks - in the case of lotteries, only after the winner is announced. It is better to use Oraclize as a source of “random” data using Ledger evidence , which can be tested in a chain.

External oracles: BTCRelay

BTCRelay is a bridge between the chains of Ethereum and Bitcoin blocks. When using BTCRelay, smart contracts in the Ethereum blockchain can request hashes of future Bitcoin blocks and use them as a source of entropy. One of the projects that use BTCRelay as the PRNG is the Ethereum Lottery lottery.

The BTCRelay method is not protected from the problem of stimulating miners. Although here the barrier is higher than in the case of Ethereum blocks, but only because of the higher price of Bitcoin. So this approach reduces, but does not eliminate, the likelihood of fraud by the miners.

Signidice

Signidice is an algorithm based on cryptographic signatures. It can be used as a PRNG in smart contracts involving two parties: the player and the office. The algorithm works as follows:

The player bets by calling a smart contract.
The office sees the bid, signs it with its secret key and sends the signature to the smart contract.
The smart contract verifies the signature using a known public key.
This signature is then used to generate a random number.

Ethereum has a built-in ecrecover() function to check ECDSA signatures in a chain. However, ECDSA cannot be used in Signidice, since an office can manipulate input parameters (in particular, parameter k ) and so affect the resulting signature. Alexey Pertsev showed a demo of such fraud.

Fortunately, with the release of the hard fork of Metropolis, a modular exponentiation operator appeared. This allows for the implementation of RSA signature verification. Unlike ECDSA, it does not allow the manipulation of the input parameters to find a suitable signature.

The commit-disclosure scheme

As the name implies, the commit-reveal scheme (commit – reveal) consists of two stages:

The commit stage is when the parties send their cryptographically protected secrets to the smart contract.
The disclosure stage, when the parties declare the initial numbers in clear text, the smart contract verifies that they are correct, and they are used to generate a random number.

A properly implemented commit-disclosure scheme should not rely on a single side. Although players do not know the original starting number submitted by the owner, and their chances are equal, the owner can also be a player, so players cannot trust him.

The commit-disclosure scheme is more correctly implemented in the Randao service. The PRNG collects hashes of initial numbers from several parties, and each of them receives a reward for participation. No one knows someone else's initial numbers, so the result is absolutely random. However, if at least one party refuses to announce its initial number, the service fails.

The commit-disclosure scheme can be combined with the use of future block hashes. In this case, three sources of entropy are involved:

owner's sha3 (seed1)
sha3 player (seed2)
future block hash

Then a random number is generated as follows: sha3(seed1, seed2, blockhash) . Therefore, the “commit-disclosure” scheme solves the problem of stimulating miners: the miner may affect the block hash, but he does not know the initial numbers of the owner and the player. It also solves the problem of stimulating the owner: he knows only his own initial number, but does not know the initial number of the player and the hash of the future block. In addition, this scheme is suitable for situations where a person acts as both the owner and the miner: he determines the hash of the block, knows the initial number of the owner, but does not know the initial number of the player.

Conclusion

The secure implementation of PRNG in the Ethereum blockchain is still an unsolved task. As our research has shown, due to the lack of ready-made solutions, developers tend to implement their own PRNG implementations. But it’s easy to make a mistake, since there are few entropy sources in the block chain. When developing the PRNG, the developer should make sure that he understands the motivation of each party - and then choose the appropriate approach.

Source: https://habr.com/ru/post/348838/

All Articles