“Cryptography in blockchains”: about hash functions, keys and digital signatures

Cryptography is the heart of a blockchain: it is what makes the system work. A blockchain's architecture assumes that trust between network participants rests on mathematics and economics, that is, trust is formalized. Cryptography also guarantees security, and it does so through the transparency and verifiability of all operations rather than through the industry's traditional approach of restricting visibility into the system (perimeter security).

Various cryptographic techniques guarantee the immutability of the blockchain's transaction log, solve the authentication problem, and control access to the network and to the data in the blockchain as a whole. In this article we will talk about hash functions, keys and digital signatures.

Hash functions

Hashing is the process of converting input data of arbitrary length into an output bit string of fixed length. For example, a hash function can take a string with any number of characters (a single letter or an entire literary work) and produce a string with a strictly defined number of characters (a digest).

Hash functions are available in almost any programming language. For example, they are used to implement hash tables and sets (HashMap / HashSet in Java, dict and set in Python, Map, Set and objects in JavaScript, and so on). Cryptographic hash functions form a separate category. They are subject to significantly stricter requirements than the functions commonly used in hash tables, so they are used in more "serious" cases, for example for storing passwords. Cryptographic hash functions are developed and thoroughly tested by researchers around the world.
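As an illustration of the password use case, here is a minimal sketch using only the Python standard library. It assumes PBKDF2 (hashlib.pbkdf2_hmac), a key derivation function built on top of a cryptographic hash, with a random salt; this is one common approach, not something the article prescribes. The function names and the iteration count are made up for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

Only the salt and the digest are stored; the password itself is never stored.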

You can experiment with hash functions by writing a simple Python program:

import hashlib

def hash_hex(message):
    return hashlib.sha256(message.encode()).hexdigest()

The hash_hex() function computes the hexadecimal representation of the hash of a string. The example above uses SHA-256, the same function that Bitcoin uses.
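The fixed-length property from the definition of hashing is easy to check. The sketch below is self-contained, and the helper name digest_length is made up for this example. It hashes inputs of very different sizes and reports the digest size in bytes.

```python
import hashlib


def digest_length(message):
    """Return the size in bytes of the SHA-256 digest of a string."""
    return len(hashlib.sha256(message.encode()).digest())


# Inputs of 1, 10 and 1,000,000 characters all give a 32-byte digest
# (64 hexadecimal characters).
for text in ["a", "Blockchain", "x" * 1_000_000]:
    print(len(text), "characters ->", digest_length(text), "bytes")
```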

A good hash function provides collision resistance (it is computationally infeasible to find two different inputs with the same hash) and exhibits the so-called avalanche effect, in which the slightest change in the input changes the output significantly. The avalanche effect in SHA-256 looks like this:

>>> hash_hex('Blockchain')
'625da44e4eaf58d61cf048d168aa6f5e492dea166d8bb54ec06c30de07db57e1'
>>> hash_hex('blockchain')
'ef7797e13d3a75526946a3bcf00daec9fc9c9c4d51ddc7cc5df888f74dd434d1'
>>> hash_hex('Bl0ckchain')
'511429398e2213603f4e5dd3fff1f989447c52162b0e0a28fe049288359220fc'
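One way to put a number on the avalanche effect is to count how many of the 256 output bits differ between two digests. The following is a small sketch; the helper name differing_bits is made up for this example.

```python
import hashlib


def differing_bits(a, b):
    """Count the bits that differ between the SHA-256 digests of two strings."""
    da = int.from_bytes(hashlib.sha256(a.encode()).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b.encode()).digest(), "big")
    return bin(da ^ db).count("1")  # XOR marks the bit positions that differ


print(differing_bits("Blockchain", "blockchain"))
print(differing_bits("Blockchain", "Bl0ckchain"))
```

For a well-behaved 256-bit hash function, about half of the bits (roughly 128) are expected to differ, even when the inputs differ in a single character.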

Hash functions in blockchains guarantee the “irreversibility” of the entire chain of transactions. Each new block of transactions refers to the hash of the previous block in the ledger. The hash of a block in turn depends on all transactions in that block, but instead of feeding the transactions to the hash function one after another, they are combined into a single hash value using a binary tree of hashes (a Merkle tree). Hashes thus take the place of the pointers used in ordinary data structures such as linked lists and binary trees.

Thanks to hashes, the overall state of the blockchain (all transactions ever performed and their order) can be expressed as a single number: the hash of the newest block. As long as that one hash is unchanged, the entire blockchain is guaranteed to be unchanged.
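The idea that the newest hash commits to the whole history can be sketched with a toy chain. This is a simplified illustration, not Bitcoin's block format: transactions are plain strings joined together rather than combined through a Merkle tree, and the names block_hash and chain_head are made up for this example.

```python
import hashlib


def block_hash(prev_hash, transactions):
    """Hash a toy block: the previous block's hash plus the block's transactions."""
    payload = prev_hash + "|" + ";".join(transactions)
    return hashlib.sha256(payload.encode()).hexdigest()


def chain_head(blocks):
    """Return the hash of the newest block in a list of blocks."""
    head = "0" * 64  # placeholder "previous hash" for the first block
    for transactions in blocks:
        head = block_hash(head, transactions)
    return head


original = [["alice->bob:1"], ["bob->carol:1"], ["carol->dave:1"]]
tampered = [["alice->eve:1"], ["bob->carol:1"], ["carol->dave:1"]]

# Changing a transaction in the very first block changes the newest hash.
print(chain_head(original) == chain_head(tampered))  # False
```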

Below is a recursive Python implementation of the Merkle tree used in Bitcoin (see the link for worked examples). The function takes a list of transaction hashes. At each step, consecutive pairs of hashes are combined using the hash function; if the number of hashes is odd, the last one is duplicated. In the end a single hash remains, which is the final hash value for the entire list.

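import hashlib

def merkle_root(lst):
    # Bitcoin hashes with double SHA-256. The byte order is also reversed:
    # inputs are reversed before hashing and the result is reversed back.
    sha256d = lambda x: hashlib.sha256(hashlib.sha256(x).digest()).digest()
    hash_pair = lambda x, y: sha256d(x[::-1] + y[::-1])[::-1]

    if len(lst) == 1:
        return lst[0]

    # If the number of hashes is odd, the last one is duplicated.
    # See the comment on this step in the Bitcoin Core source:
    # https://github.com/bitcoin/bitcoin/blob/master/src/consensus/merkle.cpp#L9
    if len(lst) % 2 == 1:
        lst = lst + [lst[-1]]  # build a new list so the caller's list is untouched

    return merkle_root([
        hash_pair(x, y) for x, y in zip(*[iter(lst)] * 2)
    ])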

Hash trees have many uses besides blockchains. They are used in file systems to verify file integrity, in distributed databases to synchronize replicas quickly, and in key management for reliable logging of certificate issuance. Git uses a generalization of hash trees: hash-based directed acyclic graphs. In blockchains, hash trees are used for performance reasons, since they make possible “light clients” that process only a small part of the blockchain's transactions.
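What makes light clients possible is that a hash tree lets you prove one item is included without seeing the others: the proof is just the sibling hashes on the path to the root. The sketch below is a simplified illustration. It uses single SHA-256 with plain concatenation, not Bitcoin's double hashing and byte order, and the function names are made up for this example.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = list(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        sibling = index ^ 1          # the neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf, proof, root):
    """Recompute the path from a leaf to the root using only the sibling hashes."""
    current = leaf
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root


leaves = [h(("tx%d" % i).encode()) for i in range(5)]
root, proof = merkle_root_and_proof(leaves, 3)
print(len(proof), verify_proof(leaves[3], proof, root))  # 3 True
```

The proof grows with the logarithm of the number of transactions, so a client that knows only the root can check inclusion without downloading the other transactions.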

Digital signatures

Digital signatures in blockchains are based on public-key cryptography, which uses two keys. The first, the private key, is needed to create digital signatures and is kept secret. The second, the public key, is used to verify signatures. Computing the public key from the private key is easy, but the inverse transformation requires an amount of computation comparable to brute force, which is infeasible in practice.

There are many public-key cryptography schemes. The two most popular are schemes based on integer factorization (RSA) and schemes based on elliptic curves. The latter are more popular in blockchains because of their smaller keys and signatures. For example, Bitcoin uses the ECDSA elliptic-curve signature standard with the secp256k1 curve. There, the private key is 32 bytes long, the public key is 33 bytes (in compressed form), and the signature is about 70 bytes.
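The relationship between the two keys, and the sizes quoted above, can be sketched in pure Python. This is a minimal illustration that assumes the published secp256k1 parameters. The helper names point_add, scalar_mult and compress are made up for this example, and the code is neither constant-time nor suitable for real keys.

```python
import secrets

# Published secp256k1 domain parameters: curve y^2 = x^3 + 7 over the field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def compress(point):
    """Encode a point as 33 bytes: a parity prefix plus the x coordinate."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


private_key = secrets.randbelow(N - 1) + 1   # random integer in [1, N - 1]
public_key = scalar_mult(private_key, G)     # easy direction: private -> public
print(len(private_key.to_bytes(32, "big")))  # 32
print(len(compress(public_key)))             # 33
```

Going from private_key to public_key takes a few hundred point operations, while recovering private_key from public_key is the elliptic-curve discrete logarithm problem, for which no practical method is known.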

The general idea of public-key signatures is as follows. Suppose Alice wants to transfer one bitcoin to Bob. To do this, she creates a transaction in which she records where the bitcoin should be taken from (a reference to the previous transaction in which Alice received it from someone else) and to whom it should be sent (Bob's public key). Alice learns Bob's public key through an outside channel: Bob can send it to Alice via an instant messenger or even publish it on a website.

Alice then signs the transaction with her private key. Any node in the Bitcoin network can verify that the transaction was signed with the private key matching a specific public key (authentication), and that one bitcoin was previously associated with that public key (authorization). If these conditions are met, the transferred bitcoin becomes associated with Bob's public key.
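The sign-and-verify step can be sketched with a toy ECDSA implementation over secp256k1. This is an illustration under stated assumptions, not Bitcoin's actual procedure: the "transaction" is an arbitrary byte string, it is hashed with a single SHA-256, the nonce is simply drawn at random, and the code is not constant-time. All function names are made up for this example.

```python
import hashlib
import secrets

# Published secp256k1 domain parameters: curve y^2 = x^3 + 7 over the field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key, message):
    """Return an ECDSA signature (r, s) for the message."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, N - 2, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message, signature):
    """Check an ECDSA signature using only the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, N - 2, N)  # modular inverse of s (N is prime)
    u1, u2 = message_hash(message) * w % N, r * w % N
    point = point_add(scalar_mult(u1, G), scalar_mult(u2, public_key))
    return point is not None and point[0] % N == r


alice_private = secrets.randbelow(N - 1) + 1
alice_public = scalar_mult(alice_private, G)
transaction = b"spend: <previous tx>; pay to: <Bob's public key>"
signature = sign(alice_private, transaction)

print(verify(alice_public, transaction, signature))               # True
print(verify(alice_public, transaction + b" x1000", signature))   # False
```

Verification needs only Alice's public key, the message and the signature, which is why any node can perform it. Changing the message in any way invalidates the signature.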

Since the blockchain has no central node that can authorize arbitrary transactions, the security of the system is decentralized, and the probability of successfully interfering with the operation of the blockchain is reduced to almost zero.

Thus, the blockchain uses digital signatures to authenticate transactions (and sometimes blocks) and to ensure their integrity. In the case of a cryptocurrency, authentication means that funds can be spent only by the person to whom an earlier transaction sent them. A distinctive feature of the blockchain is that the authentication information is “embedded” in each transaction rather than separated from the business logic, which is why the blockchain is considered more secure. In a conventional system, the authentication mechanism can be hacked or bypassed administratively and the backend manipulated; in a blockchain, this is ruled out by design.

P.S. In upcoming posts we plan to cover smart contracts and consensus algorithms, and to discuss what the spread of quantum computers will mean for blockchains.

PPS Some additional sources:

Our English blog

Source: https://habr.com/ru/post/327272/
