TON: Telegram Open Network. Part 2: Blockchains, Sharding


This article continues a series in which I look at the design of the Telegram Open Network (TON), a distributed network that is (supposedly) due to be released this year. In the previous part I described its most basic level: the way the nodes interact with each other.

Just in case, let me remind you that I have nothing to do with the development of this network, and all the material is drawn from an open (albeit unverified) source: a document (there is also an accompanying brochure summarizing the main points) that appeared late last year. The amount of information in this document, in my opinion, speaks for its authenticity, although there is no official confirmation of this.

Today we will look at the main component of TON: the blockchain.

Basic concepts

Account. A set of data identified by a 256-bit number, account_id (most often the account owner’s public key). In the base case (see the zero workchain below), this data is the user's balance. Anyone can “occupy” a specific account_id, but its value can be changed only according to certain rules.

Smart contract. In essence, this is a special case of an account, supplemented by the smart contract's code and storage for its variables. With a “wallet”, money can be credited to it and debited from it according to relatively simple, predefined rules; with a smart contract, these rules are written as its code (in some Turing-complete programming language).

State of the blockchain. The set of states of all accounts and smart contracts (in the abstract sense, a hash table in which the keys are account identifiers and the values are the data stored in the accounts).

Message. Above, I used the expression “crediting and debiting money”; this is a particular example of a message (“transfer N grams from account_1 to account_2”). Obviously, only a node that holds the private key of account_1 can send such a message, and it can prove this with a signature. Delivering such a message to a regular account increases its balance, while a smart contract executes its code (which handles the incoming message). Of course, other messages are possible too (carrying not monetary amounts but arbitrary data between smart contracts).

Transaction. The fact of a message being delivered is called a transaction. Transactions change the state of the blockchain. It is transactions (records of message delivery) that make up the blocks of the blockchain. In this sense, the state of the blockchain can be thought of as an incremental database: all blocks are “diffs” that must be applied sequentially to get the current state of the database. The specifics of how these “diffs” are packaged (and how the full state is restored from them) will be discussed in the next article.
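
To make the “blocks as diffs” idea concrete, here is a minimal sketch in Python. It assumes a drastically simplified model in which the state is just a table of balances and every transaction is a plain transfer; the names Transfer, apply_block and replay are mine and do not come from the TON document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """A delivered message: move `amount` grams from `src` to `dst` (hypothetical model)."""
    src: str
    dst: str
    amount: int


def apply_block(state: dict[str, int], block: list[Transfer]) -> dict[str, int]:
    """Return a new state with every transaction of the block applied in order."""
    new_state = dict(state)
    for tx in block:
        if new_state.get(tx.src, 0) < tx.amount:
            raise ValueError(f"insufficient balance on {tx.src}")
        new_state[tx.src] = new_state.get(tx.src, 0) - tx.amount
        new_state[tx.dst] = new_state.get(tx.dst, 0) + tx.amount
    return new_state


def replay(genesis: dict[str, int], blocks: list[list[Transfer]]) -> dict[str, int]:
    """Rebuild the current state by applying all blocks ("diffs") sequentially."""
    state = dict(genesis)
    for block in blocks:
        state = apply_block(state, block)
    return state


genesis = {"A": 100, "B": 0, "C": 0}
blocks = [
    [Transfer("A", "B", 30)],
    [Transfer("B", "C", 10), Transfer("A", "C", 5)],
]
final_state = replay(genesis, blocks)
print(final_state)  # {'A': 65, 'B': 20, 'C': 15}
```

The point of the sketch is only that the current state is never stored in the blocks themselves: it is whatever you get after replaying all of them from the beginning.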

Blockchain in TON: what is it and why?

As mentioned in the previous article, a blockchain is a data structure whose elements (blocks) are arranged in a “chain”, with each next block containing the hash of the previous one. A question came up in the comments: why do we need such a data structure when we already have a DHT, a distributed hash table? Some data can indeed be stored in the DHT, but this is only suitable for information that is not too “sensitive”. Cryptocurrency balances cannot be stored in a DHT, primarily because it lacks integrity checks. In fact, all the complexity of the blockchain structure exists to prevent tampering with the data stored in it.
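
The “each block contains the hash of the previous one” property, and the integrity check it gives, can be sketched as follows. SHA-256 over a JSON encoding is my assumption for illustration only; this is not a description of the hash function or block format that TON actually uses, and the function names are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block deterministically (SHA-256 over sorted JSON, an illustrative choice)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, payload: str) -> None:
    """Append a block that stores the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "payload": payload})


def is_consistent(chain: list) -> bool:
    """Check that every block really references the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
for payload in ["A pays B 30", "B pays C 10", "A pays C 5"]:
    append_block(chain, payload)

print(is_consistent(chain))  # True

# Tampering with an old block breaks the link stored in the next block.
chain[0]["payload"] = "A pays B 3000"
print(is_consistent(chain))  # False
```

A plain hash table has no such links between records, which is why silently replacing one value in it goes unnoticed.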

However, the blockchain in TON looks even more complicated than in most other distributed systems, and there are two reasons for this. The first is the desire to minimize the need for forks. In traditional cryptocurrencies, all parameters are set at the initial stage, and any attempt to change them effectively leads to the appearance of an “alternative ~~universe~~ cryptocurrency”. The second reason is support for sharding (splitting) of the blockchain. A blockchain is a structure that cannot become smaller over time, and usually every node responsible for keeping the network running is forced to store it in full. In traditional (centralized) systems, sharding is used to solve such problems: some records of the database live on one server, some on another, and so on. In cryptocurrencies such functionality is still quite rare, in particular because it is difficult to add sharding to a system where it was not planned from the start.

So how does TON plan to solve both of the above problems?

Blockchain contents. Workchains


First of all, let's talk about what is planned to be stored in the blockchain. It will hold the state of accounts (“wallets” in the base case) and smart contracts (for simplicity, we will treat these as the same thing as accounts). In essence, this will be a regular hash table: the keys in it will be account_id identifiers, and the values will be data structures containing such things as:

balance;
smart contract code (only for smart contracts);
smart contract data storage (only for smart contracts);
statistics;
(optional) public key for transfers from the account, account_id by default;
the queue of outgoing messages (they are recorded here for forwarding to the recipient);
list of recent messages delivered to this account.
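
The list above maps naturally onto a record type. Below is a hedged sketch of how one value of that hash table might look; every field name here is invented for illustration, and the real layout in TON is certainly different.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AccountState:
    """One value of the state hash table (all field names are hypothetical)."""
    balance: int = 0
    code: Optional[bytes] = None          # only for smart contracts
    data: Optional[bytes] = None          # only for smart contracts
    stats: dict = field(default_factory=dict)
    public_key: Optional[bytes] = None    # optional; falls back to account_id
    outbound_queue: list = field(default_factory=list)   # messages awaiting forwarding
    recent_inbound: list = field(default_factory=list)   # recently delivered messages

    def is_smart_contract(self) -> bool:
        return self.code is not None

    def signing_key(self, account_id: bytes) -> bytes:
        """Key used to verify transfers: the explicit one, or account_id by default."""
        return self.public_key if self.public_key is not None else account_id


# The state of the blockchain: account_id (256 bits = 32 bytes) -> AccountState.
state = {}
wallet_id = bytes(32)  # placeholder identifier, not a real key
state[wallet_id] = AccountState(balance=100)

print(state[wallet_id].is_smart_contract())                # False
print(state[wallet_id].signing_key(wallet_id) == wallet_id)  # True
```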

As mentioned above, the blocks themselves consist of transactions: messages delivered to various account_id accounts. However, besides account_id, messages also contain a 32-bit field workchain_id, the identifier of a so-called workchain (working blockchain). This makes it possible to have several independent blockchains with different configurations. Here workchain_id = 0 is a special case, the zero workchain: it is the balances in it that correspond to the TON cryptocurrency (Grams). Most likely, at first the other workchains will not exist at all.

Shardchains. Infinite Sharding Paradigm.

But the growth in the number of blockchains does not stop there. Let's turn to sharding. Imagine that each account (account_id) is given its own blockchain, which contains all the messages that arrive at it, and that the states of all such blockchains are stored on separate nodes.

Of course, this is very wasteful: most likely, each of these shardchains (shard blockchains) would receive transactions very rarely, and a lot of powerful nodes would be needed (looking ahead, I’ll note that we are talking not about clients on mobile phones but about serious servers).

Therefore, shardchains group accounts by the binary prefixes of their identifiers: if a shardchain has the prefix 0110, then transactions for every account_id that starts with these bits will fall into it. This shard_prefix can be from 0 to 60 bits long and, most importantly, it can change dynamically.
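
The prefix rule described above is easy to express in code. In this sketch account_id is a 256-bit integer and shard_prefix is a string of binary digits; both representations, and the function name in_shard, are my own choices for illustration.

```python
ACCOUNT_ID_BITS = 256
MAX_PREFIX_BITS = 60


def in_shard(account_id: int, shard_prefix: str) -> bool:
    """Return True if the binary form of account_id starts with shard_prefix."""
    if len(shard_prefix) > MAX_PREFIX_BITS or set(shard_prefix) - {"0", "1"}:
        raise ValueError("shard_prefix must consist of 0 to 60 binary digits")
    if not 0 <= account_id < 2 ** ACCOUNT_ID_BITS:
        raise ValueError("account_id must fit into 256 bits")
    bits = format(account_id, f"0{ACCOUNT_ID_BITS}b")  # zero-padded to 256 digits
    return bits.startswith(shard_prefix)


# An identifier whose four most significant bits are 0110.
account_id = (0b0110 << 252) | 0xC0FFEE

print(in_shard(account_id, "0110"))  # True
print(in_shard(account_id, "0111"))  # False
print(in_shard(account_id, ""))      # True: the empty prefix covers every account
```

An empty prefix corresponds to a single shardchain that holds all accounts of its workchain.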


As soon as one of the shardchains begins to receive too many transactions, the nodes working on it “split” it into two children according to predetermined rules. Their prefixes will be one bit longer (for one of them this bit will be 0, and for the other, 1). For example, shard_prefix = 0110b splits into 01100b and 01101b. In turn, if two “neighboring” shardchains stay lightly loaded for some time, they merge again.
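
Splitting and merging then come down to adding or dropping one trailing bit. The sketch below covers only this prefix arithmetic; deciding when a shard is overloaded or idle is left out, since the text gives no concrete thresholds. Prefixes are again strings of binary digits, and the function names are hypothetical.

```python
MAX_PREFIX_BITS = 60


def split(prefix: str) -> tuple:
    """Split a shard into two children whose prefixes are one bit longer."""
    if len(prefix) >= MAX_PREFIX_BITS:
        raise ValueError("prefix is already at the maximum length")
    return prefix + "0", prefix + "1"


def merge(left: str, right: str) -> str:
    """Merge two neighboring shards (same parent, last bit differs) into their parent."""
    are_siblings = (
        len(left) == len(right)
        and len(left) > 0
        and left[:-1] == right[:-1]
        and left[-1] != right[-1]
    )
    if not are_siblings:
        raise ValueError("only sibling shards can be merged")
    return left[:-1]


children = split("0110")
print(children)           # ('01100', '01101')
print(merge(*children))   # 0110
```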

Thus, sharding is done “from the bottom up”: we assume that each account has its own shard, but for the time being they are “glued together” by prefixes. This is what the name Infinite Sharding Paradigm refers to.

Separately, I would like to emphasize that workchains exist only virtually: in fact, workchain_id is part of the identifier of a particular shardchain. In formal terms, each shardchain is defined by a pair of numbers (workchain_id, shard_prefix).

Error correction. Vertical blockchains.

Traditionally, it is believed that any transaction in a blockchain is “carved in stone”. In TON, however, it is possible to “rewrite history” if someone (a so-called “fisherman” node) proves that one of the blocks was signed incorrectly. In this case, a special correction block is added to the corresponding shardchain; it contains the hash of the block being corrected (and not of the last block in the shardchain). If we picture a shardchain as a chain of blocks laid out horizontally, the correction block is attached to the erroneous block not on the right but from above, so it is considered to become part of a small “vertical blockchain”. Thus, shardchains can be said to be two-dimensional blockchains.

Vertical blockchain

If blocks that follow the erroneous one relied on its changes (i.e., new transactions were made on the basis of invalid ones), correction blocks are added “on top” of those blocks as well. If later blocks did not touch the “affected” information, these “corrective waves” do not reach them. For example, in the illustration above, the transaction in the first block that increased the balance of account C was recognized as incorrect; therefore, the transaction in the third block that reduces the balance of this account must be canceled too, and a correction block must be placed on top of that block as well.

It should be noted that although the correction blocks are depicted as located “above” the original ones, in fact they will be added to the end of the corresponding blockchain (where they belong chronologically). The two-dimensional layout only shows which point in the blockchain they are “attached” to (by means of the hash of the original block stored in them).
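
A rough sketch of this mechanism: blocks are appended strictly in chronological order, and a correction block additionally carries the hash of the block it replaces. The corrects field, the SHA-256-over-JSON hashing and the helper names are my assumptions, and the sketch deliberately ignores corrections of corrections.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Illustrative block hash: SHA-256 over sorted JSON."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode("utf-8")).hexdigest()


def append(chain: list, payload: str, corrects=None) -> str:
    """Append a block at the end of the chain; `corrects` is the hash of a corrected block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    block = {"prev_hash": prev, "payload": payload, "corrects": corrects}
    chain.append(block)
    return block_hash(block)


def effective_payloads(chain: list) -> list:
    """Original blocks in order, each replaced by its latest correction if one exists."""
    corrections = {b["corrects"]: b["payload"] for b in chain if b["corrects"] is not None}
    return [
        corrections.get(block_hash(b), b["payload"])
        for b in chain
        if b["corrects"] is None
    ]


chain = []
h1 = append(chain, "C += 10 (later proven invalid)")
h2 = append(chain, "A -> B 5")
h3 = append(chain, "C -= 10")
# The corrections are stored at the end of the chain, but point "down" at blocks 1 and 3.
append(chain, "C += 10 cancelled", corrects=h1)
append(chain, "C -= 10 cancelled", corrects=h3)

print(len(chain))                 # 5
print(effective_payloads(chain))
# ['C += 10 cancelled', 'A -> B 5', 'C -= 10 cancelled']
```

The second block is untouched because it did not depend on the invalid balance of C, which matches the “corrective wave” behavior described above.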

One could philosophize separately about how good the decision to “change the past” is. It would seem that if we admit that an incorrect block can appear in a shardchain, then we must also admit that an erroneous correction block can appear. Here, as far as I can tell, the difference lies in the number of nodes that must reach consensus on new blocks: a relatively small “working group” of nodes (which often changes its composition) will work on each shardchain, whereas introducing correction blocks will require the agreement of all validator nodes. I will say more about validators, working groups and other node roles in the next article.

One blockchain to rule them all

Everything described above amounts to a lot of information about the various types of blockchains, and this information itself must also be stored somewhere. In particular, we are talking about the following:

the number and configurations of workchains;
the number of shardchains and their prefixes;
which nodes are currently responsible for which shardchains;
hashes of the last added blocks in all shardchains.

As you might have guessed, all of this is recorded in one more blockchain: the masterchain. Because its blocks contain the hashes of the blocks of all shardchains, it makes the system tightly coupled. This means that a new block in the masterchain is generated right after the blocks in the shardchains: it is expected that blocks in the shardchains will appear almost simultaneously, approximately every 5 seconds, and the next block in the masterchain will appear a second later.
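
To illustrate why storing shardchain block hashes in the masterchain couples everything together, here is a small sketch. A masterchain block is modeled as the previous masterchain hash plus the latest block hash of each shardchain, keyed by (workchain_id, shard_prefix); the structure, the hash function and the names are assumptions for illustration only.

```python
import hashlib
import json


def digest(obj) -> str:
    """Illustrative hash: SHA-256 over sorted JSON."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def make_master_block(prev_master_hash: str, shard_heads: dict) -> dict:
    """Build a masterchain block from the latest block hash of every shardchain."""
    heads = sorted([wc, prefix, h] for (wc, prefix), h in shard_heads.items())
    return {"prev_hash": prev_master_hash, "shard_heads": heads}


# Two shardchains of the zero workchain, identified by (workchain_id, shard_prefix).
shard_heads = {
    (0, "0"): digest("latest block of shard 0"),
    (0, "1"): digest("latest block of shard 1"),
}
master = make_master_block("0" * 64, shard_heads)

# Replacing a block in any single shardchain changes the masterchain block hash.
forged_heads = dict(shard_heads)
forged_heads[(0, "1")] = digest("forged block of shard 1")
forged_master = make_master_block("0" * 64, forged_heads)

print(digest(master) != digest(forged_master))  # True
```

So a node that trusts the latest masterchain block hash can, in this simplified model, detect a substituted block in any shardchain.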

But who will be responsible for carrying out all this titanic work: sending messages, executing smart contracts, forming blocks in the shardchains and the masterchain, and even checking blocks for errors? Will all of this be done covertly by the phones of millions of users with the Telegram client installed? Or perhaps the Durovs' team will abandon the idea of decentralization and their servers will do it the old-fashioned way?

In fact, neither answer is correct. But the margins of this article are rapidly running out, so the conversation about the different roles of nodes (you may have already noticed some of them being mentioned), as well as the mechanics of their work, will continue in the next article.

Source: https://habr.com/ru/post/354568/
