For two weeks already, Runet has been making noises about Telegram and the situation with its senseless and merciless blocking by Roskomnadzor. The rebound has touched many, but all of these are topics for posts on Geektimes. I was surprised by something else - I still haven’t seen a single parsing of the planned TON - Telegram Open Network based on the Telegram network on Habré. I wanted to fill this shortcoming, because there is something to study there - even despite the absence of official statements about it.
Let me remind you - it is rumored that Telegram launched a very large-scale closed ICO, having already collected incredible amounts in it. It is assumed that already this year Gram will launch its own cryptocurrency - and each Telegram user will automatically have a wallet, which in itself creates a considerable advantage over the other cryptocurrencies.
Unfortunately, since there are no official statements, then I can only build on a document of unknown origin , which I immediately warn you about. Of course, it can turn out to be a very skillful fake, but it is also possible that this is the real white paper of the future system, written by Nikolay Durov (and probably merged by some of the investors). But even if it is a fake, no one will forbid us to study and discuss it, right?
What does this document say? I will try to retell it in my own words, close to the text, but in Russian and a little more humane (forgive me, Nikolai with his inclination to go into formal mathematics). Keep in mind that even if it is authentic, this is a rough description of the system and it is very likely to change by the time of public launch.
We learn that in addition to cryptocurrency, there is still a lot of things assumed. Let's sort it out in order.
This is the first part, describing the "down to earth" level of TON - its network part, built on top of traditional protocols. In the next part we will talk about the "pulp" - the blockchain, which will be supported by the system described below. Thus, my order of retelling is somewhat different from that used in the aforementioned document (which starts immediately from the abstract level).
TL (Type Language). This is an abstract binary format for arbitrary data structures. It is used in the Telegram protocol and will be actively used in TON. If you want to get acquainted with it in detail - here is its description .
Hash . A function that performs an irreversible transformation of an arbitrary data structure into a single number of fixed length. Within the framework of the documentation, we are talking about the function SHA-256 .
Network node ( node ). A node is software that will provide system operation. In particular, it is assumed that each Telegram client application will include a TON node. At a low level, the nodes have IPv4 / IPv6 addresses and communicate using the UDP protocol, at a higher level they have abstract addresses and implement the ADNL protocol (about abstract addresses and ADNL - see below). When it comes to the fact that some parts of the system do something or store some data, it is implied that the network nodes do it.
Abstract address (or just the address ). The node address is determined by its public key. More strictly, this is a 256-bit hash (SHA256) from a data structure containing a public key (the specific cryptographic algorithm is not specified; the elliptic curves and RSA-2048 are given as an example). In order for one node to interact with another, it needs to know not only the address of that, but also this data structure. Theoretically, a single physical node can create any number of addresses (corresponding to different keys).
Further, such a bundle is often used: a “preimage” in the form of a TL structure (containing almost any data), and a 256-bit hash from it, used for addressing.
Blockchain A blockchain is a data structure whose elements ( blocks ) are arranged in a “chain”, and each next block of the chain contains the hash of the previous one. Thus integrity is achieved - changes can be made only by adding new blocks.
Service ( service ). Services within TON can be of different types - depending on whether they use the blockchain or not. For example, one (or many) of the network nodes can process some RPC requests using the ADNL protocol described below without creating any entries in the blockchain - like traditional web servers. This includes the possibility of implementing HTTP over ADNL, as well as the transition of the messenger itself to this protocol. By analogy with TOR or I2P, this will make it more resistant to various locks.
At the same time, a number of services include interaction with the blockchain, and processing requests outside of it. For example, for TON Storage - file storage - it is not very reasonable to store the files themselves in the blockchain. It will contain only file hashes (along with some meta-information about them), and specialized “network servers” ready to give them to other nodes via ADNL will act as “file servers”.
Fog service . We are talking about some services that involve decentralization and open participation in them. For example, TON Proxy is a service that can be supported by any participant who wants to provide his or her node as an intermediary (proxy) that sends packets between other nodes. If desired, he can charge for it the fee set by him - using the TON Payments system for micropayments (which, in turn, is also a foggy service).
At the lowest level, the interaction between the nodes will be made over the UDP protocol (although other options are acceptable).
As mentioned above, in order for one node to send a packet to another, it must know one of its public keys (and, therefore, the address that it defines). It encrypts the packet with this key and adds a 256-bit recipient address to the beginning of the packet — since one node can have several such addresses, this will allow it to determine which key to use for decryption.
In addition, instead of the recipient's address at the beginning of the data packet can be so-called. channel id In this case, packet processing is already dependent on specific agreements between the nodes — for example, the data sent to a certain channel can be assigned to another node and must be forwarded to it (this is the TON Proxy service). Another special case may be the interaction directly between the nodes, but with encryption on an individual key pair for this channel (previously formed according to the Diffie-Hellman protocol).
Finally, a special case is the “null” channel — if the node does not yet know the public keys of its “neighbors,” it can send them packets without encryption at all. This is intended only for initialization - as soon as nodes send information about their keys, they should be used for further interaction.
The above protocol (256 bits of channel identifier + packet contents) is called ADNL. The documentation mentions the possibility of implementing a TCP analogue on top of it or its own add-on - RLDP (Reliable Large Datagram Protocol), but does not go into details about their implementation.
As in the case of other distributed systems, TON involves the implementation of DHT - a distributed hash table . More specifically, the table is Kademlia-like . If you are not familiar with this kind of hash tables - do not worry, then I will describe approximately how they work.
In an abstract sense, DHT assigns some binary values ​​of arbitrary length to 256-bit keys. In this case, the keys in the table are hashes from a certain TL structure (the structures themselves are also stored with the DHT). This is very similar to the formation of node addresses - and they may indeed be present in the DHT (for example, the IP address of the node corresponding to a given abstract address , if it does not hide it, can be located on this key). But in general, “key types” (their descriptions , key descriptions ) are metadata that indicate the “owner” of the entry in the hash table (that is, the public key of a node), the type of stored value and the rules by which this entry may change afterwards. For example, a rule may allow changing the value only to the owner - or prohibiting a change in the value to the lower side (to protect against replay attacks).
In addition to the 256-bit keys, the concept of DHT addresses is introduced. The difference with the usual node addresses is that the DHT address is necessarily bound to the IP address. If the host does not hide its IP, it can use the normal address for DHT. But more often for the needs of DHT will have a separate, "semi-permanent" address.
Above the keys and DHT addresses, the concept of distance is introduced - this is where everything coincides with the Kademlia tables - the distance between the keys is equal to XOR (bitwise exclusive OR) from them. As in the Kademlia tables, the value corresponding to a certain key should be stored on s nodes that have the shortest distance to this key ( s here is a relatively small number).
In order for the DHT node to interact with other such nodes, it keeps in memory the routing table DHT - DHT- and the IP addresses of the nodes with which it interacted before, grouped by distance from them. There are 256 such groups (they correspond to the highest bit set in the distance value - that is, nodes at a distance from 0 to 255 fall into one group, from 256 to 65535 - into the next, etc.). Within each group there is a limited number of “best” nodes (in terms of ping before them).
Each node must support several operations: saving the value for the key , searching for nodes and searching for values . The search for nodes involves issuing, according to a given key, the nodes closest to it from the routing table; the search for values ​​is the same, except for the situation when the node knows the value for the key (then it simply returns it). Accordingly, if a node wants to find a value in a DHT by key, it sends requests to a small number of nodes closest to that key from its routing table. If among their answers there is no desired value, but there are other addresses of nodes, then the request is repeated already to them.
TON DHT can be used for various purposes, for example - to implement torrent-like file storage (see TON Storage ); to determine the addresses of nodes implementing certain services; to store information about account holders in the blockchain. But the most important application is the discovery of nodes by their abstract addresses. For this, the address is used as the key whose value is to be found. As a result of the request, either the node itself will be found (if the address sought was its half-permanent DHT address), or the value is the IP address and port for connection - or another address that should be used as an intermediary tunnel.
The ADNL protocol described above implies the ability of any nodes to exchange information with each other - though not necessarily the best way. We can say that thanks to ADNL all nodes form a global TON graph (ideally, connected). But the possibility of creating overlay networks — subgraphs within this graph, is additionally provided.
Within such a network, interaction is performed only directly - through pre-formed links between the nodes participating in the network (via the ADNL channels described above). The formation of such connections between neighbors, the search for the neighbors themselves is an automatic process that seeks to preserve the connectivity of the overlay network and minimize delays in the exchange of data in it.
In addition, there is a way to quickly distribute large broadcast updates within the network - they are broken into pieces, complemented by an error correction code, and all these pieces are sent from one participant to another. Thus, it is not necessary for the participant to fully receive all the parts before sending them further down the network.
Overlay networks can be public and private. It’s easy to become a member of a public network - you need to find a TL structure that describes it (it can be public - or it can be accessed with a certain key in the DHT). In the case of a private network, this structure must be known to the node in advance.
I decided to split the TON review into several articles. At this point, this part ends, and in the next I turn to the consideration of the structure of the blockchain (more precisely, blockchains), of which TON will consist.
Source: https://habr.com/ru/post/354366/