
Part 1. Where to store data for decentralized applications on the blockchain?

Blockchain projects are booming. Some blockchains are powerful enough to serve as a platform for writing applications. Such applications are automatically decentralized and resistant to censorship and blocking. But is everything really that good and simple? In this article we will try to look at the blockchain as an application platform without rose-colored glasses.

So what is a blockchain?

A blockchain is an immutable data structure consisting of a list of blocks, where each block contains a hash of the previous block. This hashing is what makes the chain immutable: you cannot change or remove a block in the middle of the chain, because even the slightest change requires rebuilding (recomputing the hashes of) all the blocks that follow it.
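The hash linking described above can be illustrated with a minimal Python sketch. It is not how any particular blockchain serializes or hashes its blocks: the block layout (`prev_hash`, `data`, `hash`), the JSON encoding, and all function names are assumptions made for illustration.

```python
import hashlib
import json
from typing import Optional

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(block: dict) -> str:
    """Hash the block's data together with the previous block's hash."""
    payload = json.dumps(
        {"prev_hash": block["prev_hash"], "data": block["data"]}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(items: list) -> list:
    """Build a chain in which every block stores its predecessor's hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for data in items:
        block = {"prev_hash": prev_hash, "data": data}
        block["hash"] = block_hash(block)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def first_invalid_block(chain: list) -> Optional[int]:
    """Return the index of the first block that fails verification, or None."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["prev_hash"] != prev_hash or block["hash"] != block_hash(block):
            return index
        prev_hash = block["hash"]
    return None


chain = build_chain(["a", "b", "c", "d"])
intact = first_invalid_block(chain)        # no block fails

chain[1]["data"] = "tampered"              # change a block in the middle
broken_at = first_invalid_block(chain)     # block 1 no longer matches its hash

chain[1]["hash"] = block_hash(chain[1])    # "repair" only the changed block
after_rehash = first_invalid_block(chain)  # now block 2 points at a stale hash
```

Recomputing the hash of the changed block only moves the inconsistency one block further, which is why every later block would have to be rebuilt as well.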

If we additionally make computing each block's hash a computationally or economically expensive operation, then changing data in the middle of the chain becomes practically impossible. The combination of a hash that is hard to compute for a new block but easy to verify gives the blockchain strong resistance to unauthorized changes. This is what keeps Bitcoin and other blockchains safe.
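The asymmetry between computing and verifying such a hash can be sketched with a toy proof-of-work in Python. This is a simplification under stated assumptions: the difficulty rule (a required number of leading zero hex digits), the `|`-separated input format, and the function names are invented for this example and do not reproduce Bitcoin's actual block format or target encoding.

```python
import hashlib


def pow_hash(prev_hash: str, data: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    return hashlib.sha256(f"{prev_hash}|{data}|{nonce}".encode("utf-8")).hexdigest()


def mine(prev_hash: str, data: str, difficulty: int) -> int:
    """Expensive part: try nonces until the hash has `difficulty` leading zeros."""
    target_prefix = "0" * difficulty
    nonce = 0
    while not pow_hash(prev_hash, data, nonce).startswith(target_prefix):
        nonce += 1
    return nonce


def verify(prev_hash: str, data: str, nonce: int, difficulty: int) -> bool:
    """Cheap part: a single hash computation checks the miner's work."""
    return pow_hash(prev_hash, data, nonce).startswith("0" * difficulty)


PREV_HASH = "0" * 64
DATA = "some transactions"
DIFFICULTY = 4  # about 16**4 = 65536 attempts on average in this toy setup

nonce = mine(PREV_HASH, DATA, DIFFICULTY)
accepted = verify(PREV_HASH, DATA, nonce, DIFFICULTY)
```

Each extra zero digit in `DIFFICULTY` multiplies the expected mining work by 16, while verification stays a single hash call.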
Thanks to this property, blockchain projects can be publicly decentralized: anyone can set up a working blockchain node and generate new blocks. In most blockchain implementations, a reward is given for generating a block; this process is called mining. Since mining is expensive and its results are easy to verify, it only pays to act honestly. Otherwise you spend resources on mining, the other miners reject your block, and all the work is wasted. Thus, despite complete decentralization and the independence of individual nodes, the blockchain network works as a single whole.

All right, a single dishonest miner is easy to detect and ignore. But what if there are many of them and they collude? Imagine that everyone around you insists that a red light is green :) and looks at you strangely for thinking otherwise. Social experiments show that most people in such a situation begin to doubt themselves and join the majority opinion. And the blockchain runs on majority rule!

Leslie Lamport named this problem of establishing the truth when your interlocutors may lie shamelessly "The Byzantine Generals Problem" in 1982; he had solved it two years earlier, in 1980, together with co-authors. They showed that with n spies who can lie and distort information, consensus among the participants can be reached when the total number of participants is at least 3n+1. And if spies cannot distort the messages relayed through them, then 2n+1 participants are sufficient. In a blockchain, digital signatures prevent malicious nodes from distorting information, so the network remains stable as long as fewer than half of its nodes are malicious.
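The two bounds quoted above are easy to turn into arithmetic. The following Python sketch simply encodes the 3n+1 and 2n+1 figures as stated in this article; the function names are hypothetical, and it does not implement any consensus protocol.

```python
def min_participants(traitors: int, signed_messages: bool) -> int:
    """Participants needed for `traitors` liars, per the bounds quoted above."""
    factor = 2 if signed_messages else 3
    return factor * traitors + 1


def max_tolerated_traitors(participants: int, signed_messages: bool) -> int:
    """Largest n such that factor * n + 1 <= participants."""
    factor = 2 if signed_messages else 3
    return (participants - 1) // factor


# One liar needs 4 participants without signatures and 3 with them.
unsigned_needed = min_participants(1, signed_messages=False)
signed_needed = min_participants(1, signed_messages=True)

# A network of 10 nodes tolerates 3 liars without signatures and 4 with them.
unsigned_tolerated = max_tolerated_traitors(10, signed_messages=False)
signed_tolerated = max_tolerated_traitors(10, signed_messages=True)
```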

The resilience of a network to malicious nodes is called Byzantine Fault Tolerance (BFT). BFT is very important for public networks that arbitrary nodes can join freely, and most blockchain projects are such systems.

The use of the blockchain is not limited to cryptocurrency. A block can contain anything. In Bitcoin, a block records a list of new transactions, which is how cryptocurrency is transferred between its owners. NameCoin stores arbitrary key-value pairs in its blocks, which can be used to build a decentralized DNS. Other blockchain implementations have features of their own. But Ethereum went much further. It lets you store in the blockchain not only transactions but also full-fledged Turing-complete programs, called smart contracts, which let you tailor the blockchain to an application's task. For example, an analogue of NameCoin can be implemented on Ethereum in 5 lines of code.
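To give a feel for how little logic a NameCoin-like registry needs, here is a sketch in plain Python. It is not a smart contract and not the code referred to above: the class name, the method names, and the first-come-first-served rule are assumptions, and an ordinary dictionary stands in for contract storage.

```python
from typing import Optional


class NameRegistry:
    """Toy first-come-first-served name registry (a dict stands in for storage)."""

    def __init__(self) -> None:
        self._storage: dict = {}

    def register(self, name: str, value: str) -> bool:
        """Store the value only if the name is not taken yet."""
        if name in self._storage:
            return False
        self._storage[name] = value
        return True

    def lookup(self, name: str) -> Optional[str]:
        """Return the registered value, or None for an unknown name."""
        return self._storage.get(name)


registry = NameRegistry()
first = registry.register("example.bit", "1.2.3.4")   # name is free
second = registry.register("example.bit", "5.6.7.8")  # name is already taken
resolved = registry.lookup("example.bit")
```

On a real blockchain the same check would run inside a contract, and each successful `register` call would be a transaction recorded in a block.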

Ethereum was conceived as a universal platform for building decentralized projects on the blockchain. Why re-implement an entire blockchain and deploy your own infrastructure if a couple of smart contracts on Ethereum can do what you need, for example an analogue of NameCoin? This is why Ethereum has recently been growing rapidly. Since March 2017, ETH (the Ethereum cryptocurrency) has risen in price fivefold in just two months, and the growth continues. Hundreds of applications already run on Ethereum, for example the AKASHA social network, the Ethlance freelance exchange, a word game, and many more!

A blockchain with smart contracts provides the entire infrastructure for applications. An application's code is executed on the blockchain in smart contracts. An application can store any information in the blockchain by passing it to its smart contracts as data, and it can read this information back, because the state of the Ethereum blockchain is, in essence, a key-value database.

What more could you want, it would seem? The applications are truly decentralized and cannot be censored or banned. The blockchain looks like pure virtue! If only it were that good... As soon as you try to build a truly powerful application, the shortcomings show up.

Immutability. Immutability is, of course, good. It is what gives the blockchain its public nature and BFT. However, there is a downside: all data that applications write to the blockchain stays there forever. You played a word game, and the blockchain remembers it. You posted something on the social network, and it is stored in the blockchain permanently, even if you later delete your profile. The explosive growth in the number of blockchain applications makes the chain of blocks swell. The full Ethereum blockchain already exceeds 130 GB, although it has been operating for less than 2 years. The Bitcoin blockchain is smaller, despite its respectable age of more than 7 years.

Of course, some Ethereum implementations include State Tree Pruning technology, which lets a node store only the latest state of the blockchain with a limited history of about a day; this currently reduces the stored data by a factor of about 20. For example, a go-ethereum full node requires 130 GB of storage for the blockchain, while Parity, which supports this technology, needs only 6 GB. However, given that the growth in the number of applications is only beginning and that every Ethereum node has to store all the data of all applications, this looks like a necessary measure that merely postpones the problem. As the blockchain grows, it will no longer fit on mass-market hard drives, and maintaining it will be affordable only for large organizations. That leads to dangerous centralization, in which a single organization controls more than 50% of the network, and this can break BFT.

Slow transactions. Blockchains pay for their properties with transaction speed. Bitcoin handles 7 transactions per second, and Ethereum 15. And this is for the entire network, because every node fully replicates the others. Adding a new node increases the stability of the system, but in no way increases its speed or the maximum amount of data it can store. So changing data (and every data change in a blockchain is a transaction) is a bottleneck. Popular applications will run into this limit immediately.
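A quick back-of-the-envelope calculation shows what these limits mean for an application. The Python sketch below uses only the throughput figures quoted above; the function names and the example workload of one million writes are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Network-wide transactions per second, as quoted in the text.
NETWORK_TPS = {"Bitcoin": 7, "Ethereum": 15}


def daily_capacity(tps: int) -> int:
    """Maximum number of transactions the whole network can record per day."""
    return tps * SECONDS_PER_DAY


def hours_to_process(tx_count: int, tps: int) -> float:
    """Time needed to record `tx_count` data changes at the given rate."""
    return tx_count / tps / 3600


def network_tps(single_node_tps: int, node_count: int) -> int:
    """Under full replication every node processes every transaction,
    so the node count does not change the network's throughput."""
    if node_count < 1:
        raise ValueError("a network needs at least one node")
    return single_node_tps


bitcoin_per_day = daily_capacity(NETWORK_TPS["Bitcoin"])
ethereum_per_day = daily_capacity(NETWORK_TPS["Ethereum"])
million_writes_hours = hours_to_process(1_000_000, NETWORK_TPS["Ethereum"])
```

With these figures, Ethereum tops out at 1,296,000 transactions per day, so a single application that needs a million writes a day would consume most of the capacity of the entire network.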

Primitive data storage. Although the state of the blockchain is a key-value database, it is a rather primitive one. Lookup is possible only by primary key, and the amount of data that can be stored is very limited. For serious applications, this is clearly not enough.
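The difference between a lookup by primary key and any other kind of search can be sketched in Python. A dictionary stands in for the blockchain state here; the keys, the record fields, and the function names are invented for this example.

```python
from typing import Optional

# A dict stands in for the key-value state; keys and records are made up.
state = {
    "user:1": {"name": "alice", "city": "Paris"},
    "user:2": {"name": "bob", "city": "Berlin"},
    "user:3": {"name": "carol", "city": "Paris"},
}


def get(key: str) -> Optional[dict]:
    """Lookup by primary key: the only query a key-value store answers directly."""
    return state.get(key)


def find_by_field(field: str, value: str) -> list:
    """Search by any other attribute: every record has to be scanned."""
    return [key for key, record in state.items() if record.get(field) == value]


alice = get("user:1")
parisians = find_by_field("city", "Paris")
```

Without secondary indexes, a query such as `find_by_field` grows linearly with the number of stored records, which is what makes this storage model primitive for larger applications.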

Thus, when developing applications on a blockchain such as Ethereum, the data storage problem is acute, and at present there are no satisfactory ways to solve it.

But existing applications, such as AKASHA, manage somehow... In the next part, we will look at the existing approaches to this problem.

→ The second part of the article
→ The third part of the article

Source: https://habr.com/ru/post/327836/

