
New Clodo Zoned Storage System

The new Clodo cloud storage architecture is designed to give customers the highest possible read and write speeds and to make storage more reliable. Below is a brief description of how it is arranged and, more importantly, why.

The storage system is one of the most important elements of a cloud. It must be independent of the compute nodes, and it must be fast, reliable, and fault tolerant. The previous version of our storage system, built on the IBM GPFS file system, proved too slow for use in the cloud. GPFS was also a single point of failure: in the worst case, its failure could take down the cloud as a whole. We have abandoned that flawed design.

The basic concept of the new Clodo storage system is the “zone”. We have divided the whole cloud into zones that are completely independent of each other. Zones are relatively small and contain up to 10 Xen nodes (the compute servers whose resources users' virtual machines consume). Each zone has its own caching, data storage, and data transmission systems, independent of those in other zones.

Zone composition


The diagram shows the architecture of a single zone.


As the diagram shows, the storage system has four levels of data caching: one in the user's virtual machine, one on the compute node, and two on the storage node.
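To make the four-level read path concrete, here is a minimal sketch in Python. It is an illustration only, not Clodo's implementation: the `LRUCache` class, the `read_block` function, the level names, and the LRU eviction policy are all assumptions made for the example. The article states only that four cache levels exist and where they sit.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny LRU cache standing in for one caching level (hypothetical)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used


def read_block(block_id, levels, disk):
    """Return (data, served_by), checking levels from nearest to farthest."""
    for i, level in enumerate(levels):
        data = level.get(block_id)
        if data is not None:
            # Copy the block into every nearer level so the next read is faster.
            for nearer in levels[:i]:
                nearer.put(block_id, data)
            return data, level.name
    # Missed every cache: read from disk and fill all levels on the way back.
    data = disk[block_id]
    for level in levels:
        level.put(block_id, data)
    return data, "disk"


# Assumed layout: one cache in the VM, one on the compute node,
# two on the storage node. Capacities are made-up numbers.
levels = [
    LRUCache("vm", 2),
    LRUCache("compute-node", 4),
    LRUCache("storage-1", 8),
    LRUCache("storage-2", 16),
]
disk = {n: f"block-{n}".encode() for n in range(100)}
```

Calling `read_block(7, levels, disk)` twice in a row serves the first read from disk and the second from the nearest cache, which is the effect the multi-level design aims for.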

The storage node architecture is chosen to optimize the most difficult operation: random writes. To keep reads fast as well, an adaptive version of readahead is used.
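The article does not describe Clodo's readahead algorithm. The sketch below shows one common form of adaptive readahead, offered as an assumption: the prefetch window doubles while reads stay sequential and resets on a random access. The class name `AdaptiveReadahead` and its parameters are hypothetical.

```python
class AdaptiveReadahead:
    """Grow the prefetch window on sequential reads, reset it on random ones."""

    def __init__(self, min_window=1, max_window=32):
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self.next_expected = None  # block id that would continue the sequence

    def on_read(self, block_id):
        """Return the list of block ids to prefetch after reading block_id."""
        if block_id == self.next_expected:
            # Sequential access: double the window, up to the cap.
            self.window = min(self.window * 2, self.max_window)
        else:
            # Random access: prefetching far ahead would waste cache space.
            self.window = self.min_window
        self.next_expected = block_id + 1
        return list(range(block_id + 1, block_id + 1 + self.window))
```

With this policy a long sequential scan quickly reaches the maximum window, while a random workload prefetches almost nothing and leaves the cache to the data actually requested.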

Storage Characteristics


The graph below shows how long the storage system can sustain a given transfer speed, plotted against that speed.



Cache parameters were chosen so that users stay in the first section of the graph for as long as possible and work with the storage system at maximum speed.

This graph shows the worst case: what customers are guaranteed to get if the zone is completely full and all of them start using the disk intensively at the same time, with the same caches. In practice, all or almost all users are in the first section of the graph.

As user activity grows, we will add memory to the caches so that customers always stay in that first section.
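The relationship between speed and sustain time can be illustrated with a toy model. It is an assumption, not Clodo's measured behavior: a write-back cache of fixed size absorbs writes at one rate while flushing to disk at another, and it fills up only if writes arrive faster than they drain. The function name and all numbers are hypothetical.

```python
import math


def sustain_seconds(cache_bytes, write_rate, drain_rate):
    """Seconds a write-back cache can absorb writes before it is full.

    cache_bytes: cache capacity in bytes
    write_rate:  incoming write speed, bytes per second
    drain_rate:  speed at which the cache is flushed to disk, bytes per second
    """
    if write_rate <= drain_rate:
        return math.inf  # the cache never fills; the speed holds indefinitely
    return cache_bytes / (write_rate - drain_rate)


# Hypothetical figures: an 8 GiB cache, 300 MiB/s of writes, 100 MiB/s of drain.
seconds = sustain_seconds(8 * 2**30, 300 * 2**20, 100 * 2**20)
```

In this model, adding cache memory lengthens the time at full speed in direct proportion, which matches the stated plan of growing the caches as user activity grows.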

Zoning benefits


Splitting the entire cloud into several zones has several advantages.

First, it allowed us to optimize several criteria at once: the speed of the storage system, its reliability, and the cost of storage for the customer. With one very large zone instead of a set of small ones, such optimization is not possible. In addition, the overhead of running a system like this grows sharply with its size; by fixing the size of a zone, we keep the system fast.

Second, zoning removes the single point of failure, since the zones are completely independent.

Third, we can balance the load on the cloud by distributing clients across zones according to their disk read and write activity, network bandwidth consumption, and RAM usage. Live migration between zones is supported.
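The kind of placement decision described above can be sketched as a weighted score over the three metrics the article names. Everything else here is an assumption for illustration: the `Zone` fields, the linear `load_score`, and the equal default weights are not taken from Clodo's balancer.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    disk_io: float  # fraction of disk read/write capacity in use, 0..1
    network: float  # fraction of network bandwidth in use, 0..1
    ram: float      # fraction of RAM in use, 0..1


def load_score(zone, weights=(1.0, 1.0, 1.0)):
    """Combine the three metrics into one number; lower means less loaded."""
    w_io, w_net, w_ram = weights
    return w_io * zone.disk_io + w_net * zone.network + w_ram * zone.ram


def pick_zone(zones, weights=(1.0, 1.0, 1.0)):
    """Choose the least loaded zone for a new or migrating virtual machine."""
    return min(zones, key=lambda z: load_score(z, weights))


# Hypothetical utilization figures for three zones.
zones = [
    Zone("zone-a", disk_io=0.9, network=0.2, ram=0.4),
    Zone("zone-b", disk_io=0.3, network=0.4, ram=0.6),
    Zone("zone-c", disk_io=0.5, network=0.5, ram=0.5),
]
target = pick_zone(zones)
```

Changing the weights shifts the priority: with the figures above, equal weights select `zone-b`, while weights of `(0, 0, 1)` consider RAM only and select `zone-a`.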

All new virtual machines are created on the new storage system. Some existing customers have already been moved to it, and the rest will be moved gradually. If you are a Clodo customer and want to switch to the new system, simply contact technical support.

In conclusion, here are graphs of disk operation speed. The charts compare Clodo with a “regular” cloud hosting service. As with the “regular” hosting, the shaper that operates under normal conditions was turned off for the duration of the test.



P.S. Anyone who wants to move to Clodo from a “regular” cloud or non-cloud hosting service can count on free, qualified support in setting up a virtual server and transferring data.

Source: https://habr.com/ru/post/119082/

