For as long as computer systems have existed, one statement has held true: processors are far faster and more expensive than data storage devices. The fact that a single CPU can serve many storage devices at once has had a significant impact on the design of hardware and software for systems of every size.
Indeed, books such as Computer Systems: A Programmer's Perspective by Randal Bryant and David O'Hallaron place heavy emphasis on the memory hierarchy and its effect on the programs we write.
However, data centers and software developers need to prepare for change. The emergence of high-speed non-volatile storage devices, commonly known by the abbreviation SCM (Storage Class Memory), is shaking these familiar foundations. SCMs are gradually gaining popularity, but keeping up with their performance (hundreds of thousands of IOPS) can require dedicating one or even several multi-core processors to a single device.
Long-term storage has always been far slower than the CPU, and the gap only widened from the early 1990s through the early 2000s: processors improved steadily, while the performance of mechanical disks stayed essentially flat, held back by physics. For decades, engineers have invented schemes and techniques to narrow this gap and keep the processor from sitting idle.
One approach is caching. In modern systems, caching happens at every level: the processor caches RAM, the operating system caches entire disk sectors, and so on.
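To illustrate the principle, independent of any particular CPU or operating system, here is a minimal sketch of a least-recently-used (LRU) cache in Python; the read_sector function and the 4 KiB sector size are hypothetical stand-ins for a slow backing store:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: keeps the most recently used items in fast memory."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # key -> cached value, ordered by recency

    def get(self, key, load_from_slow_storage):
        if key in self.items:
            self.items.move_to_end(key)       # mark as most recently used
            return self.items[key]            # served from fast memory
        value = load_from_slow_storage(key)   # expensive: hit the slow device
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used item
        return value

# Hypothetical slow read of a 4 KiB disk sector (illustration only).
def read_sector(sector_number):
    return b"\x00" * 4096

cache = LRUCache(capacity=1024)
data = cache.get(42, read_sector)  # first access is slow; repeats are fast
```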
Other techniques effectively trade processor time for capacity. Compression and deduplication, for example, shrink the data being stored: the "fast" storage effectively grows in size, but you pay for it with computational resources.
Compression remains the primary technique in enterprise storage systems and big-data environments. Tools such as Apache Parquet reorganize and compress on-disk data to reduce read times.
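A rough sketch of this trade-off, using Python's standard zlib module on synthetic, highly compressible data (the payload size and compression level are arbitrary choices for illustration):

```python
import time
import zlib

# Synthetic, highly repetitive payload: roughly 16 MiB of repeated text.
payload = (b"storage class memory " * 1024) * 768

start = time.perf_counter()
compressed = zlib.compress(payload, level=6)   # spend CPU cycles...
elapsed = time.perf_counter() - start          # ...to shrink the data

ratio = len(payload) / len(compressed)
print(f"original:   {len(payload) / 2**20:.1f} MiB")
print(f"compressed: {len(compressed) / 2**20:.2f} MiB")
print(f"ratio:      {ratio:.1f}x smaller, at a cost of {elapsed:.3f} s of CPU time")
```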
Flash storage does away with many of these shortcomings. The technology itself is not new: SAS and SATA SSDs have been on the market for about a decade. SCM, however, takes flash devices to a new level: the flash memory is attached to the PCIe bus instead of the slower SAS and SATA buses, which greatly increases data transfer speed.
Other kinds of SCM are emerging as well, such as NVDIMM. NVDIMMs come in the form factor of DIMM modules and are, in effect, hybrid memory combining DRAM with NAND flash.
Under normal conditions an NVDIMM works as ordinary DRAM, but in the event of a system crash or shutdown, the contents of the DRAM are copied to the non-volatile flash, where they can be kept indefinitely. When the machine comes back up, the data is copied back. This speeds up startup and reduces the risk of losing important data.
Today, PCIe-attached SCM can deliver a roughly 1000x performance increase (100,000 IOPS versus 100 IOPS). Unfortunately, the cost rises as well: SCM is about 25 times more expensive than conventional HDDs ($1.50/GB versus $0.06/GB), and enterprise-class devices run $3,000 to $5,000 each.
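To put the per-gigabyte figures above into concrete numbers, here is a trivial back-of-the-envelope calculation in Python (the 10 TB working set is an arbitrary illustration, not a figure from the article):

```python
# Back-of-the-envelope cost comparison for a 10 TB working set,
# using the per-GB prices quoted above.
capacity_gb = 10 * 1000          # 10 TB expressed in GB
price_scm_per_gb = 1.50          # $/GB for SCM
price_hdd_per_gb = 0.06          # $/GB for a conventional HDD

cost_scm = capacity_gb * price_scm_per_gb
cost_hdd = capacity_gb * price_hdd_per_gb

print(f"SCM: ${cost_scm:,.0f}  HDD: ${cost_hdd:,.0f}  "
      f"({cost_scm / cost_hdd:.0f}x more expensive)")
# SCM: $15,000  HDD: $600  (25x more expensive)
```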
To get the most out of expensive SCM, a storage system must keep the devices constantly busy. It turns out we cannot simply swap out the magnetic disks; both the hardware and the software will have to be reworked.
This has to be approached carefully: too many flash devices drive up cost, while too few create contention for access. Finding the right balance is not easy.
Time-sharing of resources is also worth remembering. For years, interrupts have been the mechanism by which a hard disk and a processor communicate. For a core running at gigahertz frequencies, servicing an interrupt every few milliseconds is no trouble; a single core can manage dozens or even hundreds of disks without choking. With the arrival of low-latency storage devices, however, this approach no longer works.
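To see why, compare the average time between request completions (and hence interrupts) for a single device; a quick calculation using the rough IOPS figures cited earlier:

```python
# Average time between completions (and hence interrupts) for one device.
hdd_iops = 100        # a typical magnetic disk
scm_iops = 100_000    # a PCIe SCM device, per the figures above

print(f"HDD: one completion every {1 / hdd_iops * 1000:.1f} ms")
print(f"SCM: one completion every {1 / scm_iops * 1e6:.0f} us")
# HDD: one completion every 10.0 ms
# SCM: one completion every 10 us
```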
The model needs to change seriously. Storage is not the only thing that has sped up; network devices have accelerated too, first to 10G, then 40G, then 100G. Perhaps a solution can be borrowed from that field?
There is no definite answer, because the scales differ too much: networks got about a thousand times faster, while storage got a million times faster. Moreover, storage often has to support complex features such as compression, encoding, and deduplication, so the optimization techniques used for network packets are unlikely to carry over directly.
To reduce latency, networking often hands packet processing directly to the application, bypassing the kernel. But there is a difference between networks and storage: network flows are independent and can be processed in parallel on several cores, whereas with storage, all requests have to be coordinated.
Coordinating everything through a single controller is clearly impractical: one controller cannot arbitrate access to a large number of SCM devices at once, and the hardware ends up running at a fraction of its capability. A different approach is needed.
The capacity and performance requirements of a workload rarely match the capabilities of the hardware, and this mismatch limits how efficiently high-speed drives can be used. Take, for example, a 10 TB dataset with an expected load of 500,000 IOPS stored on 1 TB SCM devices rated at 100,000 IOPS each: provisioning for performance alone (five devices) leaves no room for the data, while provisioning for capacity (ten devices) leaves half of the aggregate IOPS unused.
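A minimal sketch of that sizing arithmetic, using the numbers from the example above (the helper function and its default parameters are illustrative, not taken from any real tool):

```python
import math

def devices_needed(data_tb, load_iops, dev_tb=1.0, dev_iops=100_000):
    """Number of devices dictated by capacity vs. by performance."""
    by_capacity = math.ceil(data_tb / dev_tb)
    by_iops = math.ceil(load_iops / dev_iops)
    return by_capacity, by_iops

by_capacity, by_iops = devices_needed(data_tb=10, load_iops=500_000)
devices = max(by_capacity, by_iops)        # must satisfy both constraints
print(f"capacity needs {by_capacity} devices, performance needs {by_iops}")
print(f"buying {devices} devices leaves "
      f"{devices * 100_000 - 500_000:,} IOPS unused")
# capacity needs 10 devices, performance needs 5
# buying 10 devices leaves 500,000 IOPS unused
```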
It is also worth remembering that most data is not "hot", so storing all of it on high-speed flash is wasteful. In many cases the load follows a Pareto-like distribution: 80% of all accesses go to 20% of the data.
A hybrid system with multiple storage tiers of different performance characteristics is a good fit for mixed "cold" and "hot" data, with SCM devices acting as a cache in front of slower disks. Keep in mind, though, that access patterns change over time; the system has to notice this promptly and migrate data accordingly.
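As a rough illustration of the idea rather than any real vendor's implementation, here is a toy two-tier store in Python that counts accesses and keeps the most frequently read keys on the fast tier; the tier sizes, the promotion rule, and all names are arbitrary choices:

```python
from collections import Counter

class TieredStore:
    """Toy two-tier store: a small fast tier (SCM) in front of a slow tier (HDD)."""
    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = {}            # hot data kept on the fast tier
        self.slow = {}            # everything else stays on the slow tier
        self.hits = Counter()     # access counts used to decide what is "hot"

    def put(self, key, value):
        self.slow[key] = value    # new data lands on the slow tier first

    def get(self, key):
        self.hits[key] += 1
        if key in self.fast:
            return self.fast[key]             # fast path
        value = self.slow[key]                # slow path
        self._maybe_promote(key, value)
        return value

    def _maybe_promote(self, key, value):
        # Promote a key once it is among the most frequently accessed ones.
        hottest = {k for k, _ in self.hits.most_common(self.fast_capacity)}
        if key in hottest:
            if len(self.fast) >= self.fast_capacity:
                coldest = min(self.fast, key=lambda k: self.hits[k])
                self.slow[coldest] = self.fast.pop(coldest)   # demote
            self.fast[key] = value
            self.slow.pop(key, None)

store = TieredStore(fast_capacity=2)
for k in ["a", "b", "c"]:
    store.put(k, k.upper())
for k in ["a", "a", "b", "c", "a"]:   # "a" is hot, "c" stays cold
    store.get(k)
```

A real tiering engine would also demote data whose access counts decay over time, which is exactly the "react to changing access patterns" problem described above.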
In well-designed systems this approach uses the hardware efficiently without sacrificing performance. But such systems also need flexible policies that stop busy, low-priority tasks from interfering with business-critical applications, and implementing and debugging those mechanisms properly is far from trivial.
So what awaits us in the future?
As mentioned above, SCM devices already exist. The PCIe SSD is the best-known type of SCM and has already had a significant impact on data center infrastructure. A second example is the NVDIMM, whose performance characteristics are comparable to DRAM. Such devices are available today and continue to evolve.
HP is also working on SCM technology. Its project, The Machine, is an attempt to build a new computer architecture around memristors. The memristor, the fourth fundamental circuit element, was predicted in 1971 by Leon O. Chua, but a laboratory prototype of the storage element was only created in 2008, by a team of scientists led by Stanley Williams at Hewlett-Packard's research laboratory.
This passive element can remember its own state. It can be thought of as a resistor whose resistance varies with the charge that has flowed through it; when power is removed, the changed resistance is retained.
A commercial implementation of the memristor is still in development. Once it arrives, it will become possible to build new types of memory that not only store data but can also process it.
In The Machine there is no boundary between main memory and persistent storage: all of memory is, in effect, working memory. This eliminates the problems that come with moving information between devices running at different speeds.
SCM technologies, it seems, are set to overcome the inefficiency that arises when slow and fast memory have to "talk" to each other. That makes it all the more interesting to watch how these new developments will affect every level of the infrastructure stack. Everything is just beginning.
Sergey Belkin, head of development at the 1cloud.ru project, comments:
“Different tasks may call for different types of disks. Mixing disk types makes sense when building multi-tier storage systems: data that applications use frequently can be placed on faster disks.
For example, if a service works heavily with a database, it makes sense to move that database to a separate SSD to speed it up, while leaving the operating system itself on slower disks. Using different disk types side by side makes the overall infrastructure more flexible, efficient, and cost-effective.
As for new developments in solid-state storage, last year Intel and Micron announced 3D XPoint (pronounced “crosspoint”), a three-dimensional, transistor-free architecture, claiming that its endurance and speed would exceed NAND memory by a factor of 1,000. If this solution reaches the market, I think it will most likely be used in data centers to store frequently requested hot data.”
The opinion of George Crump of Storage Switzerland:
“SCM is a new class of storage that can serve as an intermediate tier between high-performance DRAM and cheap HDDs. SCM can deliver read speeds close to those of DRAM and write speeds many times faster than hard drives.
This is made possible by the PCIe interface, which connects the flash storage directly to the processor. Not every PCIe SSD is an SCM device, however.
Some vendors, chasing performance, put several controllers on their cards, each responsible for its own region of flash memory. At first glance this seems sensible, but it means a controller cannot read or write blocks that lie outside its own region.
For large blocks this can actually hurt performance. This and other performance problems stemming from the inefficiency of existing interfaces are slowing the technology's adoption.”
The opinion of Scott Davis, Technical Director of Infinio:
“SCM technologies will not be commercially available until the end of 2016.
The first will most likely be an early implementation of Intel's 3D XPoint technology. HP and SanDisk have also announced that they are working on a joint project, but their product is unlikely to reach the market before early 2017.
Bear in mind that, as with many new technologies, SCM devices will at first have a limited field of application, and their cost will be an obstacle to broader adoption.”