
Introduction to the Storage Performance Development Kit (SPDK)

Customers integrating current-generation solid-state drives, such as the Intel SSD DC P3700 Series NVMe drives, face a significant challenge: because throughput is far higher and latency far lower than with spinning disks, the storage software now accounts for most of the total transaction time. In other words, the performance and efficiency of the whole storage system depend increasingly on the performance and efficiency of the software stack. At the same time, storage media keep improving; in the coming years they risk outpacing the software architectures used in storage systems.



To help OEMs and software vendors integrate this hardware, Intel has created a set of drivers and a complete reference storage architecture called the Storage Performance Development Kit (SPDK). The goal of SPDK is to highlight the efficiency and performance that can be achieved by combining Intel's networking, processing, and storage technologies. SPDK has been used to demonstrate that millions of I/O operations per second are achievable with a few processor cores and a few NVMe drives, without any additional offload hardware. Intel provides the complete source code of the Linux reference architecture under the broad and permissive BSD license and distributes it to the community through GitHub. A blog, a mailing list, and additional documentation are available at spdk.io.

Software Architecture Overview


How does SPDK work? Its extremely high performance comes from two key techniques: running at user level and using polled-mode drivers (PMDs). Let's take a closer look at each of them.
First, running the device driver code at user level means, by definition, that the driver code does not run in the kernel. Avoiding interrupts and kernel context switches saves a significant amount of processing overhead, so more cycles can be spent on actually storing data. Regardless of the complexity of the storage algorithms (deduplication, encryption, compression, or plain block storage), fewer wasted cycles mean better performance and lower latency. This is not to say that the kernel adds unnecessary overhead; rather, the kernel adds overhead that is needed for general-purpose computing scenarios but may not apply to a dedicated storage stack. The guiding principle of SPDK is to deliver the lowest latency and the highest efficiency by eliminating every source of additional software overhead.

Second, polled-mode drivers change the basic I/O model. In the traditional I/O model, the application submits a read or write request and then sleeps until an interrupt wakes it once the I/O has completed. Polled-mode drivers work differently: the application submits the request and then moves on to other work, checking back periodically to see whether the I/O has completed. This avoids the latency and overhead of interrupts and makes the application's I/O more efficient. In the era of spinning media (tape and hard disk drives), interrupt overhead was a small fraction of the total I/O time, so using interrupts improved overall system efficiency. With low-latency solid-state drives, however, interrupt overhead has become a significant fraction of the total I/O time, and the lower the device latency, the larger the relative penalty. Systems can already process millions of I/O operations per second, so eliminating this overhead for millions of transactions considerably reduces resource consumption. Packets and blocks are dispatched immediately and time spent waiting is minimized, which results in lower latency, more consistent latency (less jitter), and higher throughput.
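The submit-then-check model described above can be illustrated with a small simulation. This is only a sketch: `SimulatedDevice`, its tick-based clock, and `run_polled_io` are hypothetical names invented for the illustration and are not part of SPDK.

```python
from collections import deque


class SimulatedDevice:
    """Toy stand-in for a storage device queue; not an SPDK API."""

    def __init__(self, service_ticks):
        self.service_ticks = service_ticks  # assumed ticks a request needs to finish
        self.now = 0                        # simulated clock
        self.inflight = deque()             # (finish_tick, request_id, callback)

    def submit(self, request_id, callback):
        # Submission returns immediately; nothing sleeps or waits here.
        finish_tick = self.now + self.service_ticks
        self.inflight.append((finish_tick, request_id, callback))

    def advance(self, ticks=1):
        # Pretend time passes while the application does other work.
        self.now += ticks

    def poll_completions(self):
        # Check for finished requests and run their callbacks; never block.
        completed = 0
        while self.inflight and self.inflight[0][0] <= self.now:
            _, request_id, callback = self.inflight.popleft()
            callback(request_id)
            completed += 1
        return completed


def run_polled_io(num_requests=4, service_ticks=3):
    device = SimulatedDevice(service_ticks)
    done = []
    other_work = 0
    for request_id in range(num_requests):
        device.submit(request_id, done.append)
    while len(done) < num_requests:
        other_work += 1            # useful work performed between polls
        device.advance()
        device.poll_completions()
    return done, other_work
```

Calling `run_polled_io()` returns the completed request IDs together with the number of work steps the application performed while the requests were in flight. In an interrupt-driven model those steps would have been spent asleep, followed by the cost of the wake-up itself.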

SPDK consists of numerous components that link together and share the common elements of user-level and polled-mode operation. Each component was created to overcome a specific performance bottleneck encountered while building the complete SPDK architecture. Each component can also be integrated into non-SPDK architectures, which lets customers apply the techniques used in SPDK to accelerate their own software.



The components are listed below, from the bottom of the stack up.

Hardware drivers


NVMe driver: the foundational component of SPDK. This highly optimized driver provides high scalability, efficiency, and performance.

Intel QuickData Technology: also known as Intel I/O Acceleration Technology (Intel I/OAT). This is a copy offload engine built into platforms based on Intel Xeon processors. Providing user-space access lowers the threshold for DMA data movement, which allows better utilization for small I/O operations or for non-transparent bridges (NTB).

Back-end block devices


NVMe over Fabrics (NVMe-oF) initiator: from a programmer's point of view, the local SPDK NVMe driver and the NVMe-oF initiator share a common set of API commands. This makes it very easy to enable, for example, local/remote replication.

Ceph RADOS Block Device (RBD): enables Ceph as a back-end device for SPDK. This allows Ceph to be used, for example, as another storage tier.

Blobstore block device: a block device allocated by the SPDK Blobstore. This is a virtual device that virtual machines or databases can interact with. These devices benefit from the SPDK infrastructure, meaning lock-free operation and highly scalable performance.

Linux asynchronous I/O (AIO): allows SPDK to interact with kernel devices, such as hard disk drives.
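To see why a shared command set makes local/remote replication straightforward, consider the following sketch. All class names here (`BlockDevice`, `LocalDevice`, `RemoteDevice`, `Mirror`) are hypothetical and exist only for this illustration; the "remote" device is simulated with an in-memory dictionary rather than a real fabric, and none of this is SPDK's actual API.

```python
from abc import ABC, abstractmethod


class BlockDevice(ABC):
    """Hypothetical common interface; illustrative only, not SPDK's API."""

    @abstractmethod
    def write(self, lba, data):
        ...

    @abstractmethod
    def read(self, lba):
        ...


class LocalDevice(BlockDevice):
    """Stands in for a locally attached drive."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = bytes(data)

    def read(self, lba):
        return self.blocks[lba]


class RemoteDevice(BlockDevice):
    """Stands in for a drive reached over a fabric; the network is simulated."""

    def __init__(self):
        self.remote_blocks = {}
        self.messages_sent = 0      # counts simulated round trips

    def write(self, lba, data):
        self.messages_sent += 1
        self.remote_blocks[lba] = bytes(data)

    def read(self, lba):
        self.messages_sent += 1
        return self.remote_blocks[lba]


class Mirror(BlockDevice):
    """Replicates every write to all replicas; reads come from the first one."""

    def __init__(self, *replicas):
        self.replicas = replicas

    def write(self, lba, data):
        for replica in self.replicas:
            replica.write(lba, data)

    def read(self, lba):
        return self.replicas[0].read(lba)
```

Because `Mirror` only relies on the shared `read` and `write` calls, it does not need to know which replica is local and which is remote: `Mirror(LocalDevice(), RemoteDevice())` works the same way as a mirror of two local devices.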

Storage Services


Block device abstraction layer (BDAL): this generic block device abstraction is the glue that connects the storage protocols to the various device drivers and block devices. It also provides flexible APIs for adding customer functionality (RAID, compression, deduplication, and so on) at the block level.

Blobstore: implements streamlined, file-like (non-POSIX) semantics for SPDK. This component can support high-performance databases, containers, virtual machines, or other workloads that do not depend on most of the features of a POSIX file system, such as user access control.
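What "file-like but not POSIX" can mean in practice is easier to see in a toy example. The sketch below assumes a store in which blobs are addressed by numeric ID and accessed in whole pages, with no names, directories, or permissions. `ToyBlobstore` and its methods are invented for this illustration and do not reproduce the SPDK Blobstore API.

```python
class ToyBlobstore:
    """Toy file-like store without POSIX features; not the SPDK Blobstore API."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.blobs = {}             # blob_id -> list of pages (bytearray)
        self.next_id = 0

    def create_blob(self, num_pages):
        # Blobs are identified by an ID only: no path, owner, or mode bits.
        blob_id = self.next_id
        self.next_id += 1
        self.blobs[blob_id] = [bytearray(self.page_size) for _ in range(num_pages)]
        return blob_id

    def resize(self, blob_id, num_pages):
        pages = self.blobs[blob_id]
        if num_pages < len(pages):
            del pages[num_pages:]
        else:
            missing = num_pages - len(pages)
            pages.extend(bytearray(self.page_size) for _ in range(missing))

    def write_page(self, blob_id, page_index, data):
        if len(data) != self.page_size:
            raise ValueError("this sketch only accepts whole-page writes")
        self.blobs[blob_id][page_index][:] = data

    def read_page(self, blob_id, page_index):
        return bytes(self.blobs[blob_id][page_index])

    def delete_blob(self, blob_id):
        del self.blobs[blob_id]
```

A database or virtual machine image that manages its own layout needs little more than this: create, resize, read, write, delete. Everything a general-purpose file system adds on top (names, hierarchies, access control) is exactly the overhead such workloads can do without.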

Storage protocols


iSCSI target: an implementation of the established specification for block traffic over Ethernet, about twice as efficient as the Linux kernel iSCSI target (LIO). The current version uses the kernel TCP/IP stack by default.

NVMe-oF target: an implementation of the new NVMe-oF specification. Although it depends on RDMA hardware, the NVMe-oF target can serve up to 40 Gbit/s of traffic per CPU core.

vhost-scsi target: a KVM/QEMU component that uses the SPDK NVMe driver. It gives guest virtual machines lower-latency access to the storage media and reduces overall CPU usage for I/O-intensive workloads.
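The figure of 40 Gbit/s per CPU core quoted above can be converted into a per-I/O time budget with some simple arithmetic. The calculation below is a rough sketch that assumes decimal gigabits, 4 KiB blocks, and no protocol overhead; none of these assumptions come from the article.

```python
def iops_at_line_rate(gbit_per_s, block_bytes):
    """I/O operations per second needed to fill a link with fixed-size blocks."""
    bytes_per_s = gbit_per_s * 1e9 / 8      # decimal gigabits, no protocol overhead
    return bytes_per_s / block_bytes


def time_budget_ns(iops):
    """Average time one core can spend on each I/O at the given rate."""
    return 1e9 / iops


iops = iops_at_line_rate(40, 4096)          # the 4 KiB block size is an assumption
budget = time_budget_ns(iops)
print(f"{iops:,.0f} IOPS per core, about {budget:.0f} ns per I/O")
```

Under these assumptions a single core has to complete roughly 1.2 million I/O operations per second, which leaves about 0.8 microseconds per operation. A budget that small leaves little room for interrupts or context switches.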

SPDK does not fit every storage architecture. Here are a few questions that can help determine whether the SPDK components are a good match for yours.

Is the storage system based on Linux or FreeBSD?
SPDK is primarily tested and supported on Linux. The hardware drivers are supported on both FreeBSD and Linux.

Is the storage hardware platform based on Intel architecture?
SPDK is designed to take full advantage of the Intel platform and is tested and tuned for Intel processors and systems.

Does the performance path of the storage system run in user mode?
SPDK improves performance and efficiency by running more of the performance path in user space. By combining applications with SPDK functionality such as the NVMe-oF target, the NVMe-oF initiator, or Blobstore, it may be possible to run the entire data path in user space and gain substantial efficiency.

Can the system architecture incorporate lockless PMDs into its threading model?
Because PMDs run continuously on their own threads (instead of sleeping or yielding the processor when idle), they place specific requirements on the threading model.
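The threading requirement is easier to appreciate with a concrete picture of a thread that never sleeps. The sketch below shows one possible arrangement: a dedicated thread busy-polls a mailbox, and other threads hand it work as messages instead of sharing state under a lock. `PollingThread` and its methods are hypothetical names for this illustration, not SPDK's threading API.

```python
import threading
from collections import deque


class PollingThread:
    """Toy dedicated polling thread; hypothetical names, not SPDK's API."""

    def __init__(self):
        self.mailbox = deque()      # deque appends and pops are thread-safe
        self.running = True
        self.thread = threading.Thread(target=self._loop)

    def start(self):
        self.thread.start()

    def send(self, func):
        # Other threads pass work as messages rather than taking a lock.
        self.mailbox.append(func)

    def _stop(self):
        self.running = False

    def stop(self):
        # The stop request travels through the same mailbox, so all work
        # sent before it is processed first.
        self.send(self._stop)
        self.thread.join()

    def _loop(self):
        # Busy-poll: the thread never sleeps or blocks while it is running.
        while self.running:
            try:
                func = self.mailbox.popleft()
            except IndexError:
                continue            # nothing to do; poll again immediately
            func()


results = []
poller = PollingThread()
poller.start()
for i in range(5):
    poller.send(lambda i=i: results.append(i))
poller.stop()
print(results)
```

The loop keeps its thread fully busy even when the mailbox is empty. That is the trade-off in question: the host application has to be able to dedicate threads (and, in practice, CPU cores) to polling and route work to them, rather than expecting them to block and wake on demand.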

Does the system currently use the Data Plane Development Kit (DPDK) to handle network packet workloads?
SPDK shares primitives and programming models with DPDK, so customers already using DPDK will likely find the close integration with SPDK useful. Similarly, for customers using SPDK, adding DPDK features for network processing may offer significant benefits.

Does the development team have the expertise to investigate and troubleshoot problems on its own?
Intel has no support obligations for this software. Intel and the open source community around SPDK will make commercially reasonable efforts to investigate potential errors in unmodified released software, but under no circumstances will Intel have any obligation to customers to provide maintenance or support for this software.

To learn more about SPDK, fill out the contact request form or visit spdk.io for access to the mailing list, documentation, and blogs.

The capabilities and benefits of Intel technology vary by system configuration. Some hardware, software, or service activation may be required. Performance may vary depending on system configuration. For more information, contact the manufacturer or seller of your equipment or visit intel.com.

The software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of these factors may cause the results to vary. When evaluating a purchase, consult other information and performance tests, including the performance of the product when combined with other products.


Source: https://habr.com/ru/post/326094/

