
We started testing flash arrays at the request of one of our major customers, who could not settle on a storage system that would solve their problems. The topic, however, turned out to be so relevant and interesting that it soon outgrew that one project. Over time we worked out our own methodology, wrote scripts, and collected unique factual material, and I wanted to share it with colleagues: honestly, without undue enthusiasm or myths, just facts. This article opens a series of independent publications, each devoted to testing a particular array or a related technology. First, though, a few words about how SSD drives differ from ordinary hard drives (HDDs) and what consequences this has for testing storage systems built on them.
I apologize in advance for stating common truths. A hard disk drive (HDD) is a motor, platters, heads, and a controller. On a read or write, the disk controller moves the heads to the desired track, waits for the disk to rotate the right sector underneath, and reads or writes the data. With this algorithm, performance depends directly on the spindle speed and on how fast the heads move, and both have mechanical and electromechanical limits. Neither figure has improved significantly in more than a decade (disks with a 15,000 rpm spindle appeared about 12 years ago).
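As a rough sanity check, the random-access IOPS of a hard drive can be estimated straight from its mechanics. The sketch below does exactly that; the seek time is an illustrative assumption, not a measurement of any particular drive:

```python
# Estimate random-access IOPS of an HDD from its mechanics alone.
# The numbers are illustrative assumptions for a 15,000 rpm drive.

avg_seek_ms = 3.5                          # assumed average seek time
rpm = 15_000
full_rotation_ms = 60_000 / rpm            # 4 ms per revolution
avg_rotational_ms = full_rotation_ms / 2   # wait half a turn on average

service_time_ms = avg_seek_ms + avg_rotational_ms  # transfer time ignored
iops = 1000 / service_time_ms

print(f"service time ~{service_time_ms:.1f} ms -> ~{iops:.0f} IOPS")
```

The result, a few hundred IOPS per spindle, is why the block size and the read/write mix barely matter for HDDs: the mechanical delays dwarf everything else.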
What is usually measured on hard drives?
1. IOPS (the number of I/O operations per second) and latency (response time), measured under a random load with small blocks. The number of IOPS an HDD delivers depends little on:
- the block size (the dominant delays come from the mechanics, not from the transfer rate off the platters);
- the type of load (that is, whether we read or write).
2. Bandwidth under streaming I/O. The figures depend weakly on the type of load but significantly on the position of the heads relative to the center of the disk (Zone Bit Recording).
Note that HDD speed does not depend on the load history: we get the same IOPS for the same load at the beginning of a test and at its end. HDDs with the same spindle speed from different manufacturers, as a rule, hardly differ in performance: the mechanics are about the same, and the controller long ago ceased to be the factor limiting performance.
Now, back to SSD drives (not necessarily in a disk form factor). A typical SSD consists of one or more controllers and a set of flash memory chips. Very much simplified, the memory is organized into pages (usually 4 KB) grouped into larger blocks: data can be written only into free pages, but erased only a whole block at a time. Data is therefore always written into free space, filling free pages one after another, regardless of whether it is new data or a change to existing data. Old copies of modified data are not erased immediately but only marked as stale. Removing these stale copies on an SSD is the job of a dedicated process, Garbage Collection (GC), which (in general) performs the following operations:
- selects the block with the highest percentage of stale copies of data;
- rewrites all still-valid data into fresh pages;
- erases the block.
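The GC steps can be sketched as a toy model. This is an illustrative simplification with invented data structures, not any vendor's real firmware algorithm:

```python
# Toy model of SSD Garbage Collection. Each block is a list of pages;
# a page holds valid data or None for a stale copy.

def stale_ratio(block):
    """Fraction of pages in a block that are stale copies."""
    return sum(page is None for page in block) / len(block)

def collect_garbage(blocks):
    """One GC pass: reclaim the block with the most stale pages.
    Returns the victim index and the valid pages that were rewritten."""
    victim = max(range(len(blocks)), key=lambda i: stale_ratio(blocks[i]))
    valid = [page for page in blocks[victim] if page is not None]
    blocks[victim] = []              # the erased block is free again
    return victim, valid

# Three blocks of four pages each; None marks a stale copy.
blocks = [
    ["a", None, "b", "c"],           # 25% stale
    [None, None, "d", None],         # 75% stale -> GC victim
    ["e", "f", "g", "h"],            # 0% stale
]
victim, valid = collect_garbage(blocks)
print(victim, valid)                 # 1 ['d']
```

Note that reclaiming one block here costs an extra rewrite of its valid pages; that extra internal traffic is exactly why sustained writes are slower than peak writes.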
Usually Garbage Collection runs in the background while the system is idle, but under a prolonged write load this process severely limits SSD performance, because Garbage Collection reclaims space noticeably more slowly than the SSD's peak write speed. This drop in SSD performance under long write loads is called the write cliff.
SSD manufacturers try to soften the impact of the Garbage Collection process by:
- Over-provisioning: reserving a significant number of the drive's pages to absorb write bursts. Some manufacturers even allow low-level formatting of SSD drives that enlarges this reserve at the expense of usable capacity, increasing write performance.
- Adding dedicated service processors that handle garbage collection alongside the main controllers.
SSD performance depends very strongly on the type of memory chips, the way they are used, the on-drive controllers, and the I/O interface. Unlike conventional HDDs, where essentially all drives with the same spindle speed perform comparably, different SSD drives can differ in performance several times over.
What is usually measured on flash drives and flash arrays?
1. IOPS and latency under a random load. Unlike an HDD, there is a dependence on the block size and on the type of load, that is, whether we read or write. Accordingly, SSDs call for groups of tests that vary the ratio of read operations to write operations together with the block size.
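In practice this means a grid of test cases rather than a single run. Here is a sketch of how such a matrix might be generated; the parameter values are assumptions for illustration, not a recommendation for any particular array:

```python
import itertools

# Dimensions of an SSD test matrix (values are illustrative assumptions).
block_sizes = ["4k", "8k", "16k", "64k"]
read_pcts = [100, 70, 50, 30, 0]     # share of reads in the mix, %

matrix = list(itertools.product(block_sizes, read_pcts))
for bs, rd in matrix:
    print(f"bs={bs:>3} read={rd:3d}% write={100 - rd:3d}%")

print(len(matrix), "test cases")     # 20 test cases
```

Each combination is then fed to the load generator as a separate run, with results recorded per cell of the matrix.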
2. The change in SSD performance during prolonged write operations, to determine:
- the maximum amount of data that can be written to the disk array before the Garbage Collection process kicks in;
- the throughput of the Garbage Collection process, which can be regarded as the disk array's maximum sustained write performance.
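These two quantities together let one model the sustained write behavior: the array absorbs a burst at peak speed until the over-provisioned reserve is exhausted, then settles to the GC-limited rate. A back-of-the-envelope sketch, with every number invented purely for illustration:

```python
# Back-of-the-envelope model of the write cliff: the array writes at
# peak speed until the over-provisioned reserve of free pages runs out,
# then falls to the GC-limited rate. All numbers are invented.

peak_write_iops = 400_000        # assumed peak write IOPS of the array
gc_limited_iops = 120_000        # assumed sustained, GC-limited IOPS
reserve_gib = 50                 # assumed over-provisioned reserve
block_kib = 4                    # write block size

ios_until_cliff = reserve_gib * 1024 * 1024 // block_kib
seconds_at_peak = ios_until_cliff / peak_write_iops

print(f"peak holds for ~{seconds_at_peak:.0f} s, "
      f"then IOPS drop to {gc_limited_iops:,}")
```

The point of the exercise: a short benchmark can easily finish inside the peak window and report a number the array can never sustain.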
Measuring bandwidth under streaming I/O is of little use, since the SSD architecture implies data fragmentation anyway.
Importantly, after each write test that determines the disk array's peak performance, a pause is needed to let the Garbage Collection processes finish and stop influencing the next measurement.
The architecture of a flash array, and how well its controllers are optimized for SSDs, plays a very important role in the performance of the whole array. With a single SSD drive peaking at 50,000 IOPS, the disk array controller can become the limiting factor. This often happens when a vendor tries to turn an ordinary array into a flash array simply by installing SSD disks in it. In addition, the array controller adds noticeable latency, which used to be lost in the noise on HDD systems:
- HDD latency: ~4 ms;
- array controller latency: ~0.2-0.4 ms;
- SSD drive latency: as a rule, < 0.2 ms.
A poorly optimized controller can significantly degrade the characteristics of the SSD drives it uses.
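Summing those figures shows why the controller suddenly matters so much for flash. A minimal sketch, using the approximate latencies quoted above (the 0.3 ms controller figure is the midpoint of the quoted range):

```python
# Share of total latency contributed by the array controller, using
# the approximate figures quoted in the text.

controller_ms = 0.3              # midpoint of the 0.2-0.4 ms range
hdd_ms = 4.0                     # typical HDD latency
ssd_ms = 0.2                     # typical SSD drive latency

hdd_share = controller_ms / (hdd_ms + controller_ms)
ssd_share = controller_ms / (ssd_ms + controller_ms)

print(f"controller share of latency with HDD: {hdd_share:.0%}")  # 7%
print(f"controller share of latency with SSD: {ssd_share:.0%}")  # 60%
```

Against an HDD the controller is a rounding error; against an SSD it dominates the response time.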
Another important point: all-flash storage is potentially capable of delivering millions of IOPS. During testing, the load generator itself can become the limiting factor, so the configuration of the server(s) generating the load must take the peculiarities of SSDs into account. The I/O schedulers and queue depths must be configured properly, and the test should be parallelized as much as possible: you are unlikely to reach the figures claimed by the manufacturer on a single LUN spanning the whole disk array.
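The required degree of parallelism can be estimated with Little's law: in-flight I/Os = throughput x latency. A quick sketch with assumed target numbers:

```python
# Little's law: in-flight I/Os = throughput x latency. Estimate the
# concurrency needed to reach a target IOPS figure (numbers assumed).

target_iops = 1_000_000          # assumed vendor claim
latency_s = 0.0002               # 0.2 ms per I/O

outstanding = target_iops * latency_s
print(f"need ~{outstanding:.0f} outstanding I/Os")   # ~200 in flight

# For example, 8 LUNs x 25 parallel jobs at queue depth 1 give 200;
# a single LUN with a shallow queue cannot come close.
```

This is why the test must be spread across many LUNs, jobs, and deep queues before the array, rather than the generator, becomes the bottleneck.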
That, I believe, is enough theory; time to move on to practice. Read the next article:
Testing the IBM RamSan FlashSystem 820.
P.S. The author cordially thanks Pavel Katasonov, Yuri Rakitin, and all the other company employees who took part in preparing this material.