
Testing flash storage: EMC XtremIO

In mid-2012, EMC paid $430 million for an Israeli startup founded three years earlier. At that point the company was still in the development stage, about half a year before the first XtremIO device was expected to appear. The first devices became available to order only at the end of 2013.

XtremIO's main distinguishing features are its architecture and functionality. First, the architecture includes always-on, non-disableable services such as inline deduplication, compression, and thin provisioning, which save space on the SSDs. Second, XtremIO is a horizontally scalable cluster of modules (X-Bricks) across which data is distributed automatically and evenly. Standard x86 hardware and SSDs are used, with the functionality implemented in software. The result is not just a fast disk, but an array that saves capacity through deduplication and compression, especially in workloads such as server virtualization, VDI, or databases with multiple copies.


A love of third-party testing is not EMC's strong suit. Nevertheless, thanks to the proactive assistance of the local office, a test bench comprising a 2 X-Brick system was assembled for us in a remote laboratory. This allowed us to run a series of tests as close as possible to our own methodology.
Testing was conducted on version 2.4 of the code; version 3.0 is now available, which claims half the latency.


Testing method


During testing, the following tasks were addressed:

Testbed Configuration


Figure 1. The block diagram of the test bench.
The test bench consists of 4 servers, each connected by four 8Gb FC links to 2 FC switches. Each switch has four 8Gb FC connections to the EMC XtremIO storage. Zones are created on the FC switches such that each initiator is zoned with each storage port.
As additional software, Symantec Storage Foundation 6.1 is installed on the test servers.

On the test servers, the following settings were made to reduce disk I/O latency:
  • The I/O scheduler is changed from "cfq" to "noop" by writing "noop" to /sys/<___Symantec_VxVM>/queue/scheduler;
  • The parameter "vxvm.vxio.vol_use_rq = 0" is added to /etc/sysctl.conf to minimize the queue size at the level of the Symantec logical volume manager;
  • The limit on simultaneous I/O requests to the device is raised to 1024 by writing 1024 to /sys/<___Symantec_VxVM>/queue/nr_requests;
  • Checking for the possibility of merging I/O operations (iomerge) is disabled by writing 1 to /sys/<___Symantec_VxVM>/queue/nomerges;
  • Read-ahead is disabled by writing 0 to /sys/<___Symantec_VxVM>/queue/read_ahead_kb;
  • The default queue depth (30) on the FC HBA is left unchanged.
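The tuning steps above can be sketched as a small helper. This is a hypothetical sketch for a generic block device: the /sys/block/<device>/queue layout and the device name are assumptions, not the exact (truncated) Symantec VxVM path from the original setup.

```python
# Hypothetical sketch of the latency-tuning steps above for a generic
# block device; the sysfs path layout is an assumption, not the exact
# Symantec VxVM path used in the original test bench.
def tuning_settings(device):
    q = f"/sys/block/{device}/queue"
    return {
        f"{q}/scheduler": "noop",    # switch the I/O scheduler from cfq to noop
        f"{q}/nr_requests": "1024",  # raise the simultaneous I/O request limit
        f"{q}/nomerges": "1",        # disable I/O merge checking (iomerge)
        f"{q}/read_ahead_kb": "0",   # disable read-ahead
    }

def apply_settings(device, dry_run=True):
    """Print (dry run) or actually write the settings to sysfs."""
    for path, value in tuning_settings(device).items():
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            with open(path, "w") as f:
                f.write(value)
```

A dry run simply prints the equivalent `echo value > path` commands, which is a convenient way to review the changes before applying them as root.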

The following configuration is performed on the storage system to partition the disk space:
  • On the storage, all space is provisioned by default; only logical partitioning of the physical capacity is possible.
  • 32 LUNs of the same size are created on the storage system, together occupying 80% of the storage capacity; 8 LUNs are presented to each server.
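For illustration, the arithmetic of this layout can be written out; note that the capacity figure in the usage example below is an invented number, not the tested array's real size.

```python
# Toy arithmetic for the LUN layout above: 32 equal LUNs filling 80% of
# usable capacity, spread 8 per server across 4 servers. The 10 TB
# capacity in the usage example is an assumption for illustration only.
def lun_layout(capacity_tb, luns=32, servers=4, fill=0.8):
    lun_size_tb = capacity_tb * fill / luns
    luns_per_server = luns // servers
    return lun_size_tb, luns_per_server

# Example: a hypothetical 10 TB array
print(lun_layout(10.0))  # -> (0.25, 8): 0.25 TB per LUN, 8 LUNs per server
```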

Software used in the testing process


To generate a synthetic load on the storage system, the Flexible I/O Tester (fio) utility, version 2.1.4, is used. All synthetic tests use the following fio parameters in the [global] section:
  • direct = 1
  • size = 3T
  • ioengine = libaio
  • group_reporting = 1
  • norandommap = 1
  • time_based = 1
  • randrepeat = 0
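Collected into fio's job-file syntax, the [global] section above would look like this (a sketch of the shared settings only; the per-job sections varied from test to test):

```ini
[global]
direct=1
size=3T
ioengine=libaio
group_reporting=1
norandommap=1
time_based=1
randrepeat=0
```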

The following tools are used to collect performance metrics under synthetic load:
  • the storage system's own monitoring and diagnostic interfaces;
  • fio version 2.1.4, to generate a summary report for each load profile.


Testing program


The tests were performed by creating a synthetic load simultaneously from four servers, using fio against a block device: a stripe-type logical volume (8 columns, stripewidth=1MiB) created with Veritas Volume Manager from the 8 LUNs presented to each server from the system under test. The created volumes are completely filled with data beforehand.
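As a generic illustration of how such an 8-column stripe with a 1 MiB stripe unit spreads I/O, the mapping from a logical volume offset to a column can be sketched as follows. This is textbook round-robin striping, not Veritas-specific code.

```python
# Generic striping sketch: map a logical byte offset on the volume to
# (column index, offset within that column) for an 8-column stripe with
# a 1 MiB stripe unit, matching the layout described above.
STRIPE = 1 << 20  # 1 MiB stripe unit
NCOL = 8          # 8 columns (one per LUN)

def stripe_map(offset):
    stripe_no, within = divmod(offset, STRIPE)
    column = stripe_no % NCOL          # columns are filled round-robin
    row = stripe_no // NCOL            # full rows already written
    return column, row * STRIPE + within
```

Consecutive 1 MiB chunks land on consecutive columns, so a large sequential I/O is spread across all 8 LUNs, while small random I/Os are distributed statistically.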
Testing consisted of 2 groups of tests:
Group 1: Tests that implement long-term random write.

When creating a test load, the following additional parameters of the fio program are used:
  • rw = randwrite
  • blocksize = 4K
  • numjobs = 10
  • iodepth = 8

The test duration is 18 hours.
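Combined with the [global] parameters listed earlier, the Group 1 load could be expressed as a complete fio job file along these lines (the job name and the runtime line are assumptions based on the stated 18-hour duration):

```ini
[global]
direct=1
size=3T
ioengine=libaio
group_reporting=1
norandommap=1
time_based=1
randrepeat=0
runtime=64800 ; 18 hours, in seconds

[long-random-write]
rw=randwrite
blocksize=4K
numjobs=10
iodepth=8
```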
Based on the data output by the vxstat command, graphs combining the test results are plotted:
  • IOPS as a function of time;
  • Latency as a function of time.

The collected data is analyzed and conclusions are drawn about:
  • whether performance degrades during a long-term write load;
  • whether the storage system's background processes (garbage collection) limit the array's write performance during a long peak load.

Group 2: Disk array performance tests under various types of load.

During testing, the following types of load are investigated:
  • load profiles (varied fio parameters: rw, rwmixread):
  1. 100% random write;
  2. 30% random write, 70% random read;
  3. 100% random read.
  • block sizes: 1KB, 8KB, 16KB, 32KB, 64KB, 1MB (varied fio parameter: blocksize);
  • I/O processing method: asynchronous (fio parameter: ioengine).

Tests are performed in 3 stages:
  • For each combination of the above load types, the storage saturation point is found by varying the fio numjobs and iodepth parameters: the combination of numjobs and iodepth at which the maximum IOPS is reached while latency remains minimal. The iops, latency, numjobs, and iodepth values reported by fio are recorded.
  • The tests are then repeated as in the previous stage, but the point at which approximately half of the maximum performance is reached is sought.
  • Similar tests are then run for the point at half of that performance again.
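The stage-1 search can be sketched as a simple parameter sweep. Here measure() is a stand-in for an actual fio run, and the tie-breaking rule (highest IOPS first, then lowest latency) is our reading of the described procedure rather than the exact algorithm used.

```python
# Sketch of the saturation-point search: sweep (numjobs, iodepth)
# combinations, keeping the one with the highest IOPS and, among equals,
# the lowest latency. measure() stands in for a real fio invocation.
def saturation_point(measure, numjobs_values, iodepth_values):
    best = None
    for numjobs in numjobs_values:
        for iodepth in iodepth_values:
            iops, latency_ms = measure(numjobs, iodepth)
            key = (iops, -latency_ms)  # maximize IOPS, then minimize latency
            if best is None or key > best[0]:
                best = (key, numjobs, iodepth, iops, latency_ms)
    return best[1:]  # (numjobs, iodepth, iops, latency_ms)
```

A real harness would launch fio with each parameter pair and parse its output (e.g. with --output-format=json) to obtain the iops and latency values.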

This algorithm makes it possible to determine the maximum performance of the disk array for a given load profile, as well as the dependence of latency on the load.
Based on the test results, graphs and tables of the measured IOPS and latency are produced; the results are analyzed and conclusions are drawn about the performance of the storage system.

Test results


Investigation of disk array performance in synthetic tests.

Group 1: Tests that implement long-term random write.

The test results are presented in the form of graphs (Figure 2 and 3).
Figure 2. IOPS during a long-term write load (4K block)

Figure 3. Latency during a long-term write load (4K block)


Main conclusions:
No drop in performance over time was recorded under the long-term load; the "write cliff" phenomenon is absent. Therefore, when sizing the disk subsystem, stable performance can be counted on regardless of load duration (the usage history of the disk array).

Group 2: Disk array performance tests under various types of load.

The test results are presented in the form of graphs (Fig. 4-9) and summarized in Tables 1-3.
Figure 4. IOPS under random write.
Figure 5. Latency under random write (ms).
Table 1. Random write performance.
Figure 6. IOPS under mixed I/O (70% read, 30% write).
Figure 7. Latency under mixed I/O (70% read, 30% write) (ms).
Table 2. Performance under mixed I/O (70% read, 30% write).
Figure 8. IOPS under random read.
Figure 9. Latency under random read (ms).
Table 3. Random read performance.


Maximum recorded storage performance parameters:

Write:

Read:

Mixed load (70/30 r/w):



As an added bonus, we were shown the performance of snapshots. Taking 32 snapshots simultaneously from 32 LUNs under maximum write load had no visible impact on performance, which cannot fail to inspire admiration for the system's architecture.

Findings


Summing up, we can say that the array made a favorable impression and delivered the IOPS claimed by the manufacturer. Compared to other flash solutions, it stands out for:
  1. No performance degradation on write operations (no write cliff);
  2. Inline deduplication, which not only saves capacity but also improves write speed;
  3. Snapshot functionality with impressive performance;
  4. Scalability from 1 to 4 nodes (X-Bricks).

The array's unique architecture and data-processing algorithms enable rich functionality (deduplication, snapshots, thin LUNs).

Unfortunately, the testing was not exhaustive. After all, maximum performance is expected from a system of four X-Bricks. In addition, we were provided with version 2.4, while version 3.0, which claims half the latency, had already been released. Questions also remain about the array's behavior with large blocks (its maximum throughput) and with synchronous I/O, where latency is critical. We hope to close these blank spots soon with additional research.


P.S. The author expresses cordial thanks to Pavel Katasonov, Yuri Rakitin, and all the other company employees who participated in preparing this material.

Source: https://habr.com/ru/post/252069/

