
DotHill 4824 storage overview

The hero of this review is the modest DotHill 4824 storage system. Many of you have surely heard that DotHill, as an OEM partner, builds entry-level storage for Hewlett-Packard: the highly popular HP MSA (Modular Storage Array) line, now in its fourth generation. The DotHill 4004 line corresponds to the HP MSA2040, with a few differences that are described in detail below.

The DotHill 4004 is a classic entry-level storage system: 2U form factor, two chassis variants for different drive sizes, and a wide choice of host interfaces; mirrored cache, two controllers, asymmetric active-active with ALUA. Last year new functionality was added: disk pools with three-level tiering and an SSD cache.



Specifications




Disk pools in DotHill


For those not familiar with the theory, it is worth going over the principles behind disk pools and tiered storage, or more precisely, their specific implementation in DotHill.

Before the advent of pools, we had two limitations:


A disk pool in a DotHill storage system is a collection of several disk groups with load distributed across them. In terms of performance, a pool can be viewed as a RAID-0 over several sub-arrays, which already solves the problem of short disk groups. Only two disk pools are supported per storage system (A and B, one per controller), and each pool can contain up to 16 disk groups. The main architectural difference is the extensive use of free stripe placement on the disks, and several of the system's capabilities are built on top of it.
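To make the idea concrete, here is a toy model (a sketch of the concept, not DotHill's actual allocator; the page size and round-robin placement are assumptions) of how a pool stripes a volume across its disk groups and why their throughput adds up:

    # Toy model of a disk pool: pages of a virtual volume are spread
    # across disk groups, so random load hits all groups at once.
    # Placement policy and IOPS figures are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class DiskGroup:
        name: str
        random_iops: int  # rough random-access capability of the group

    class Pool:
        def __init__(self, groups):
            self.groups = groups

        def place_page(self, page_no):
            # Simplest possible policy: round-robin, i.e. RAID-0-like striping.
            return self.groups[page_no % len(self.groups)]

        def aggregate_iops(self):
            # With the load spread evenly, throughput adds up across groups.
            return sum(g.random_iops for g in self.groups)

    pool = Pool([DiskGroup("RAID-6, 10 disks", 1500),
                 DiskGroup("RAID-6, 10 disks", 1500)])
    print(pool.place_page(5).name)  # page 5 lands on the second group
    print(pool.aggregate_iops())    # ~3000 IOPS for the whole pool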

Differences from HP MSA2040


  1. MSA 2040 is an HP product, with all the attendant consequences and benefits. First of all, an extensive network of service centers and the option to purchase various extended-support packages. Russian-language support and maintenance for DotHill is handled only by distributors and their partners; all that is available is optional 8x5 support with next-business-day response and next-business-day delivery of spare parts.

    The documentation is completely identical to HP's, apart from names and logos, i.e. equally detailed and high quality. HP, of course, offers many additional FAQs, best practices, and descriptions of reference architectures involving the MSA2040. The HP web interface (HP SMU) differs only in the corporate font and icons.

    Price. Of course, none of this comes for free. The price of an MSA2040 in typical configurations (two controllers, 24 450-600GB 10k disks, 8Gbit FC transceivers) is about 30% higher than the DotHill 4004, and that is without CarePacks, without HP's separately sold expansion of the snapshot count (from 64 to 512), and without asynchronous replication (Remote Snap), which add several thousand USD to the cost of the solution. Our statistics show that additional CarePacks for the MSA are bought extremely rarely in Russia; this is, after all, the budget segment. Implementation is in most cases done in-house, sometimes by the supplier as part of a larger project, and only very rarely by HP engineers.

  2. Disks. DotHill and HP use "their own" disks, i.e. HDDs and SSDs with non-standard firmware, supplied together with drive sleds. Among 3.5" drives DotHill offers only nearline SAS, i.e. 7200 rpm spindles, so all the fast disks are 2.5" only*. HP additionally offers 300, 450 and 600GB 15,000 rpm disks in the 3.5" form factor.
    * Update 03/25/2015: 10k and 15k disks in the 3.5" form factor turn out to be available from DotHill on request after all.
  3. Related products. The HP MSA 1040 is a budget version of the MSA 2040 with certain limitations:
    • No SSD support. Optional auto-tiering is still there, it is just 2-level.
    • Fewer host ports: 2 per controller instead of 4, and no support for 16Gbps FC or for SAS.
    • Fewer disks: only 3 additional disk shelves instead of 7.




    So if you do not need all the capabilities of the MSA 2040, you can save about 20% with the MSA 1040 and get very close to the price of the "original", which is, however, full-featured.

    DotHill also has another variation, the AssuredSAN Ultra series: storage systems with exactly the same controllers, functionality and management interface as the 4004 series, but with higher disk density: Ultra48 with 48x 2.5" drives in 2U and Ultra56 with 56x 3.5" drives in 4U.





Performance



Storage configuration


Host configuration



The host was connected directly to one controller via four 8Gbps FC ports. Naturally, volumes were mapped to the host through all four ports, and multipathing was configured on the host.

Pool with tier-1 and SSD cache


This test is a three-hour (180 cycles of 60 seconds) random-access load in 8KiB blocks (8 threads with a queue depth of 16 each) at various read/write ratios. The entire load is confined to the 0-20GB area, which is guaranteed to be smaller than the performance tier or the SSD cache (800GB); this is done so that the cache or tier fills up in a reasonable time.

Before each test run the volume was created anew (to clear the SSD tier or SSD cache) and filled with random data (sequential writes in 1MiB blocks); read-ahead on the volume was disabled. IOPS and the average and maximum latency were measured within each 60-second cycle.
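The load generator is not named here; as an illustration, such a profile could be reproduced with fio driven from Python (fio as the tool and the device path are assumptions, not necessarily what was used):

    # Sketch: 8KiB random I/O, 8 jobs x QD16, confined to the first
    # 20GB of the volume, 180 cycles of 60 seconds each.
    import subprocess

    DEVICE = "/dev/mapper/mpatha"  # hypothetical multipath device

    def run_cycle(read_pct, cycle):
        cmd = [
            "fio", "--name=cycle%d" % cycle,
            "--filename=" + DEVICE,
            "--direct=1", "--ioengine=libaio",
            "--rw=randrw", "--rwmixread=%d" % read_pct,
            "--bs=8k", "--numjobs=8", "--iodepth=16",
            "--size=20g",              # limit the load to the 0-20GB area
            "--time_based", "--runtime=60",
            "--group_reporting",
        ]
        subprocess.run(cmd, check=True)

    for cycle in range(180):               # 3 hours in total
        run_cycle(read_pct=65, cycle=cycle)  # the 65/35 read/write profile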

Tests with 100% reads and a 65/35 read/write mix were performed both with an SSD tier (a disk group of 4x400GB SSDs in RAID-10 added to the pool) and with an SSD cache (2x400GB SSDs in RAID-0; the system does not allow more than two SSDs in the cache per pool). The volume was created on a pool of two RAID-6 disk groups of ten 146GB 15k rpm disks each (effectively RAID-60 in a 2x10 scheme). Why not RAID-10 or 50? To deliberately make random writes harder for the storage system.
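To get an intuition for why this configuration struggles with writes, textbook RAID arithmetic is enough (the per-disk IOPS figure below is an assumption; RAID-6 costs 6 back-end operations per small host write: 3 reads plus 3 writes for the data block and the two parity blocks):

    # Back-of-the-envelope random IOPS ceiling of the test pool:
    # 2 RAID-6 groups x 10 disks, no SSD tier or cache involved.
    DISKS = 20
    DISK_IOPS = 180       # rough figure for one 15k rpm SAS drive
    WRITE_PENALTY = 6     # RAID-6 small-block write penalty

    def pool_iops(read_share):
        backend = DISKS * DISK_IOPS
        return backend / (read_share + (1 - read_share) * WRITE_PENALTY)

    print(round(pool_iops(1.0)))   # 100% reads: ~3600 IOPS
    print(round(pool_iops(0.65)))  # 65/35 mix:  ~1300 IOPS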

IOPS


The results were quite predictable. As the vendor claims, the advantage of the SSD cache over the SSD tier is that the cache fills faster, i.e. the storage system reacts more quickly to the appearance of "hot" areas under an intense random-access load: with 100% reads, IOPS grow and latency drops faster than with tiering.

This advantage ends as soon as a significant write load is added. RAID-60 is, to put it mildly, not well suited to small-block random writes, but this configuration was chosen specifically to show the essence of the problem: the storage system cannot keep up with the writes, which bypass the cache and go to the slow RAID-60; the queue quickly fills up, and little time is left to serve read requests even with caching. Some blocks still make it into the cache, but they quickly become invalid because of the writes. This vicious circle means that with such a load profile a read-only cache becomes ineffective. Exactly the same situation could be observed with early versions of the SSD cache (before Write-Back appeared) in LSI and Adaptec PCI-E RAID controllers. The solution is to use an inherently faster volume, i.e. RAID-10 instead of 5/6/50/60, and/or an SSD tier instead of the cache.
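The invalidation loop is easy to see in a toy simulation (all parameters are invented; this is not DotHill's caching algorithm, only the effect described above): as the write share grows, the read hit rate collapses.

    # Toy simulation of a read-only SSD cache under a mixed load.
    import random

    HOT_BLOCKS = 2000   # "hot" area in blocks; smaller than the cache,
                        # just like 20GB vs an 800GB cache in the test

    def read_hit_rate(read_share, ops=200000):
        cache = set()
        hits = reads = 0
        for _ in range(ops):
            block = random.randrange(HOT_BLOCKS)
            if random.random() < read_share:
                reads += 1
                if block in cache:
                    hits += 1
                else:
                    cache.add(block)   # read miss: promote block to cache
            else:
                cache.discard(block)   # write bypasses and invalidates
        return hits / reads

    for share in (1.0, 0.65, 0.5):
        print("%.0f%% reads -> hit rate %.0f%%"
              % (share * 100, read_hit_rate(share) * 100))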


Average latency



Maximum latency


This graph uses a logarithmic scale. With 100% reads and the SSD cache, the latency is noticeably more stable: once the cache is filled, peak values do not exceed 20ms.

So what can be concluded in the "caching versus tiering" dilemma?
What to choose?

4x SSD 400GB HGST HUSML4040ASS600 RAID-10


This time the volume was tested on a linear disk group: a RAID-10 of four 400GB SSDs. In this DotHill shipment the abstract "400GB SFF SAS SSD" turned out to be the HGST HUSML4040ASS600, an Ultrastar SSD400M with fairly high claimed performance (56,000/24,000 IOPS for 4KiB reads/writes) and, most importantly, an endurance of 10 full drive writes per day (i.e. about 4TB of writes daily) for 5 years. HGST now has the faster SSD800MM and SSD1600MM in its lineup, of course, but for the DotHill 4004 these are quite sufficient.

We used tests designed for single SSDs - “IOPS Test” and “Latency Test” from the SNIA Solid State Storage Performance Test Specification Enterprise v1.1:

The test consists of a series of measurements: 25 rounds of 60 seconds each. Preconditioning: sequential writes in 128KiB blocks until 2x the capacity has been written. The steady-state window (4 rounds) is verified by plotting. Steady-state criterion: the linear fit within the window must not exceed the limits of 90%/110% of the average value.
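As a sketch of that criterion (my own implementation, not SNIA's reference code): fit a least-squares line over the window and check that it stays within 90-110% of the window average.

    # Steady-state check per the SNIA PTS description above.
    def is_steady_state(window):
        n = len(window)                   # e.g. 4 rounds of IOPS values
        mean = sum(window) / n
        xs = range(n)
        x_mean = sum(xs) / n
        # Least-squares slope and intercept of the fitted line.
        denom = sum((x - x_mean) ** 2 for x in xs)
        slope = sum((x - x_mean) * (y - mean)
                    for x, y in zip(xs, window)) / denom
        intercept = mean - slope * x_mean
        # The fitted line must stay within +/-10% of the window average.
        fit = [intercept + slope * x for x in xs]
        return 0.9 * mean <= min(fit) and max(fit) <= 1.1 * mean

    print(is_steady_state([41000, 40500, 40800, 40300]))  # True: stable
    print(is_steady_state([52000, 47000, 43000, 40000]))  # False: still settling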


SNIA PTS: IOPS Test



As expected, the claimed limit of a single controller's small-block IOPS performance was reached. For some reason DotHill quotes 100,000 read IOPS, while HP quotes a more realistic 80,000 IOPS for the MSA2040 (i.e. 40 thousand per controller), which is exactly what we see on the graph.

For comparison, a single HGST HUSML4040ASS600 SSD was tested on a SAS HBA: about 50 thousand IOPS on 4KiB blocks for both reads and writes, and after saturation (SNIA PTS Write Saturation Test) writes dropped to 25-26 thousand IOPS, which matches the characteristics claimed by HGST.

SNIA PTS: Latency Test


Average latency (ms):

Maximum latency (ms):

The average and peak latency values are only 20-30% higher than those of a single SSD connected through a SAS HBA.

Conclusion


Admittedly, the article turned out somewhat chaotic and does not answer several important questions:


Links


Source: https://habr.com/ru/post/253907/

