
IBM FlashSystem 900 storage system overview

An overview and test run of the IBM FlashSystem 900 all-flash array. Photos, basic principles, and some synthetic tests inside.

image

Modern information systems and the pace of their development dictate their own rules for building IT infrastructure. All-flash storage has long since turned from a luxury into a means of meeting guaranteed disk SLAs. So here it is: a system capable of delivering over a million IOPS.
image

Specifications



Basic principles


This storage system is an all-flash array whose speed comes from MicroLatency modules and optimizations of MLC flash technology.

image


When I asked our vendor representative what technologies are used for fault tolerance and how many gigabytes are actually hidden inside (IBM claims 11.4 TB of usable space), the answer was noncommittal.

As it turned out, things are not so simple. Inside each module there are memory chips and 4 FPGA controllers that build a RAID with a variable stripe (Variable Stripe RAID, VSR) on top of them.

Module internals, two double-sided boards
image

image

image

image



Each chip in a module is divided into so-called layers. Within the module, a variable-length RAID 5 is built across the Nth layer of all the chips.



When a layer on a chip fails, the stripe length is reduced and the failed memory cells are simply no longer used (a stripe that spanned every chip in the module just becomes one chip narrower). Thanks to the surplus of memory cells, the usable capacity is preserved. As it turned out, the system actually holds well over 20 TB of raw flash, i.e. redundancy almost at the RAID 10 level, and that redundancy lets the system avoid rebuilding the entire array when a single chip fails.



Having built a RAID at the module level, FlashSystem then combines the modules into a standard RAID 5 (if this post gets 20 likes before January 1, I promise to run a test with a module pulled out under maximum load).
Thus, to achieve the required level of fault tolerance, a system with 12 modules of 1.2 TB each (as labeled on the modules) yields a little more than 10 TB.
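As a rough sanity check of that figure (my own estimate, not a number from the article or the datasheet): 12 × 1.2 TB gives 14.4 TB of labeled capacity, RAID 5 across the modules gives up at least one module's worth (about 1.2 TB) to parity, and spare capacity plus metadata take a further slice, which is consistent with ending up a little above 10 TB of usable space.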

Web-based system interface
Yes, it turned out to be an old friend (hello to the v7k clusters) with the terrible habit of picking up the locale from the browser. The FlashSystem management interface is similar to Storwize, but the two differ significantly in functionality. In FlashSystem the software is used only for setup and monitoring; the software layer (the virtualizer) present in Storwize is absent, since the systems are designed for different tasks.
image


Testing

After receiving the system from a partner, we install it into a rack and connect it to the existing infrastructure. Honestly, when you hold this 2U piece of iron in your hands and realize that 1,100,000 IOPS (and a hefty stack of greenbacks) fit inside, you instinctively call over a colleague to help carry it.

image

We connect the storage system according to a pre-agreed scheme, configure zoning, and check availability from the virtualization environment. Next we prepare a lab stand. The stand consists of 4 blade servers connected to the storage system under test via two independent 16 Gbit/s FC fabrics.
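The article does not show the zoning commands themselves; purely as an illustration, here is a minimal single-fabric sketch assuming Brocade FOS switches (the switch vendor, alias names and WWNs below are hypothetical, not the stand's actual configuration):

    # Fabric A: create aliases for a host HBA port and a FlashSystem 900 target port
    alicreate "ESX_BLADE1_HBA0", "21:00:00:24:ff:aa:bb:01"
    alicreate "FS900_CTRL_P1", "50:05:07:60:5e:aa:bb:01"

    # One single-initiator zone per host port, added to the active configuration
    zonecreate "z_ESX_BLADE1_FS900", "ESX_BLADE1_HBA0; FS900_CTRL_P1"
    cfgadd "FABRIC_A_CFG", "z_ESX_BLADE1_FS900"
    cfgsave
    cfgenable "FABRIC_A_CFG"

The same is then repeated on the second fabric for the remaining HBA and target ports.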

Wiring diagram


Since my organization leases out virtual machines, the test evaluates both the performance of a single virtual machine and of a whole cluster of VMs running on vSphere 5.5.

We optimize our hosts a bit: we configure multipathing (Round Robin with a limit on the number of requests per path) and increase the queue depth in the FC HBA driver; a sketch of such settings follows below.

ESXi Settings
Our settings may differ from yours!
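As a hedged sketch of the kind of tuning meant here on ESXi 5.5 (the device identifier, driver module and numeric values below are examples, not the settings actually used on the stand):

    # Use Round Robin path selection for the FlashSystem LUN (device ID is an example)
    esxcli storage nmp device set --device naa.60050768000000000000000000000001 --psp VMW_PSP_RR

    # Switch to the next path after every I/O instead of the default 1000 requests
    esxcli storage nmp psp roundrobin deviceconfig set \
        --device naa.60050768000000000000000000000001 --type iops --iops 1

    # Raise the HBA queue depth (example for a QLogic HBA using the qlnativefc driver)
    esxcli system module parameters set -m qlnativefc -p "ql2xmaxqdepth=64"

The HBA module parameter only takes effect after the host is rebooted.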



On each blade server we create one virtual machine (16 GHz of CPU, 8 GB of RAM, 50 GB system disk). To each machine we attach 4 virtual disks, each on its own flash LUN and each on its own Paravirtual (PVSCSI) controller.

VM settings



For testing we run synthetics with a small 4K block (read/write) and a large 256K block (read/write); an example load profile is sketched below. The storage system consistently delivered 750k IOPS, which looked very good to me despite the headline 1.1M IOPS figure quoted by the manufacturer. Do not forget that everything is pumped through the hypervisor and the guest OS drivers.
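The article does not name the load generator, so purely for reproducibility here is a sketch of an equivalent profile using fio inside the guest (fio itself, the device paths, queue depths and runtimes are my assumptions, not the original test parameters):

    # 4K, 100% random read against one of the flash LUNs
    fio --name=4k-randread --filename=/dev/sdb --direct=1 --ioengine=libaio \
        --rw=randread --bs=4k --iodepth=32 --numjobs=4 --runtime=600 --time_based \
        --group_reporting

    # 256K sequential write for the bandwidth part of the test
    fio --name=256k-seqwrite --filename=/dev/sdc --direct=1 --ioengine=libaio \
        --rw=write --bs=256k --iodepth=8 --numjobs=1 --runtime=600 --time_based \
        --group_reporting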

IOPS and latency charts (and, it seems to me, no TRIM-related degradation)
1 VM, 4K block, 100% read, 100% random. With all the resources driven from a single virtual machine, the performance graph behaved non-linearly and jumped between 300k and 400k IOPS. On average we got about 400k IOPS:



4 VM, 4K block, 100% read, 100% random:



4 VM, 4K block, 0% read (100% write), 100% random:



4 VM, 4K block, 0% read (100% write), 100% random, 12 hours later. I did not see any drop in performance.



1 VM, 256K block, 0% read (100% write), 0% random (sequential):



4 VM, 256K block, 100% read, 0% random (sequential):



4 VM, 256K block, 0% read (100% write), 0% random (sequential):



Maximum system throughput (4 VM, 256K block, 100% read, 0% random, sequential):





I also note that, as with all the well-known vendors, the declared performance is achieved only in greenhouse lab conditions (a huge number of SAN uplinks, a specific LUN layout, dedicated servers with RISC architecture, and specially tuned load-generation programs).

Findings


Pros: great performance, easy setup, friendly interface.
Cons: beyond the capacity of one system, scaling is done with additional shelves. The "advanced" functionality (snapshots, replication, compression) is pushed out into the storage virtualization layer. IBM has built a clear storage hierarchy headed by a storage virtualizer (SAN Volume Controller or Storwize V7000), which provides tiering, virtualization, and centralized management of your storage network.

Bottom line: the IBM FlashSystem 900 does its job of handling hundreds of thousands of IOs. In the current test infrastructure we obtained 68% of the performance declared by the manufacturer, which still gives an impressive performance density per TB.

Source: https://habr.com/ru/post/273803/

