
New drives - what's the future?


If in agriculture autumn is the time to count your chickens, in the IT industry it is traditionally the season for announcing new products. And although the end of autumn is still a way off and there is every chance of more interesting announcements to come, enough has already been announced to deserve attention. Moreover, some of the trends are quite curious.

At first glance, everything is pretty obvious: hard drive manufacturers have staged another capacity race, enthusiastically announcing 6, 8, and even 10 TB models. But the devil is in the details, or to be more precise, in how these drives are meant to be used. Let us say right away that we will be talking about the server side of drive usage.

So what's in the details?

First, a few words about the general trend: the few remaining hard drive manufacturers are all diligently expanding their product lines. Until recently the choice was simple: besides the capacity, you only had to settle on the platter rotation speed and the interface, after which the choice narrowed down to one or two product lines. And now? Enterprise, Cloud, AV, NAS, Green, Performance. The reason for this diversity is simple: drive manufacturers are moving away from universal designs in order to cut their own production costs, leaving each drive with only the minimal set of features it actually needs. The bonus of this approach is the storm of marketing materials continuously pouring down on the user.

Well, now let's talk about specifics. Let's start with WD.


WD Ae

Users had barely finished discussing the Red Pro series, aimed at the rather narrow “not-too-big NAS” market segment, when the company rolled out the altogether remarkable Ae series, already its fourth line of data-center drives. The “miracle” of the series is that the drive capacity (there is only one model in the series so far) is floating and currently ranges from 6 to 6.5 TB: whatever capacity a given production batch turns out to have is what ships. Obviously, such an unusual product can only be sold by “volume”, that is, in batches rather than as individual units, otherwise buyers would never understand why a neighbor got more capacity for the same money. The manufacturer itself speaks of batches “from twenty drives”.

The second interesting feature of these drives is the workload they are designed for. Open the press release and you read that the MTBF is only 500,000 hours, two to four times lower than what is usually stated for server drives. But the most interesting part is this: the rated annual workload for the drive is 60 TB. Yes, no mistake, about 10 overwrites of the entire capacity per year.


ETegro Fastor JS200 G3 disk shelf

The explanation for these figures is simple: the drive is designed to store cold, one might even say ice-cold, data. In effect, this is a drive designed to replace tape libraries, and its job is to spend 95% of its time in deep sleep. The driver behind this shift is the cost of storing data, which on 7,200 rpm drives comes out to about 5-6 cents per gigabyte. That is still more expensive than storing data on tape, but the price is already acceptable for most. And while access latency on tape is a few seconds at best, data on a drive in a disk shelf can be retrieved in under a hundred milliseconds, and there is no difficulty in scaling up the number of drives, even if the shelves end up filling an entire rack.
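
To put the quoted figure in perspective, here is a back-of-the-envelope sketch; only the ~5 cents per gigabyte comes from above, while the 6 TB drive size and the 45-drive shelf are illustrative assumptions.

```python
# Back-of-the-envelope cost of populating cold storage at the quoted ~5 cents/GB.
COST_PER_GB = 0.05        # quoted figure: about 5-6 cents per gigabyte
DRIVE_TB = 6              # assumed: one Ae-class 6 TB drive
SHELF_DRIVES = 45         # assumed: drives per disk shelf

drive_cost = DRIVE_TB * 1000 * COST_PER_GB
shelf_tb = DRIVE_TB * SHELF_DRIVES
shelf_cost = shelf_tb * 1000 * COST_PER_GB
print(f"~${drive_cost:.0f} per 6 TB drive; ~{shelf_tb} TB and ~${shelf_cost:,.0f} per shelf")
```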


Seagate Enterprise Capacity, 8 TB


HGST, 10 TB
The next "interesting" we have provided Seagate and HGST. The first ones announced that they already ship 8-TB hard drives, and the latter noticed that if we add to this the proprietary technology of disk filling with helium that is already used in the Ultrastar He6 and He8 series , you can easily get 10 TB from a disk of standard sizes. And these two new items are united by the fact that they use the technology of tiled recording (Shingled Magnetic Recording, SMR).


The essence of shingled recording

Recall that SMR exploits the fact that a drive's read head is narrower than its write head. The tracks can therefore be laid down overlapping one another, leaving “on the surface” only a relatively narrow strip of each track, enough for reliable reading. Writing is still done with wide tracks and strong fields. Taken together, this makes it possible to increase recording density by eliminating the inter-track gaps while preserving reliability.

Talk of this technology has been going on for a very long time, but everything points to 2015 being the year it actually arrives. Just how long the talk has been going on is easy to see from, for example, this rather old roadmap:


The development of hard drive technology

The technology itself is quite elegant, because it allows density to be increased without a radical overhaul of production equipment, following the same path once taken in the move to drives with 4 KB sectors (Advanced Format). But it has one drawback: as soon as you need to modify data, that is, write into an area that has data around it, you run into the fact that doing so will erase that very neighboring data.


The overlapping write area

The way out of this situation is obvious: the shingled tracks are laid down in bands of several tracks each, with a standard inter-track gap left between bands. When writing, you have to take part of the band (from the modified data to the end of the band), modify it, and then write it back. On the one hand, the band should be wide enough for the shingled overlapping of tracks to still pay off; on the other hand, a larger band takes longer to read, longer to write back, and has to be buffered somewhere in the meantime. Hard drive manufacturers have not yet disclosed the implementation details, but presumably a band is several dozen tracks, and each track, mind you, now holds more than 1 MB.
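
A minimal sketch of this read-modify-write scheme, purely for illustration: the 40-track band is an assumption (the manufacturers have not disclosed the real figure), and real firmware naturally does all of this internally.

```python
# Sketch of SMR band read-modify-write: a write anywhere in a band forces the
# drive to rewrite the band from the modified spot to the band's end.
TRACK_BYTES = 1_500_000        # ~1.5 MB per track (worked out just below)
TRACKS_PER_BAND = 40           # assumed: "several dozen tracks" per band
BAND_BYTES = TRACK_BYTES * TRACKS_PER_BAND

band = bytearray(BAND_BYTES)   # stand-in for one shingled band on the platter

def write_to_band(band: bytearray, offset: int, data: bytes) -> int:
    """Write `data` at `offset`; return how many bytes were physically rewritten."""
    tail = bytes(band[offset:])          # 1. read from the modified point to band end
    tail = data + tail[len(data):]       # 2. apply the modification in a buffer
    band[offset:] = tail                 # 3. write the whole tail back (the SMR penalty)
    return len(tail)

rewritten = write_to_band(band, offset=10 * TRACK_BYTES, data=b"\xff" * 4096)
print(f"4 KB logical write -> {rewritten / 1e6:.0f} MB physically rewritten")
```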

A bit of math as a warm-up. Working out the approximate amount of data on a track is not hard: take the drive's sequential speed in MB/s, divide it by the rotation speed, and multiply by the 60 seconds in a minute. You get the number of megabytes read per revolution, that is, per track. For current 4 TB drives at 7,200 rpm these figures come out to about 180 MB/s and roughly 1.5 MB per track.
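
The same arithmetic in a couple of lines, using the figures quoted above:

```python
# MB per track = (sequential speed, MB/s) / (revolutions per second)
seq_mb_s = 180                   # quoted for a current 4 TB, 7,200 rpm drive
rps = 7200 / 60                  # revolutions per second
print(f"{seq_mb_s / rps:.1f} MB per track")   # -> 1.5 MB per track
```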

I think everyone can already see what we are driving at. Yes, we get terrible latency on writes to random addresses: on average we will have to read several tens of megabytes and then write them back, which takes a tangible fraction of a second. Not as bad as on tape drives, but you can forget about the usual milliseconds. Reads, of course, behave entirely traditionally, which means the drives are perfectly suitable for any use that follows a WORM (Write Once, Read Many) pattern or for storing cold data. In theory, drive manufacturers could creatively borrow from the SSD experience and adapt TRIM and address translation for themselves, but for now this is discussed only at the level of theory.
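
To put a rough number on that “tangible fraction of a second”: a sketch assuming the same 40-track band as above and typical seek plus rotational latency for a 7,200 rpm drive (both are assumptions, not vendor numbers).

```python
# Rough latency of a single random write on an SMR drive (band read-modify-write).
MB_PER_TRACK = 1.5
TRACKS_PER_BAND = 40             # assumed, as above
SEQ_MB_S = 180
SEEK_PLUS_ROTATION_S = 0.012     # assumed: ~8 ms seek + ~4 ms average rotational latency

tail_mb = MB_PER_TRACK * TRACKS_PER_BAND / 2               # on average the write lands mid-band
latency_s = SEEK_PLUS_ROTATION_S + 2 * tail_mb / SEQ_MB_S  # read the tail, then write it back
print(f"~{tail_mb:.0f} MB moved, ~{latency_s * 1000:.0f} ms per random write")
```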

There is another important consequence of organizing hard drives this way: it looks like switching to such drives will finish off RAID arrays with parity (checksum) writes. The write blocks there are much smaller than an SMR band, which means any rewrite of old data incurs a monstrous penalty. This comes as an “excellent” addition to the fact that RAID5 on 7,200 rpm drives with a typical unrecoverable error rate of 1 in 10^15 bits already becomes an unreliable storage option once the array exceeds roughly 100 TB: the odds are that such an error will occur at least once during an array rebuild. And that volume is just a single disk shelf. The cherry on the cake is the utterly inhuman rebuild time of such arrays, during which the disk subsystem runs at reduced performance. So there is every chance that in the near future the most popular way to store data will be rows of JBOD disk shelves, with the data on their drives replicated multiple times at the OS level.
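
A quick sanity check of that claim, assuming the entire 100 TB of surviving data has to be read back during the rebuild:

```python
import math

# Chance of at least one unrecoverable read error (URE) while reading ~100 TB
# back during a RAID5 rebuild, at the quoted error rate of 1 per 10^15 bits.
URE_PER_BIT = 1e-15
ARRAY_TB = 100
bits_read = ARRAY_TB * 1e12 * 8        # assume the whole surviving array is read

expected_errors = bits_read * URE_PER_BIT
p_at_least_one = 1 - math.exp(-expected_errors)    # Poisson approximation
print(f"expected UREs: {expected_errors:.1f}, P(at least one) ~ {p_at_least_one:.0%}")
# -> expected UREs: 0.8, P(at least one) ~ 55%
```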

As for the disk subsystem for “hot” data, SSDs have ruled the roost here for several years now, and no radical changes are foreseen. Depending on the amount of data, the job will be handled either by local drives in the server, such as the HGST Ultrastar SN100 or the Intel SSD DC P3700, whose capacities have already reached 2 TB, or by entire all-flash disk shelves. The former have moved to the NVMe interface, which provides minimal latency and more efficient operation under genuinely heavy loads (see our previous article for details). The developers of the NVMe protocol are not sitting idle either and have already announced work on the NVMe over Fabrics standard, which will make it possible to keep all the advantages of NVMe when working over interconnects such as Ethernet with RDMA, InfiniBand, and Intel Omni Scale Fabric. The Fibre Channel Industry Association (FCIA), by the way, has already set up a separate working group for this. But that is still a matter of the somewhat more distant future; in the near term we are seeing a clear migration to SAS 12G, around which the infrastructure of controllers and expanders has already begun to take shape.

Source: https://habr.com/ru/post/238535/
