
Basics of thin provisioning of volumes on storage systems (or the anniversary of 3PAR thin provisioning)

This year marks 10 years since the sale of the first 3PAR storage system with Thin Provisioning technology. Although the technology has become very popular and in demand, I have still not come across a sensible description of how it works at a low level. In this article I will try to shed light on what is, in my opinion, the most obscure side of thin provisioning: its technical basis, that is, how exactly the host interacts with the storage system. These technologies are no longer exclusive to 3PAR, since they are now industry standards, but because thin provisioning first appeared in 3PAR, I will allow myself to give all the laurels to these arrays.


Why thin provisioning is needed


For those who missed the previous 10 years, I will briefly recap what thin provisioning is and what it is for. Everyone else can skip this section with a clear conscience.

Thin provisioning is a storage virtualization technology that increases the efficiency with which storage system resources are used. It reduces the amount of disk space that is not directly used for storing application data. In particular, file systems are never 100% full under normal conditions: some free space is always needed for the file system to function normally and to leave room for data growth. With traditional volumes, this unused space is nevertheless allocated on the storage system for every logical volume. Logical volumes whose disk space is allocated in full at creation time are called “thick”. This model of using disk resources appeared together with the first data storage devices and is still alive.

It has the following disadvantages:



To solve such problems, thin provisioning and thin reclamation were invented, which we will discuss in more detail.

How thin provisioning works


The concept of thin provisioning is simple and consists of the following:




Based on the foregoing, it is easy to outline how thin provisioning works. When the storage system receives a SCSI Write command (encapsulated in an FC, SAS, iSCSI, etc. stack), it allocates another chunk of backend space and writes the data from the SCSI Write there. In the case of 3PAR, space is allocated in 16 KB blocks.
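The allocate-on-write behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `ThinVolume` and its methods are hypothetical names invented for illustration, the backend is a plain dictionary, and only the 16 KB allocation unit comes from the text. It does not reflect how a real array lays out its metadata.

```python
CHUNK_SIZE = 16 * 1024  # allocation unit; 16 KB as stated for 3PAR


class ThinVolume:
    """Toy model (hypothetical): backend chunks are allocated on first write."""

    def __init__(self, virtual_size):
        self.virtual_size = virtual_size  # size reported to the host
        self.chunks = {}                  # chunk index -> bytearray, allocated chunks only

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.virtual_size:
            raise ValueError("write outside the volume")
        pos = 0
        while pos < len(data):
            index, within = divmod(offset + pos, CHUNK_SIZE)
            n = min(CHUNK_SIZE - within, len(data) - pos)
            # Allocate a zero-filled chunk the first time it is touched.
            if index not in self.chunks:
                self.chunks[index] = bytearray(CHUNK_SIZE)
            self.chunks[index][within:within + n] = data[pos:pos + n]
            pos += n

    def read(self, offset, length):
        if offset < 0 or offset + length > self.virtual_size:
            raise ValueError("read outside the volume")
        out = bytearray()
        pos = 0
        while pos < length:
            index, within = divmod(offset + pos, CHUNK_SIZE)
            n = min(CHUNK_SIZE - within, length - pos)
            chunk = self.chunks.get(index)
            # Unallocated regions read back as zeros.
            out += chunk[within:within + n] if chunk is not None else bytes(n)
            pos += n
        return bytes(out)

    def allocated_bytes(self):
        return len(self.chunks) * CHUNK_SIZE
```

A 1 TB volume created this way consumes nothing until the host writes to it, and a small write that straddles a chunk boundary consumes two chunks.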

How thin reclamation works


Now for a much more interesting and less obvious point: how the host interacts with the storage system to return free disk space to the common pool. This interaction is essential, since only the host knows which blocks can be released and which cannot. Thin reclamation was first implemented on 3PAR arrays and today is an industry standard approved by the InterNational Committee for Information Technology Standards (INCITS). The document is called T10 SBC-3 and extends the SCSI standard with new commands for interacting with storage systems (these commands were added in the eighteenth revision of the document on February 23, 2009). There is a similar standard for ATA / SATA devices.

To implement thin provisioning, the standard provides for 3 SCSI commands:
  1. UNMAP
  2. WRITE SAME
  3. GET LBA STATUS

The standard requires every storage system with thin provisioning to support at least the UNMAP command or the WRITE SAME command with the unmap bit. Let us look at the commands the standard describes.

UNMAP

Tells the storage system to release one or more contiguous ranges of logical block addresses (LBAs). The storage system should mark these LBAs as free (unmapped, in SCSI terms), free up the space on the backend and, in a background process, erase the data that was previously stored there in case these blocks are later allocated to another host. The command carries only service information: a set of pairs, each consisting of an “LBA Address” and a “Number of Logical Blocks”.
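To make the "set of pairs" concrete, here is a minimal sketch that packs an UNMAP command descriptor block and its parameter list. The field layout (a 10-byte CDB, an 8-byte parameter list header, 16-byte block descriptors) follows my reading of SBC-3 and should be checked against the standard before use. The function name `build_unmap` is hypothetical, and the sketch only builds bytes; it does not send anything to a device.

```python
import struct

UNMAP_OPCODE = 0x42  # command code given in the text


def build_unmap(ranges):
    """Return (cdb, parameter_list) for a list of (lba, block_count) pairs."""
    # One 16-byte descriptor per range: 8-byte LBA, 4-byte block count, 4 reserved bytes.
    descriptors = b"".join(
        struct.pack(">QI4x", lba, count) for lba, count in ranges
    )
    # Header: UNMAP DATA LENGTH (number of bytes following that field),
    # UNMAP BLOCK DESCRIPTOR DATA LENGTH, then 4 reserved bytes.
    header = struct.pack(">HH4x", 6 + len(descriptors), len(descriptors))
    parameter_list = header + descriptors
    # 10-byte CDB: opcode, flags, 4 reserved bytes, group number,
    # 2-byte parameter list length, control byte.
    cdb = struct.pack(">BB4xBHB", UNMAP_OPCODE, 0, 0, len(parameter_list), 0)
    return cdb, parameter_list
```

For example, `build_unmap([(0, 8), (4096, 16)])` describes two ranges in one command, which is exactly what WRITE SAME cannot do.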


WRITE SAME

If for some reason the host does not want to use the UNMAP command, it can get a similar effect with the WRITE SAME command, which provides an unmap bit for this purpose. If a WRITE SAME command with the unmap bit set arrives at an array that supports thin provisioning and the target volume is thin, the array does the same thing as for the UNMAP command. The difference from UNMAP (42h) is that WRITE SAME cannot specify many ranges of blocks to release: it takes only one pair of "LBA Address" and "Number of Logical Blocks".
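The single-range limitation is visible in the shape of the command itself. The sketch below packs a WRITE SAME(16) descriptor block. The opcode 93h and the position of the unmap bit (bit 3 of byte 1) are taken from my reading of SBC-3 rather than from this article, so treat them as assumptions to verify. The helper name `build_write_same_16` is hypothetical.

```python
import struct

WRITE_SAME_16 = 0x93  # assumed opcode for WRITE SAME(16), per SBC-3
UNMAP_BIT = 0x08      # assumed position of the unmap bit: bit 3 of byte 1


def build_write_same_16(lba, block_count, unmap=False):
    """Return a 16-byte WRITE SAME(16) CDB covering exactly one LBA range."""
    flags = UNMAP_BIT if unmap else 0
    # opcode, flags, 8-byte LBA, 4-byte block count, group number, control
    return struct.pack(">BBQIBB", WRITE_SAME_16, flags, lba, block_count, 0, 0)
```

The CDB has room for one LBA and one block count only, so a host that wants to release many scattered ranges has to issue one WRITE SAME per range.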

Also, do not forget that WRITE SAME is primarily a command for writing data. If the unmap bit is not set, the storage system does not support thin provisioning, or the volume is thick, then an ordinary write is performed at the specified LBA. It follows that in these cases a SCSI READ must return exactly the data that was written there. Here some vendors, Hewlett-Packard among them, take a shortcut: instead of writing the same data over and over (for example, zeros), they mark these blocks in the logical volume metadata as allocated but “filled with zeros”. This technology is called zero detection.
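Zero detection can be sketched as a store that records all-zero chunks in metadata instead of writing them out. This is an illustration only: `ZeroDetectingStore` and its methods are hypothetical, the 16 KB chunk size is borrowed from the 3PAR allocation unit mentioned earlier, and real arrays implement the check in hardware or firmware with their own metadata formats.

```python
CHUNK_SIZE = 16 * 1024
ZERO_CHUNK = bytes(CHUNK_SIZE)


class ZeroDetectingStore:
    """Toy model (hypothetical) of zero detection at chunk granularity."""

    def __init__(self):
        self.data = {}       # chunk index -> bytes actually stored on the backend
        self.zeroed = set()  # chunk indexes recorded in metadata as "filled with zeros"

    def write_chunk(self, index, payload):
        if len(payload) != CHUNK_SIZE:
            raise ValueError("payload must be exactly one chunk")
        if payload == ZERO_CHUNK:
            # Record the fact in metadata and release any backend space.
            self.data.pop(index, None)
            self.zeroed.add(index)
        else:
            self.zeroed.discard(index)
            self.data[index] = bytes(payload)

    def read_chunk(self, index):
        # Reads still return exactly what was written, including zeros.
        return self.data.get(index, ZERO_CHUNK)

    def stored_bytes(self):
        return len(self.data) * CHUNK_SIZE
```

The host sees no difference on read, which is what keeps this shortcut compatible with the requirement that READ returns the written data.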


GET LBA STATUS

This is a service operation (device-specific) that uses the command code SERVICE ACTION IN (9Eh). It lets the server find out:
1. Whether the volume is thin provisioned.
2. The status of a specific block on the storage system (whether real capacity is allocated for it on the backend or not).
3. The thin provisioning granularity of the volume.
4. Limits (alert threshold and maximum volume size).

The command is very useful, for example, when the host scans in the background for blocks that are allocated on the array but not used by the host for data storage, or when migrating from thick to thin volumes.
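As a rough illustration of the per-block status query, the sketch below builds a GET LBA STATUS descriptor block and parses the returned list of extents. The service action code 12h, the 8-byte response header, the 16-byte descriptor layout and the status values (0 mapped, 1 deallocated, 2 anchored) come from my reading of SBC-3, not from this article, so they are assumptions to verify. The function names are hypothetical and no device I/O is performed.

```python
import struct

SERVICE_ACTION_IN_16 = 0x9E  # command code given in the text
GET_LBA_STATUS = 0x12        # assumed service action code, per SBC-3
STATUS_NAMES = {0: "mapped", 1: "deallocated", 2: "anchored"}


def build_get_lba_status(start_lba, allocation_length):
    """Return a 16-byte CDB asking for LBA status from start_lba onward."""
    # opcode, service action, 8-byte starting LBA,
    # 4-byte allocation length, reserved byte, control byte
    return struct.pack(">BBQIBB", SERVICE_ACTION_IN_16, GET_LBA_STATUS,
                       start_lba, allocation_length, 0, 0)


def parse_lba_status(response):
    """Return a list of (lba, block_count, status_name) extents."""
    # The first 4 bytes hold the number of bytes that follow them.
    (data_length,) = struct.unpack_from(">I", response, 0)
    end = min(len(response), 4 + data_length)
    extents = []
    # Descriptors start after the 8-byte header; each is 16 bytes long.
    for offset in range(8, end - 15, 16):
        lba, count, status = struct.unpack_from(">QIB", response, offset)
        extents.append((lba, count, STATUS_NAMES.get(status & 0x0F, "reserved")))
    return extents
```

A host-side scanner could compare the "mapped" extents returned here with the file system's own free-space map and send UNMAP for ranges that are mapped on the array but free on the host.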


In conclusion


I am very glad that you have read to the last lines! Unfortunately, I have said nothing about thin provisioning support in file storage, databases and operating systems, nor about when it makes sense to use it at all. In my opinion this is very interesting, but it is a voluminous topic. Maybe I will come back to it later.

Source: https://habr.com/ru/post/170389/
