This year marks 10 years since the sale of the first 3PAR storage system with Thin Provisioning technology. And although the technology has become very popular and in demand, I have still not come across a sensible description of how it works at a low level. In this article I will try to shed light on what I consider the "darkest" side of thin provisioning - the technical foundation of the technology, that is, how exactly the host interacts with the storage system. These technologies are no longer exclusive to 3PAR, since they are now industry standards, but since thin provisioning first appeared in 3PAR, I will allow myself to give all the laurels to those arrays.
Why thin provisioning is needed
For those who missed the previous 10 years, I will briefly recap what thin provisioning is and what it is for; everyone else can skip this section with a clear conscience.
Thin provisioning is a storage virtualization technology that increases the efficiency of storage resource usage. It reduces the amount of disk space that is allocated but not actually used for storing application data. In particular, file systems are never 100% full under normal conditions: a certain amount of free space is always needed to keep the file system working normally and to leave headroom for data growth. This not-actually-used space has to be allocated for every logical volume on the storage system. Logical volumes whose disk space is allocated in full at the moment of creation on the storage system are called "thick". This model of using disk resources appeared together with the first data storage devices and is still alive.

This model has the following disadvantages:
- Space allocated to one logical volume (but not used) cannot be used by another volume. With rapid data growth on one logical volume, sooner or later we will hit its size limit, and the fact that there is plenty of unused disk space on other logical volumes will not help us. In other words, free disk space is not a common pool from which any volume can draw capacity when needed; it is rigidly tied to each volume. Besides being terribly wasteful, this scheme is also inconvenient whenever capacity needs to be redistributed between volumes.
- Since application data growth is often very hard to predict, the size of thick volumes is usually chosen with a substantial margin. According to various studies, the utilization rate of storage systems with thick volumes ranges from 30 to 50 percent. Yet disk space not used for application data still costs money that could be spent on far more useful things.
- When replicating or taking snapshots of thick volumes, the disk array processes unused host blocks as well as used ones. Yet during replication it would be enough to copy only the occupied blocks, and when taking a snapshot a free block does not need to be copied into the snapshot (see copy-on-write) at all - it can simply be marked there as unoccupied. This replication technology is implemented in 3PAR arrays.

To solve such problems, thin provisioning and thin reclamation were invented, which we will discuss in more detail.
How thin provisioning works
The concept of thin provisioning is simple and consists of the following:

- At the time a logical volume (LUN) is created, the disk array does not allocate its full capacity. Only the LUN LBA -> backend physical address mapping table is initialized. The storage administrator specifies the maximum possible size of the volume and the fill threshold at which a warning will be issued.
- New data blocks are allocated to the logical volume as it fills up.
- When the server frees data blocks, it must report the freed blocks to the storage system so that they can be returned to the common pool. This technology is called thin reclamation and is described below.
- When the server requests the size of the volume (SCSI READ CAPACITY), the storage system reports the maximum volume size set by the storage administrator.
- The sum of the maximum sizes of all volumes on a storage system may exceed its physically available capacity.

Based on the above, it is easy to picture how thin provisioning works. When the storage system receives a SCSI WRITE command (encapsulated in an FC, SAS, iSCSI or other transport stack) for a region that has not yet been allocated, it allocates another chunk of space and writes the data from the SCSI WRITE there. In the case of 3PAR, chunks are allocated in 16K increments.
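To make the mechanics tangible, here is a deliberately simplified Python sketch of what allocation-on-write could look like. The class, the page pool and the mapping table are my own invention for illustration and have nothing to do with any vendor's real implementation:

```python
# Toy model of allocation-on-write for a thin volume.
# Purely illustrative: the mapping table, page size and pool are hypothetical.

PAGE_SIZE = 16 * 1024          # allocation granularity (16K, as in the 3PAR example)
BLOCK_SIZE = 512               # logical block size presented to the host

class ThinVolume:
    def __init__(self, max_size_bytes, pool):
        self.max_size = max_size_bytes   # what READ CAPACITY will report
        self.pool = pool                 # shared list of free backend pages
        self.map = {}                    # virtual page index -> backend page

    def write(self, lba, data):
        """Handle a SCSI WRITE: allocate backend pages only on first touch."""
        offset = lba * BLOCK_SIZE
        page = offset // PAGE_SIZE
        if page not in self.map:
            self.map[page] = self.pool.pop()   # lazy allocation from the common pool
        # ... the actual data would be written to self.map[page] here ...

    def read_capacity(self):
        """The host always sees the administrator-defined maximum size."""
        return self.max_size // BLOCK_SIZE

pool = list(range(1000))                  # 1000 free 16K pages on the backend
vol = ThinVolume(2 * 1024**4, pool)       # a "2 TB" volume that occupies nothing yet
vol.write(0, b"hello")                    # the first write allocates exactly one 16K page
print(len(vol.map), "pages allocated")    # -> 1
```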
How thin reclamation works
And now we come to the much more interesting and non-obvious part - how the host interacts with the storage system to return freed disk space to the common pool. This interaction is an extremely important nuance, since only the host knows which blocks can be released and which cannot. Thin reclamation technology was first implemented on 3PAR arrays and today is an industry standard approved by the InterNational Committee for Information Technology Standards (INCITS). The document is called T10 SBC-3 and extends the SCSI standard with new commands for interacting with storage systems (these commands were added in revision 18 of the document on February 23, 2009). A similar mechanism (TRIM) exists for ATA/SATA devices.
To implement thin provisioning, the standard provides for 3 SCSI commands:
- UNMAP
- WRITE SAME
- GET LBA STATUS
The standard requires every storage system with thin provisioning to support at least the UNMAP command or the WRITE SAME command with the unmap bit. Let us look at the API the standard describes.
UNMAP
Tells the storage system to release one or more contiguous ranges of Logical Block Addresses (LBAs). The storage system should mark these LBAs as free (unmapped, in SCSI terms), free the corresponding space on the backend, and erase the previously stored data in a background process in case those blocks are later allocated to another host. The command carries only service information: a set of pairs, each consisting of an "LBA address" and a "number of logical blocks".
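To show how little actually travels over the wire, here is a rough Python sketch of the UNMAP parameter list as I read it in SBC-3 (an 8-byte header followed by 16-byte block descriptors, all fields big-endian); double-check the field offsets against the standard before relying on them:

```python
# Building the UNMAP parameter list that accompanies the UNMAP (42h) CDB.
# Assumed layout: 8-byte header, then 16-byte descriptors (8-byte LBA,
# 4-byte block count, 4 reserved bytes).
import struct

def unmap_param_list(extents):
    """extents: list of (lba, number_of_blocks) pairs to release."""
    descriptors = b"".join(
        struct.pack(">QI4x", lba, nblocks) for lba, nblocks in extents
    )
    # Header: UNMAP DATA LENGTH, UNMAP BLOCK DESCRIPTOR DATA LENGTH, 4 reserved bytes
    header = struct.pack(">HH4x", len(descriptors) + 6, len(descriptors))
    return header + descriptors

# Ask the array to release two ranges in a single command.
payload = unmap_param_list([(0x1000, 2048), (0x80000, 4096)])
print(payload.hex())
```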

WRITE SAME
If for some reason the host does not want to use the UNMAP command, it can achieve a similar effect with the WRITE SAME command, which provides the unmap bit field for this purpose. If a WRITE SAME command with the unmap bit set arrives at an array with thin provisioning and the target volume is thin, the array does the same thing as for the UNMAP command. It differs from UNMAP (42h) in that WRITE SAME cannot specify many ranges to release: only a single "LBA address" and "number of logical blocks" pair can be given.
Also, do not forget that WRITE SAME is first and foremost a write command. If the unmap bit is not set, the storage system does not support thin provisioning, or the volume is thick, then an ordinary write is performed to the specified LBAs. It follows that in these cases SCSI READ must return exactly the data that was written there. Some manufacturers, HP among them, are cunning here: instead of actually writing a sequence of identical data (for example, zeros), they simply mark those blocks in the logical volume metadata as allocated but "filled with zeros". This technique is called zero detection.
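For illustration, here is roughly what a WRITE SAME (16) CDB with the unmap bit could look like. The opcode and bit positions reflect my reading of SBC-3, so treat the sketch as illustrative rather than authoritative:

```python
# Sketch of a WRITE SAME (16) CDB with the UNMAP bit set.
# Assumed layout: opcode 93h, flags byte, 8-byte LBA, 4-byte block count,
# group number, control byte.
import struct

def write_same16_cdb(lba, nblocks, unmap=True):
    flags = 0x08 if unmap else 0x00        # UNMAP bit assumed to be bit 3 of byte 1
    return struct.pack(">BBQIBB",
                       0x93,               # WRITE SAME (16) opcode
                       flags,
                       lba,                # starting LBA
                       nblocks,            # number of logical blocks
                       0,                  # group number
                       0)                  # control byte

cdb = write_same16_cdb(lba=0x1000, nblocks=2048)
print(cdb.hex())   # 16-byte CDB; the data-out buffer still carries one block (typically zeros)
```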

GET LBA STATUS
This is a service action that uses the SERVICE ACTION IN (9Eh) command code. It allows the server to find out:
1. Whether the volume is thin provisioned.
2. The status of a specific block on the storage system (whether real capacity is allocated for it on the backend or not).
3. The thin provisioning granularity of the volume.
4. The limits (warning threshold and maximum volume size).
The command is very useful, for example, for a background search from the host side for blocks that are allocated on the array but not used by the host to store data, or when migrating from thick to thin volumes.
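To give a feel for what the host actually gets back, here is a small Python sketch that parses GET LBA STATUS response data under my reading of the SBC-3 layout (an 8-byte header plus 16-byte descriptors carrying LBA, block count and provisioning status); the field offsets and the meaning of the length field should be verified against the standard:

```python
# Parsing GET LBA STATUS response data (SERVICE ACTION IN, 9Eh).
# Assumed status codes: 0 = mapped, 1 = deallocated, 2 = anchored.
import struct

STATUS = {0: "mapped", 1: "deallocated", 2: "anchored"}

def parse_lba_status(data):
    (param_len,) = struct.unpack_from(">I", data, 0)   # bytes following the length field
    descriptors = []
    offset = 8                                         # descriptors start after the header
    while offset + 16 <= 4 + param_len:
        lba, nblocks, status = struct.unpack_from(">QIB", data, offset)
        descriptors.append((lba, nblocks, STATUS.get(status & 0x0F, "reserved")))
        offset += 16
    return descriptors

# Example response: one mapped extent followed by one deallocated extent.
sample = struct.pack(">I4x", 4 + 2 * 16) + \
         struct.pack(">QIB3x", 0, 2048, 0) + \
         struct.pack(">QIB3x", 2048, 4096, 1)
for lba, nblocks, status in parse_lba_status(sample):
    print(f"LBA {lba}: {nblocks} blocks, {status}")
```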

In conclusion
I am very glad that you have read to the last lines! Unfortunately, I did not say anything about file system, database and OS support for thin provisioning, nor did I cover when it makes sense to use it at all - a very interesting but, alas, voluminous topic. Maybe I will come back to it later.