
Thin Provisioning: a “credit card” for storage


About 20 years ago, the SAN (Storage Area Network), a specialized network of storage devices, became part of the life of corporate IT departments. Centralized storage that hosts the disk partitions used by application servers, connected over special high-speed protocols, has significantly increased the efficiency and flexibility of disk space utilization.

The reasons behind the creation of the SAN were, in part, the same as those that earlier led to the ordinary local area network (LAN) and the shared devices on it. Instead of giving each client computer its own laser printer that sits idle 99% of the time, it is better to give it access to a shared printer: more powerful and more expensive, but with a higher average load, which raises resource utilization and cost effectiveness.
Likewise, a shared disk resource solved the problem of under-use. Today the smallest SATA disk on sale is usually at least 500 GB, and unless we use it to store video downloaded from torrents, in ordinary use it sits 90-95% empty. A typical OS installation takes 2-8 GB, the application and its data take a bit more, often several gigabytes, and the rest of the space stays free but, alas, unavailable to tasks on other computers that may need it.

That was the situation until the Storage Area Network (SAN) appeared: a special network for storing and transmitting data. It uses a special high-speed data transfer protocol, such as FC or iSCSI, which lets servers use disk space from a centralized storage device without significant loss of availability or speed. The total space is carved up and handed out to consumers in the sizes they need, and they see it as local disks of the computer rather than as partitions on a large shared storage system.

However, having partly solved the problem of provisioning storage space for applications with the SAN, we run into it again, only at a new level.

The fact is that in practice a system administrator never “cuts” a partition to exactly the requested or occupied size, because you always need room for a sudden increase in volume: growth of the data files in the case of a database, space for logs in the case of a web server, and so on. The law of nature today is that volumes of information grow, and sometimes grow exponentially.
So it is not surprising that a very large share of SAN space is handed out to disk partitions (usually called LUNs) without being occupied by the data of the corresponding application, while for other, needier applications it is no longer available.


So as not to start every day by enlarging data partitions, and not to face a task that suddenly “fell over” because it ran out of space, any storage administrator creates data partitions on the SAN with a margin, sometimes a significant one. In the results of one study on the subject I came across a mention that up to 60-80% of the space on SAN storage is allocated “for the future” while nothing currently occupies it.

Thus, simply by finding and using a way to allocate space on the disks of a SAN system dynamically, we can in general save up to 60-80% of the space, and this is expensive, high-speed disk space that at the moment lies uselessly “buried” in reserve inside carved-out LUNs.

The method of allocating storage space to an application not all at once when the disk is created, but “on demand”, as the application's data actually needs it, has become known as “thin provisioning”. Unfortunately, there is still no generally accepted Russian translation, so I prefer to call it “economical distribution”.

Surprisingly enough, you already know the principle of thin provisioning well from everyday life.
This is how bank credit works, for example. When a bank issues ten thousand credit cards with a credit limit of 500 thousand each, it does not keep five billion dollars in its accounts as backing, because it expects that cardholders will not spend all of the available credit at once; otherwise the bank simply would not have enough assets. Yet when you need it, you can use any amount within your limit, as long as the bank's total “pool” of funds is not exhausted.
Water and electricity suppliers work the same way. They provide the resource on the assumption that an entire multi-storey building will not open every tap or switch on every washing machine and electric kettle at once, and this more flexible consumption lets them save on the cost of infrastructure and capacity.

Multitasking operating systems and virtualization systems work in a similar way. You can run many programs at once, on the condition that they do not all want to load the processor at the same moment. Shared processor resources are allocated to an application as needed, yet each program “cherishes the illusion” that all 12 cores of your new server are available to it. This is true, just not for everyone at once.

The situation with thin provisioning is similar. If a program wants a 50 GB data partition (even though it currently holds only 10 GB of data), we can give it a partition that the application sees as 50 GB, but 40 GB of it will be “virtual” and will take up no space on the system's disks until real data is written there. This lets us avoid “locking up” space “in reserve” and use it as the need arises.
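
The 50 GB example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real storage system is implemented: the names `ThinPool` and `ThinVolume` are hypothetical, and the 100 GB pool size is an assumption chosen for the example.

```python
# Minimal sketch of thin provisioning; ThinPool and ThinVolume are
# hypothetical names for illustration, not any vendor's API.

class ThinPool:
    """Shared physical capacity, counted in gigabytes."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0

    def free_gb(self):
        return self.physical_gb - self.used_gb

    def take(self, gb):
        # Physical space is consumed only when data is actually written.
        if gb > self.free_gb():
            raise RuntimeError("pool exhausted: add physical disks")
        self.used_gb += gb


class ThinVolume:
    """A LUN that advertises virtual_gb but consumes only what is written."""

    def __init__(self, pool, virtual_gb):
        self.pool = pool
        self.virtual_gb = virtual_gb   # size the application sees
        self.written_gb = 0            # space really backed by disks

    def write(self, gb):
        if self.written_gb + gb > self.virtual_gb:
            raise ValueError("write exceeds the advertised volume size")
        self.pool.take(gb)
        self.written_gb += gb


pool = ThinPool(physical_gb=100)        # assumed pool size
lun = ThinVolume(pool, virtual_gb=50)   # the application sees 50 GB
lun.write(10)                           # but has written only 10 GB

print(lun.virtual_gb)   # 50
print(lun.written_gb)   # 10
print(pool.free_gb())   # 90
```

The remaining 90 GB stays in the pool, available to this LUN or to any other one, rather than being reserved up front.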

As far as I know, the first to use the thin provisioning principle in SAN storage was 3PAR, the company recently acquired by HP, for which this capability was the key (and almost the only) “trick” of its systems. NetApp, however, was also among the first to implement thin provisioning, in its FAS systems. It should not surprise you that it appeared so simply and quickly in systems already in the field (in effect, with a new release of the internal OS): the WAFL file system, created by NetApp in 1993, praised more than once and underlying all NetApp storage systems, made this very easy.

On storage systems that use thin provisioning, ALL free disk space is available for growing ANY LUN, for any application. Disk space on a networked storage system becomes a truly shared resource.

image

Do not be misled by the not entirely accurate visualization: when application A grows in occupied volume, the data of other applications does not move across the disks. Additional chunks of disk space are simply allocated from the free area and added to application A's partition. In theory this increases data fragmentation, but in practice the negative effect can be neglected, since space is allocated in relatively long chunks on the order of megabytes, and, as practice shows, the performance difference between thin and thick volumes is usually “below the measurable level”.
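
A rough sketch of this growth-by-chunks behaviour, under stated assumptions: `ExtentPool` is a hypothetical name, the 1 MB extent size simply stands in for “megabyte order”, and real systems use their own on-disk structures.

```python
# Sketch of on-demand extent allocation; ExtentPool and the 1 MB extent
# size are illustrative assumptions, not a real storage system's layout.

EXTENT_MB = 1  # assumed allocation unit


class ExtentPool:
    def __init__(self, total_extents):
        self.free = list(range(total_extents))  # free physical extent numbers
        self.maps = {}                          # volume name -> its physical extents

    def grow(self, volume, extents):
        """Append free physical extents to a volume; existing data stays put."""
        if extents > len(self.free):
            raise RuntimeError("not enough free extents in the pool")
        taken, self.free = self.free[:extents], self.free[extents:]
        self.maps.setdefault(volume, []).extend(taken)
        return taken

    def size_mb(self, volume):
        return len(self.maps.get(volume, [])) * EXTENT_MB


pool = ExtentPool(total_extents=10)
pool.grow("A", 2)                 # A gets extents 0 and 1
pool.grow("B", 3)                 # B gets extents 2, 3 and 4
b_before = list(pool.maps["B"])
pool.grow("A", 2)                 # A grows into free space: extents 5 and 6

print(pool.maps["A"])               # [0, 1, 5, 6]
print(pool.maps["B"] == b_before)   # True
print(pool.size_mb("A"))            # 4
```

Application A ends up fragmented (extents 0, 1, 5, 6), while B's extents are untouched, which is the point the picture does not convey well.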

Thus, thin provisioning solves the problem of inefficient space allocation in the SAN, saves space, simplifies the administrator's procedures for allocating storage space to applications, and makes reasonable use of so-called oversubscription, that is, allocating more space to applications than we physically have, on the expectation that the applications will not all need their space at the same time. As the need arises, we can easily increase the physical storage capacity later.
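
To make oversubscription concrete, here is a small hedged sketch in Python. The function names, the LUN sizes and the 80% warning threshold are all assumptions chosen for illustration; they do not come from any particular storage product.

```python
# Sketch of an oversubscription check; the function names and the 80%
# warning threshold are assumptions chosen for illustration.

def oversubscription_ratio(virtual_sizes_gb, physical_gb):
    """Promised (virtual) space divided by the space that physically exists."""
    return sum(virtual_sizes_gb) / physical_gb


def needs_more_disks(used_gb, physical_gb, threshold=0.8):
    """True when real usage crosses the threshold and capacity should be added."""
    return used_gb / physical_gb >= threshold


luns_gb = [50, 50, 100, 200]   # sizes the applications see
physical_gb = 200              # disks actually installed

print(oversubscription_ratio(luns_gb, physical_gb))     # 2.0
print(needs_more_disks(used_gb=90, physical_gb=200))    # False
print(needs_more_disks(used_gb=170, physical_gb=200))   # True
```

A ratio of 2.0 means twice as much space has been promised as exists. That is safe only while actual usage is watched, so that disks can be added before the pool runs out.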

Source: https://habr.com/ru/post/108534/

