At one of our rather large clients, whose electronic document management system is used by more than 10,000 concurrent users every day, we applied so-called sparse columns. This article is an attempt to bring together in one place the prerequisites for, and the results of, applying this functionality (and a few other DBMS settings).
Problems and background
To dive into the subject, a couple of words about the system: it is a product whose development began in the 2000s and which is still actively evolving. The product has a client-server architecture with several application servers.
Microsoft SQL Server is used as the server-side DBMS.
Given that the system is no longer "new", the database structure contains mechanisms, options, and objects whose use looks unreasonable and outdated today. We are gradually moving away from these objects, but there are situations where they are still used.
Premise #1
During the next performance audit, together with the customer, we noticed the rapid growth of one of the tables (let's call it table X). The volume of table X had reached almost 350 GB (for reference, the entire database is about 2 TB). The distribution between the actual table data and the indexes was as follows:
- the data took up about 115 GB;
- the remaining ~235 GB was occupied by indexes.
That is, a rather unusual situation in which the indexes on the table are roughly twice as large as the data itself. In other words, we get quite high overhead, which in turn negatively affects:
- the duration of inserts/updates in this table (the more indexes, the "more expensive" the operation);
- the duration of maintenance operations on these indexes;
- the duration of database backup and restore in case of failure;
- disk space requirements.
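A quick way to see how a table's volume is split between data and indexes is the standard sp_spaceused procedure (a minimal sketch; the table name here is illustrative):

```sql
-- Reports reserved size, data size, index size and unused space for one table.
EXEC sp_spaceused N'dbo.TableX';
```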
Premise #2
In broad terms, DBMS operation can be described as follows: before processing, all data is loaded from disk into the buffer pool (cache). This reduces the number of disk operations and speeds up the processing of the most frequently used data. The mechanism is described in more detail, for example, in this article. The efficiency of the buffer pool can be monitored indirectly using the Page Life Expectancy counter - the lifetime of a page in the buffer pool.
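The current value of this counter can be read, for example, from the performance counter DMV (a sketch; on a named instance the object_name prefix will differ):

```sql
-- Current Page Life Expectancy, in seconds, from the Buffer Manager counters.
SELECT counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';
```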
Over an interval of several months we observed a negative trend: the page lifetime in the buffer pool was decreasing.
Inaction could lead to:
- a significant increase in the load on the disk subsystem;
- an increase in the duration of user operations.
As a result, it was decided to conduct an analysis of the causes.
Since the system had not been static during those months and was constantly being modified, we decided to start the analysis with the contents of the buffer pool. For this we used the data of the dynamic management view sys.dm_os_buffer_descriptors.
Example query:

```sql
SELECT indexes.name AS index_name,
       objects.name AS object_name,
       objects.type_desc AS object_type_description,
       COUNT(*) AS buffer_cache_pages,
       COUNT(*) * 8 / 1024 AS buffer_cache_used_MB
FROM sys.dm_os_buffer_descriptors
INNER JOIN sys.allocation_units
        ON allocation_units.allocation_unit_id = dm_os_buffer_descriptors.allocation_unit_id
INNER JOIN sys.partitions
        ON ((allocation_units.container_id = partitions.hobt_id AND type IN (1, 3))
         OR (allocation_units.container_id = partitions.partition_id AND type IN (2)))
INNER JOIN sys.objects
        ON partitions.object_id = objects.object_id
INNER JOIN sys.indexes
        ON objects.object_id = indexes.object_id
       AND partitions.index_id = indexes.index_id
WHERE allocation_units.type IN (1, 2, 3)
  AND objects.is_ms_shipped = 0
  AND dm_os_buffer_descriptors.database_id = DB_ID()
GROUP BY indexes.name, objects.name, objects.type_desc
ORDER BY COUNT(*) DESC;
```
Out of a buffer pool of ~185 GB, about 80-90 GB was occupied by data of the clustered index of our problem table X. The rest of the buffer pool was distributed fairly evenly among the other indexes. It followed that the maximum effect could be obtained by somehow optimizing the data of table X (in this case, its clustered index).
Premise #3
Practice shows that accumulating a large amount of data in a single table will sooner or later affect the performance of at least some, if not all, of the operations involving that table. The situation is aggravated non-linearly when the table has a large number of columns.
In addition, when we analyzed how table X was filled with data, we saw the following picture: in almost every row only a certain subset of the columns was populated (this is how the system achieves flexibility and adaptation to specific business requirements). This again leads to low efficiency of data storage and processing, since some cells store no information, yet space for them is still reserved (for example, adding an empty column of type int increases the storage cost of the table by at least 4 bytes * the number of rows in the table).
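For a rough illustration (the row count here is purely hypothetical): on a table with 100 million rows, a single completely empty int column reserves at least 4 bytes × 100,000,000 ≈ 380 MB, before any index overhead.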
Solutions / Corrections
Taking into account all the inputs described above, we identified four directions for further work:
- filtered indexes;
- data compression by means of DBMS (data compression);
- sparse columns;
- splitting table X into several smaller tables.
Filtered Indexes
Here is what the official documentation tells us:
"A filtered index is an optimized nonclustered index, especially suited to queries that select from a well-defined subset of data... A well-designed filtered index can improve query performance as well as reduce index maintenance and storage costs compared with full-table indexes."
Putting it a bit more simply, we are talking about the ability to create an index over only part of the data in a table; for example, we could create an index on table X for a specific business case.
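A minimal sketch of what such an index could look like (table, column, and filter values here are hypothetical, not taken from the real system):

```sql
-- Index only the rows belonging to one specific business case.
CREATE NONCLUSTERED INDEX IX_TableX_ActiveDocuments
    ON dbo.TableX (DocumentDate, DocumentTypeId)
    WHERE IsActive = 1;
```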
However, to apply such indexes we would have needed a newer version of the software, in which, among other things, the database structure was changed; in particular, in the new version the values of the client connection parameters to the DBMS were switched to ON.
In our case, though, that update was planned only in six months, and we could not wait that long. Moreover, we did not plan to rely on filtered indexes anyway, since they would have made, for example, the forced parameterization option ineffective.
Data compression
Since the client runs SQL Server 2012, two types of data compression are available in this version:
- page level compression (page compression);
- row-level compression.
SQL Server 2016 introduces some changes in this area, but in our case they were also irrelevant (migrating to the next version of SQL Server on powerful hardware is quite expensive from a financial point of view). So we looked at the first two options in more detail.
According to the documentation, page-level compression is a more CPU-intensive operation than row-level compression, so the page compression option was dropped immediately.
Next we considered row compression, but in the documentation we also stumbled upon the statement that even it consumes additional resources. And since the CPU was a very scarce resource in our case, this option had to be abandoned as well.
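For completeness, here is roughly how row compression could be enabled if CPU headroom allowed it (we did not apply this; table and index names are illustrative):

```sql
-- Rebuild the table with row-level compression.
ALTER TABLE dbo.TableX REBUILD WITH (DATA_COMPRESSION = ROW);

-- The same can be done for a specific index.
ALTER INDEX IX_TableX_SomeIndex ON dbo.TableX REBUILD WITH (DATA_COMPRESSION = ROW);
```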
Sparse columns
Sparse columns are ordinary columns with optimized storage for NULL values. They reduce the space required to store NULL values, but the cost of retrieving non-NULL values increases.
To achieve a positive effect, each particular column must contain a certain percentage of NULL values. This percentage depends on the data type of the column, for example:
Data type | Non-sparse bytes | Sparse bytes | NULL percentage |
---|---|---|---|
float | 4 | 8 | 64% |
datetime | 8 | 12 | 52% |
varchar | 8 | 12 | 52% |
int | 2 | 4 | 60% |
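Declaring a sparse column itself is trivial (a toy example, not taken from the real system):

```sql
-- NULL values in OptionalValue take no storage space;
-- non-NULL values cost a few extra bytes compared to a regular column.
CREATE TABLE dbo.DemoSparse
(
    Id            int NOT NULL PRIMARY KEY,
    OptionalValue int SPARSE NULL
);
```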
Moreover, not every column can be converted to sparse. A list of limitations and incompatibilities is given in the official documentation.
That is, to assess which columns could be converted to sparse we needed, broadly speaking, to analyze:
- the restrictions from the documentation that apply to a specific table/column;
- the actual percentage of NULL values in those columns.
A query helped with this; its text is available at the link below. On large volumes the query takes a long time, so it is recommended to specify the particular table you need to analyze.
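A much-simplified sketch of the same idea for a single column (the real query from the article works across all columns; table and column names here are hypothetical):

```sql
-- Share of NULL values in one candidate column of one table.
SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN SomeColumn IS NULL THEN 1 ELSE 0 END) AS null_rows,
       100.0 * SUM(CASE WHEN SomeColumn IS NULL THEN 1 ELSE 0 END)
             / NULLIF(COUNT(*), 0) AS null_percent
FROM dbo.TableX;
```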
Next, from the resulting list, you need to pick the columns that best satisfy the conditions (with the maximum share of NULL values) and convert them to sparse. The change itself is best done with the database in single_user mode to avoid long locks. After converting a column to sparse you need to rebuild the indexes; only then will you see the table shrink.
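A minimal sketch of this sequence (database, table, and column names are hypothetical):

```sql
-- Switch the database to single-user mode to avoid long blocking chains.
ALTER DATABASE MyDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- Mark the mostly-NULL column as sparse (the column must be nullable).
ALTER TABLE dbo.TableX ALTER COLUMN SomeColumn ADD SPARSE;

-- Rebuild the indexes so the freed space is actually reclaimed.
ALTER INDEX ALL ON dbo.TableX REBUILD;

-- Return the database to normal multi-user mode.
ALTER DATABASE MyDatabase SET MULTI_USER;
```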
Let me emphasize that using this mechanism did not increase the CPU load on the DBMS server (this was verified both in practice and in load testing).
Perhaps this is one of those rare situations that can be briefly depicted with the following pictures:

Splitting a table into smaller ones
In this case, the work could also be divided into blocks:
- making changes to the system architecture;
- reworking all application-level code for the new architecture.
Both items are quite expensive, so splitting the table came last in the list of possible optimizations, and it could be postponed indefinitely.
By the way, implementing this item has not been necessary so far. Whether that is good or bad for the evolution of the product as a whole, I think time will tell...
Sparse effect
Well, by way of conclusion, I would like to express the positive effects of using sparse columns in numbers:
- the volume of the clustered index of table X was reduced by about 2 times (a similar reduction in size was observed when rebuilding nonclustered indexes whose key fields had been converted to sparse);
- as a consequence of point 1, the efficiency of the buffer pool increased, since the amount of table X data held in the buffer pool decreased;
- as a consequence of points 1-2, the page lifetime in the buffer pool increased (blue line), and as a result the load on the disks decreased (yellow line);
- the duration of some operations that read large amounts of data decreased, since the probability of finding the data in the buffer pool increased.
Summarizing the data in a table, we get:
Indicator | Improvement | Note |
---|---|---|
Clustered index volume of table X, GB | 2 times | On disk (HDD) |
Clustered index volume of table X in the buffer pool, GB | 2 times | In memory (RAM) |
Page Life Expectancy, sec | 2 times | The lifetime of a page in the buffer pool |
Disk Transfers/sec, IOPS | 1.6 times | The number of disk operations; the load on the storage system decreased |
Sparse restrictions
However, in addition to the advantages, there were also restrictions:
- the choice of sparse columns needs to be revisited periodically: over time the distribution of NULL and non-NULL values in a sparse column may change, and using sparse may no longer be justified;
- the number of columns that can be converted to sparse is limited; if it is exceeded, error 576 may occur when rows are updated.
A detailed list of restrictions can be found here.
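To keep an eye on the first point, the columns that are currently sparse (and therefore worth re-checking from time to time) can be listed via the catalog views (the table name is illustrative):

```sql
-- List all sparse columns of a table together with their data types.
SELECT c.name AS column_name, t.name AS data_type
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('dbo.TableX')
  AND c.is_sparse = 1;
```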
If you have read the article to the end: before applying any of the settings described above, remember that the author bears no responsibility for possible loss of or damage to your data. Do not forget to make backups.
Perhaps someone has already applied the settings described in this article on high-load systems; please share in the comments what results you got.