
Many people are familiar with a PostgreSQL behavior that leads to tables inflating, known as table bloat. It shows up under intensive data modification, both with frequent UPDATEs and with INSERT/DELETE operations, and the result is reduced performance. Let's look at why this happens and how to deal with it.
Tables in PostgreSQL are stored as 8 KB pages into which records are placed. When a page is completely filled with records, a new page is added to the table. When records are deleted with DELETE or changed with UPDATE, the space occupied by the old versions cannot be reused right away. First the autovacuum process, or the VACUUM command, has to walk over the modified pages and mark that space as free; only then can new records be safely written into it. If autovacuum cannot keep up, for example because a lot of data is changing at once or simply because of poor settings, new pages will be added to the table unnecessarily as new records arrive. And even after the cleanup reaches our deleted records, those new pages will remain. The table ends up sparse in terms of record density. This is called table bloat.
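To see how badly a particular table is bloated, you can use the standard pgstattuple contrib extension; here is a minimal sketch, with a hypothetical table name bloated_table:

CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- dead_tuple_percent and free_percent show how much of the table is dead rows and reusable free space
SELECT table_len, dead_tuple_percent, free_percent
FROM pgstattuple('bloated_table');
-- the physical size of the table on disk
SELECT pg_size_pretty(pg_relation_size('bloated_table'));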
The cleanup procedure, autovacuum or VACUUM, can reduce the size of the table by cutting off completely empty pages, but only if they are at the very end of the table. To shrink a table as much as possible, PostgreSQL offers VACUUM FULL and CLUSTER, but both of these methods take heavy, long-held locks on the table, which is not always an acceptable solution.
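For completeness, this is what the blocking variants look like; the table and index names here are hypothetical:

-- both commands rewrite the table completely and hold an ACCESS EXCLUSIVE lock,
-- blocking all reads and writes for the duration
VACUUM FULL bloated_table;
CLUSTER bloated_table USING bloated_table_pkey;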
Now consider one of the solutions. When a record is updated with UPDATE and there is free space in the table, the new version goes into that free space without new pages being allocated, with preference given to free space closer to the beginning of the table. If you update the table with so-called fake updates of the form some_column = some_column applied to the last page, at some point all records from the last page will move into free space on earlier pages. After several such passes the last pages become empty, and an ordinary, non-blocking VACUUM can cut them off from the table, reducing its size.
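Roughly, one manual pass of this technique might look like the following sketch (the table bloated_table and column some_column are hypothetical; the pgcompactor utility described below automates exactly this loop):

-- how many pages the table currently occupies (pages are numbered from 0);
-- relpages is an estimate refreshed by VACUUM/ANALYZE
SELECT relpages FROM pg_class WHERE relname = 'bloated_table';

-- "fake" update of the rows sitting on the last page;
-- (ctid::text::point)[0] extracts the page number from a row's physical address,
-- and the new row versions land in free space earlier in the table
UPDATE bloated_table
   SET some_column = some_column
 WHERE (ctid::text::point)[0] = (SELECT relpages - 1 FROM pg_class
                                  WHERE relname = 'bloated_table');

-- an ordinary, non-blocking VACUUM can now truncate the emptied trailing pages
VACUUM bloated_table;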
As a result, this technique lets you compact the table considerably without taking any heavy locks, and therefore without interfering with other sessions or with normal database operation.
And now the most important part: there is a utility, pgcompactor, that automates this procedure.
Its main features are:
- it has no dependencies other than Perl >= 5.8.8, so you can simply copy pgcompactor to the server and run it;
- it works through the DBD::Pg or DBD::PgPP adapters, or even through the standard psql utility if the first two are not available on the server;
- it can process individual tables as well as all tables within a schema, a database, or the entire cluster;
- databases, schemas, or tables can be excluded from processing;
- it analyzes bloat and processes only the tables where it is actually present; for more accurate estimates it is recommended to install the pgstattuple extension;
- it analyzes and rebuilds bloated indexes;
- it analyzes and rebuilds bloated unique constraints and primary keys;
- it works incrementally, i.e. you can stop the compaction process at any point without harming anything;
- it adjusts dynamically to the current database load so as not to affect the performance of user queries (tunable at startup);
- it gives administrators recommendations, accompanied by ready-made DDL, for rebuilding database objects that cannot be rebuilt automatically.
A couple of examples of use:
running on the entire cluster with forced index rebuilding:
# pgcompactor --all --reindex
running on a single table (with index rebuilding):
# pgcompactor --reindex --dbname geodata --table cities --verbose info > pgcompactor.log 2>&1
The result:
The table size decreased from 9.2 GB to 5.6 GB. The total size of all indexes dropped from 7.5 GB to 2.8 GB.

Project URL:
github.com/PostgreSQL-Consulting/pgcompacttable

Many thanks to the authors of the utility for their work! This is a really useful tool.
Thanks for your attention!