
Ways to use PostgreSQL more efficiently

After reading the article "Databases in online games", and especially the comments to it, I was once again saddened by the thought that many developers switch the database in their project, trying to solve their problems with that move, without exhausting all the possibilities offered by the database being replaced. I take part in a project whose database is characterized by:

For historical reasons, as the saying goes, the whole thing is, ahem, one single database without any sharding, which could have solved a number of the problems. That makes it all the more pressing to find solutions that rely precisely on PostgreSQL's own features, solutions that can be implemented at little cost.

I will describe some of the problems we face and the solutions we applied. If someone finds it useful, so much the better. If someone corrects me, I will only be happy to discover a weakness in these approaches and to make the database more efficient. Everything described can be applied without touching the application logic, solely by modifying the database schema.

We are talking about PostgreSQL 9.2 (I have not tried 9.3 yet). And no, there will be no admin tips on tuning the configuration files; I would myself be glad to learn how to tune them properly.

Captain Obvious recommendations like "create the right indexes" will also be absent; I assume such basics lie on the surface.

Large table and index sizes


It is no secret that the more intensively data is written to the database, the faster query speed degrades on average and the database starts to "get dumb". This is a direct consequence of the transactional nature of the database. Because rows (tuples) are versioned, when a value changes in some row, the new value cannot simply be written over the previous one. Instead, PostgreSQL duplicates the row and writes its new, modified version separately. Roughly speaking, if you take a 1 GB table and UPDATE some field in every row in a single operation, the table will "suddenly" grow to 2 GB. The same thing happens with indexes. This growth in database size immediately affects query execution speed.
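The effect is easy to reproduce on a throwaway table; a minimal sketch (the table "demo" and its contents are made up purely for illustration):

    -- create a table of a million rows
    CREATE TABLE demo AS
        SELECT g AS id, md5(g::text) AS payload
        FROM generate_series(1, 1000000) AS g;

    SELECT pg_size_pretty(pg_total_relation_size('demo'));  -- size before

    -- rewrite every row in one statement: each row gets a new version,
    -- and the old versions stay on disk until VACUUM reclaims the space
    UPDATE demo SET payload = md5(payload);

    SELECT pg_size_pretty(pg_total_relation_size('demo'));  -- roughly twice the size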

Of course, AUTOVACUUM will run regularly (if it is configured), or you can run VACUUM manually as needed.
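Whether autovacuum keeps up can be checked through the statistics views; a minimal sketch (the LIMIT and the table name are arbitrary):

    -- dead-tuple counts and last (auto)vacuum timestamps per table
    SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 20;

    -- manual cleanup of a particular table, if it cannot wait for autovacuum
    VACUUM ANALYZE demo;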

What can be done to mitigate the problem?


Slow queries



Transaction counter wraparound


The MVCC machinery could not work without a transaction counter. Why this counter is still 32-bit is a mystery, but we have what we have: every 2-billion-with-change transactions the counter is due to wrap around. To avoid the irreparable, by that point the transaction IDs of all rows in the database must be reset; more precisely, a reserved FrozenXID is assigned to those rows. Tellingly, routine AUTOVACUUM and manual VACUUM do not take care of assigning FrozenXID, so to keep the database operational, once the transaction counter reaches a certain configured value an autovacuum is launched with the charming comment "to prevent wraparound". On large tables (tens of GB) this process can take weary hours, during which the table is unavailable for either reading or writing.
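How close a database or an individual table is to that forced freeze can be monitored; a minimal sketch (the threshold is controlled by autovacuum_freeze_max_age, 200 million transactions by default):

    -- age of the oldest unfrozen transaction ID per database
    SELECT datname, age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;

    -- the same per table: these are the ones that will be frozen first
    SELECT relname, age(relfrozenxid) AS xid_age
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY xid_age DESC
    LIMIT 20;

    SHOW autovacuum_freeze_max_age;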

I see no way to do anything here without touching the logic and architecture of the application. One would either have to reduce the number of transactions by making them larger (which can create other problems and is generally ineffective), or split the database into pieces, which also does not qualify as a quick operational fix.

Long transactions


This one is fairly obvious, but I will mention it just in case.

We are not really talking about long transactions per se, but about situations where the application, because of a bug, did not close a transaction and went off about its business (background task handlers capable of hanging for days are a typical example). In the list of active PostgreSQL processes such a connection will sit with the status "idle in transaction". A regularly running VACUUM frees the space occupied by old row versions (left over after updates or deletes) by looking at the ID of the transaction in which the delete or update was performed. Roughly speaking, if rows were deleted by transactions 1 and 2, and the minimum ID among the currently open transactions is 3, those two rows can be removed "for real", because they are not visible to any currently open transaction. With a stuck open transaction, however, VACUUM cannot remove a single row version created after that transaction began, and as a result the database can balloon horribly within a few hours.

There is only one piece of advice: monitor the current processes and catch the villains sitting in "idle in transaction".
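A minimal monitoring sketch (PostgreSQL 9.2 and later, where pg_stat_activity has the state column; the ten-minute threshold is arbitrary):

    -- connections that opened a transaction long ago and are doing nothing
    SELECT pid, usename, state, xact_start,
           now() - xact_start AS xact_age, query
    FROM pg_stat_activity
    WHERE state = 'idle in transaction'
      AND now() - xact_start > interval '10 minutes'
    ORDER BY xact_start;

    -- once the offender is confirmed, the connection can be killed (use with care)
    -- SELECT pg_terminate_backend(pid) FROM pg_stat_activity
    --  WHERE state = 'idle in transaction'
    --    AND now() - xact_start > interval '1 hour';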

Long, blocking backups


Replicas, for all their attractiveness, do not save you when the data has been mangled, for example, by a crooked patch. Someone deploys a patch to re-sort some objects, all that "goodness" gets replicated, and that's that. So backups are needed regardless. In our case a backup of the large database takes about an hour, and for practically all of that time the project is idle. Here, frankly, I cannot advise anything; on the contrary, I would be glad to hear how others cope with this. The only thought is to split into shards and back up in chunks, while working out how to preserve data consistency.

In version 9.3 noticeable work was done on views, which can also help optimize work with the database, but that is a separate topic.

Source: https://habr.com/ru/post/214183/
