
Watchdog for replication in PostgreSQL 9

Greetings. I want to share a small home-grown crutch of a script; perhaps it will be useful to someone.

Briefly about the main thing


Imagine the situation: there is a cluster of PostgreSQL servers, a master and n replicas. One black day, one (or several) of the replicas goes down. The reason does not matter: the hardware died, the cleaner caught the cable with a mop, or a UFO temporarily abducted the server. The result is the same: if the replica has been down for long enough, it will never catch up on its own.

The reason lies in the PostgreSQL replication process itself. As data comes in, the write-ahead log is flushed to disk (by default in segments of 16 MB each). In the master's config we use the wal_keep_segments option, which specifies how many of the most recent XLOG segments to keep. Say the parameter equals 16. That means the pg_xlog directory holds 16 log files at any moment, and when the 17th appears, an attempt is made to send the 1st one to the archive (in other words, there is constant rotation). If your replica falls behind by more than these 16 files, then when it tries to catch up with the master you will see a message in its log saying that the required piece of XLOG cannot be obtained, because on the master it has already gone off to the archive. And all because the default archive command, as a rule, looks like this:
cp %p /var/lib/postgresql/9.0/main/archive/%f
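
For context, the master's postgresql.conf in a setup like this might contain roughly the following (a minimal sketch; only wal_keep_segments = 16 and the archive path come from the text above, the other values are assumptions):

# postgresql.conf on the master (illustrative values)
wal_level = hot_standby       # needed for streaming replication in 9.0
max_wal_senders = 6           # one slot per connected replica
wal_keep_segments = 16        # keep the last 16 XLOG segments in pg_xlog
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/9.0/main/archive/%f'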

As you can see, that cp command performs no check at all on whether archiving the file will hurt anyone.
In essence, there is nothing fatal here: if the required archived file has not been deleted, you just need to slip it to the replica in the right place and it will be replayed on its own. The unpleasant part is that, depending on the load, you usually have to copy not one file but tens, hundreds, or thousands of them. On top of that, the process has to be watched, because while you are copying those files the master keeps archiving new ones. In the worst case, the only thing that saves you is re-seeding the replica from scratch.
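
For example, manually feeding the missing segments back to a lagging replica comes down to something like this (the host name and the direction of the copy are purely illustrative, the paths come from the article):

# on the master: push the archived segments into the replica's pg_xlog,
# then let recovery pick them up on its own
rsync -a /var/lib/postgresql/9.0/main/archive/ \
    replica01:/var/lib/postgresql/9.0/main/pg_xlog/

Alternatively, the replica's recovery.conf can be given a restore_command pointing at the same archive, which avoids the manual copying altogether.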
To avoid running into such joys, I wrote a small script in a mixture of Perl and Bash. I will say up front that I by no means claim to be the best programmer: the main requirement was not beautiful code but getting the necessary job done. So, the script is here, and for now I will explain exactly what it does.

How it works


master01:~# ps aux|grep postg|grep sender
postgres 26132 0.0 0.0 6518660 3052 ? Ss 20:21 0:01 postgres: wal sender process postgres 192.168.12.1(36254) streaming 153/ED957E68
postgres 26133 0.0 0.0 6518660 3056 ? Ss 20:21 0:01 postgres: wal sender process postgres 192.168.12.2(51907) streaming 153/ED957E68
postgres 26135 0.0 0.0 6518660 3060 ? Ss 20:21 0:01 postgres: wal sender process postgres 192.168.12.3(39404) streaming 153/ED957E68
postgres 29142 0.0 0.0 6518724 3084 ? Ss 20:44 0:01 postgres: wal sender process postgres 192.168.12.4(51897) streaming 153/ED957E68
postgres 29320 0.0 0.0 6518724 3084 ? Ss 20:45 0:01 postgres: wal sender process postgres 192.168.12.5(49234) streaming 153/ED957E68
postgres 29453 0.0 0.0 6518724 3084 ? Ss 20:46 0:01 postgres: wal sender process postgres 192.168.12.6(35519) streaming 153/ED957E68


As this simple command shows, the process list gives us:
1. The number of connected replicas, along with their IP addresses.
2. The XLOG position each of them is currently working on.

And we, as the people who run this construction, know how many replicas we have and what their addresses are. Pay attention to the last field of each line: it is the XLOG position of a particular replica. If we remove the slash from that value and cut off the last 6 characters of what remains, we get 153ED. This is a hexadecimal number.
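
That extraction can be sketched in shell roughly like this (only a sketch; the field positions match the ps output above):

# take the XLOG position (the last field) of every wal sender,
# drop the slash and cut off the last 6 hex digits
ps aux | grep 'wal sender' | grep -v grep | awk '{print $NF}' \
    | sed -e 's#/##' -e 's/.\{6\}$//'
# for 153/ED957E68 this prints 153ED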

The process list also contains this:

master01:~# ps aux|grep postg|grep arch
postgres 556 0.0 0.0 66184 1668 ? Ss Oct11 0:23 postgres: archiver process last was 0000000100000153000000EB


The last field of this line is the file that was most recently archived. We take its last 13 characters and remove the run of six consecutive zeros from them. We get 153EB.

Converting both numbers to decimal (for clarity) gives 87021 and 87019. Comparing them, we see that the replicas are already using a log segment two steps newer than the last archived one, so one more segment can safely be archived.
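
In shell the whole comparison can be sketched roughly like this (an illustration of the arithmetic, not the actual script):

last=0000000100000153000000EB     # last archived segment, from the archiver line
archived=${last:11}               # last 13 characters: 00153000000EB
archived=${archived/000000/}      # drop the run of six zeros: 00153EB
slowest=153ED                     # lowest position among the replicas (see above)
echo $((16#$archived)) $((16#$slowest))   # prints 87019 87021
if [ $((16#$slowest)) -gt $((16#$archived)) ]; then
    echo "the slowest replica is already past the last archived segment"
fi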

My script is built on these simple manipulations:

1. The administrator adds a list of replicas to /etc/postgresql/9.0/main/slaves.list (one per line).
2. The script is plugged in as archive_command in the PostgreSQL config.
3. When an XLOG segment is about to be archived, the following happens:
3.1. The script checks whether the number of connected replicas matches the number listed in the config.
3.2. If everything is fine, it checks whether the hosts match the ones listed in the config file (this is done in case we expand the cluster and forget to add the new replica to the config).
3.3. It calculates the minimum XLOG position among the replicas (i.e., finds the one lagging the most).
3.4. It compares the number derived from the name of the file about to be archived with the number from the previous step.

If all the checks pass, the file is copied to the archive and the script returns 0. If something goes wrong, it returns 1: PostgreSQL considers the attempt unsuccessful and retries it after a while. You can follow the process in /var/log/postgresql/9.0/watchdog.log. The overall logic is sketched below.
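
Putting it all together, the logic of such an archive_command wrapper can be sketched roughly as follows. This is my own bash illustration of the idea, not the author's actual Perl/Bash script; the paths for slaves.list, the archive directory and the log file are taken from the article, everything else (including leaving out the hostname check from step 3.2) is an assumption made for brevity:

#!/bin/bash
# watchdog.sh %p %f  -- plugged in as archive_command
XLOG_PATH="$1"                      # %p: path of the segment to archive
XLOG_FILE="$2"                      # %f: name of the segment
ARCHIVE=/var/lib/postgresql/9.0/main/archive
SLAVES=/etc/postgresql/9.0/main/slaves.list
LOG=/var/log/postgresql/9.0/watchdog.log

# 1. every replica listed in slaves.list must have a wal sender connected
expected=$(grep -c . "$SLAVES")
connected=$(ps aux | grep 'wal sender' | grep -vc grep)
if [ "$connected" -lt "$expected" ]; then
    echo "$(date) only $connected of $expected wal senders connected, retrying later" >> "$LOG"
    exit 1
fi

# 2. find the lowest (most lagging) XLOG position among the replicas
min_pos=""
for pos in $(ps aux | grep 'wal sender' | grep -v grep | awk '{print $NF}'); do
    p=${pos/\//}                    # 153/ED957E68 -> 153ED957E68
    p=${p%??????}                   # drop the last 6 hex digits -> 153ED
    if [ -z "$min_pos" ] || [ $((16#$p)) -lt $((16#$min_pos)) ]; then
        min_pos=$p
    fi
done

# 3. the segment may be archived only if the slowest replica is already past it
seg_pos=${XLOG_FILE:11}             # last 13 characters of the segment name
seg_pos=${seg_pos/000000/}          # drop the run of six zeros
if [ $((16#$min_pos)) -le $((16#$seg_pos)) ]; then
    echo "$(date) $XLOG_FILE is still needed by a replica at $min_pos" >> "$LOG"
    exit 1
fi

cp "$XLOG_PATH" "$ARCHIVE/$XLOG_FILE" && exit 0
exit 1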

Achtung! It is assumed that the replica list contains existing DNS names (done for my own convenience) or names defined locally in /etc/hosts. To disable this, you need to comment out lines 47 through 55 of the script.
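
A slaves.list for such a cluster might then look something like this (one host per line; the names here are, of course, made up):

replica01.example.com
replica02.example.com
replica03.example.com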

Source: https://habr.com/ru/post/130704/

