
How not to lose data in PostgreSQL

PostgreSQL offers several options for backing up data. All of them have been described more than once, including here on Habr, but mostly from the technical side of each method. I want to talk about the overall backup strategy: how to combine all these methods into an effective system that will help you preserve your data and spare your nerves in critical situations.
Input data: a PostgreSQL 9.2 server with a database larger than 100 GB.

Backup options

As follows from the manual, there are three methods for backing up data: SQL dumps, file-level copies of the database, and streaming replication with WAL archiving.

They all have their own characteristics, so we use all three.

Streaming replication

Configuring streaming replication is well described in articles here and here. The point of this replication is that if the main server fails, the slave can quickly be promoted to master and take over, because it holds a complete copy of the database.
WAL files are important for streaming replication: they are the files from which the slave pulls missing data when it is no longer available on the master. Hence the need to keep these WAL files around longer; we store them for 8 days.
The directory with WAL files must be accessible to both the master (for writing) and the slave (for reading). To ensure this, we created a GlusterFS-based shared storage mounted on both servers. This, firstly, improves reliability (if the master fails, the WAL files remain available to the slave) and, secondly, makes it possible to quickly create additional slaves, which also need these WAL files.
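For reference, here is a minimal sketch of the master-side part of such a setup; the mount point is a placeholder, not our actual path. The master archives finished WAL segments to the shared mount, and the slaves read them from there as a fallback to streaming:

 # postgresql.conf on the master (the archive path is a placeholder)
 wal_level = hot_standby
 max_wal_senders = 3
 archive_mode = on
 archive_command = 'test ! -f /mnt/gluster/wal/%f && cp %p /mnt/gluster/wal/%f'

The slave's recovery.conf then points its restore_command at the same directory; an example appears in the restore section below.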
The bottom line: streaming replication protects against the failure of the main server with almost no data loss.

Copying DB Files

We have dealt with server crashes; now let's deal with the human factor, when for some reason data in the tables, the tables themselves, or even the whole database gets deleted. In that case you have to restore the data from a backup. Another copy of the database may also be needed for testing the application. Such a copy can be made in two ways: dump the database or copy the entire data directory. For a long time we used the first option and dumped the database to a file. But a dump has a big downside: the process locks the tables, and other processes can no longer work with them (UPD: pg_dump does not lock tables, but it does put a heavy load on the database). For large databases this is critical.
Now we back up the database with the pg_basebackup utility, which essentially copies all the database files into one large archive.
We keep 4 weekly and 6 monthly copies, stored on the same GlusterFS storage. We create the copies so that they are self-sufficient, that is, they work immediately after being unpacked, without needing any additional WAL files. So we can easily bring up the database as it was, say, three months ago.
It is noteworthy that pg_basebackup can (under certain conditions) be run on the slave server, so creating backups does not load the master at all.
For pg_basebackup to work on the slave, WAL retention must be enabled by setting the following options in postgresql.conf:
 wal_level = hot_standby
 wal_keep_segments = 1000

Here 1000 is the number of WAL files that PostgreSQL keeps on disk; you may need to decrease or increase this value. The point is that pg_basebackup simply archives the contents of the database, but some data changes while the archive is being made, and PostgreSQL will later pull those changes from the WAL files during recovery. For this, pg_basebackup also puts all the existing WAL files into the archive. So, for everything to succeed, all WAL files from the moment pg_basebackup starts until it finishes must still be available; if the required WAL files have been deleted, pg_basebackup fails with an error. The right value of wal_keep_segments can be determined empirically.
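One rough way to pick the value (just a sketch; it assumes psql access as the postgres user and uses pg_xlog_location_diff, available since 9.2) is to measure how much WAL is written over a window comparable to the backup duration and divide by the 16 MB segment size:

 #!/bin/bash
 # Estimate how many 16 MB WAL segments are generated during a typical backup window
 START=$(psql -U postgres -Atc "select pg_current_xlog_location()")
 sleep 3600    # roughly how long a backup takes
 END=$(psql -U postgres -Atc "select pg_current_xlog_location()")
 psql -U postgres -Atc "select ceil(pg_xlog_location_diff('$END', '$START') / (16 * 1024 * 1024))"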
To create a backup copy of the database, we run pg_basebackup with the following parameters:
 /usr/bin/pg_basebackup -U postgres -D /tmp/pg_backup -Ft -z -Xf 

-Ft saves everything into a single tar file, -z compresses it, and -Xf includes the WAL files in the archive. The backup will work without the WAL files, but then Postgres will require the missing WAL files to be provided during recovery.
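A quick sanity check (just a sketch, using the same target path as above) is to list the resulting archive and make sure the pg_xlog directory made it in:

 tar -tzf /tmp/pg_backup/base.tar.gz | grep '^pg_xlog/' | head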
We run the backup from cron on Saturdays with this script:
 #!/bin/bash
 mkdir /tmp/pg_backup
 /usr/bin/pg_basebackup -U postgres -D /tmp/pg_backup -Ft -z -Xf
 WEEK=$(date +"%V")
 MONTH=$(date +"%b")
 let "INDEX = WEEK % 5"
 # weekly copy, rotated over 5 slots
 test -e /collector/db-backup/base.${INDEX}.tar.gz && rm /collector/db-backup/base.${INDEX}.tar.gz
 cp /tmp/pg_backup/base.tar.gz /collector/db-backup/base.${INDEX}.tar.gz
 # monthly copy as a hard link to the weekly one
 test -e /collector/db-backup/base.${MONTH}.tar.gz && rm /collector/db-backup/base.${MONTH}.tar.gz
 ln /collector/db-backup/base.${INDEX}.tar.gz /collector/db-backup/base.${MONTH}.tar.gz
 # "last" copy used for the nightly test restore
 test -e /collector/db-backup/base.last.tar.gz && rm /collector/db-backup/base.last.tar.gz
 ln /collector/db-backup/base.${INDEX}.tar.gz /collector/db-backup/base.last.tar.gz
 rm -r /tmp/pg_backup

This produces the file under three names: one with the week number, one with the month name, and one with the suffix last.
To restore the database from a copy, stop PostgreSQL, delete (or move aside) the old contents of the data directory, unpack the archive, and start the server. There is one interesting detail: since pg_basebackup was run on the slave, the recovery.conf file is unpacked along with the data. If we want to restore the data to the latest possible state, this file must be kept; then, after the server starts, Postgres will begin pulling WAL files from the location specified in recovery.conf. If we take the latest weekly backup, we have all the necessary WAL files (since we keep them for 8 days) and can bring the database back to its last good state.
If we need the data as of the time of the backup, we delete the recovery.conf file before starting the database.
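To illustrate, this is roughly what the recovery.conf taken from the slave contains; the host and archive path are placeholders matching the sketch above, not our exact settings:

 # recovery.conf unpacked together with the slave's backup (host and path are placeholders)
 standby_mode = 'on'
 primary_conninfo = 'host=master.example.com user=replicator'
 restore_command = 'cp /mnt/gluster/wal/%f %p'
 # keep this file to roll forward to the latest state from the WAL archive,
 # delete it to get the database exactly as of the moment of the backup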

We also use the backup copies for testing, which at the same time verifies that the backups are created correctly.
The base.last.tar.gz name is used to restore the latest copy to the test database. The test database is rebuilt every night with this script:
 #!/bin/bash
 /etc/init.d/postgresql stop
 rm -r /data/*
 tar -zxf /collector/db-backup/base.last.tar.gz -C /data/
 rm /data/recovery.conf
 /etc/init.d/postgresql start


The bottom line: copying the database at the file level protects against software failures and human error. When restoring from such a backup, the most recent data is lost.

SQL Dumps

We haven't done a full SQL dump of the large database for a long time, but we do dump individual tables, one table per file. This lets us quickly restore data when only one table is damaged, without unpacking the whole database from a backup.
PostgreSQL provides statistics on the number of changes in each table, and every night we check which tables have changed and dump them. The statistics look like this:
select schemaname, relname, n_tup_ins + n_tup_upd + n_tup_del from pg_stat_user_tables;
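A minimal sketch of such a nightly job; the database name, output directory, and the simple "any changes since the last stats reset" condition are assumptions, our real script compares the counters with the previous night's values:

 #!/bin/bash
 # Dump every user table that has recorded changes, one file per table
 OUT=/collector/table-dumps
 mkdir -p "$OUT"
 psql -U postgres -Atc "select schemaname || '.' || relname from pg_stat_user_tables where n_tup_ins + n_tup_upd + n_tup_del > 0" mydb |
 while read TABLE; do
     pg_dump -U postgres -t "$TABLE" -f "$OUT/$TABLE.sql" mydb
 done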
The bottom line: SQL dumps help recover from minor errors; the data is as of the moment the dump was taken.

In conclusion

As you can see, to protect yourself against data loss as much as possible, you can and should use all the features PostgreSQL offers, especially since it is not that difficult.

Source: https://habr.com/ru/post/197742/

