
Replication in PostgreSQL 9.0

Good day. Since some time has passed since the release of PostgreSQL 9, I decided to try out one of its new features: native replication. As you know, the new mechanism is based on shipping the XLOG from the master to the slave. One big advantage is that ALTERs are handled normally. In other words, with version 9 an administrator can get by without Slony.

We assume that the packages are already installed (if not, for Debian/Ubuntu you can get them here) and that the testdb database has been created. I describe the process assuming the PostgreSQL data directories are in /var/lib/postgresql/9.0/; if that is not the case for you, use the correct path.

So, let's go:
1. Configure listen_addresses and pg_hba.conf
Suppose the master has IP 192.168.0.1 and the slave 192.168.0.2. Then the listen line in postgresql.conf on the master will look like this:
listen_addresses = '192.168.0.1'

And pg_hba.conf on the master will contain the following entry:
host replication postgres 192.168.0.2/32 trust
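
A change to pg_hba.conf alone can be applied with a reload rather than a full restart (the postgresql.conf changes in the next step will require a restart anyway). A minimal sketch, assuming the standard Debian/Ubuntu init script name:
/etc/init.d/postgresql reload
# or, from psql on the master:
psql -c "SELECT pg_reload_conf()"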

2. Enable everything needed for replication on the master
# WAL detail level. For replication we need hot_standby (it includes everything the archive level provides).
wal_level = hot_standby

# Maximum number of concurrent connections from standbys (WAL sender processes).
max_wal_senders = 2

# How many WAL segments to keep on the master? The more the better, within the limits of disk space.
wal_keep_segments = 32

# WAL archiving (in case the slave falls further behind than wal_keep_segments covers). Enable it and point the command at your own archive directory.
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/9.0/main/archive/%f'
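
The archive directory referenced by archive_command has to exist and be writable by the postgres user before the restart, otherwise archiving will fail. A minimal sketch, using the paths from this article:
mkdir -p /var/lib/postgresql/9.0/main/archive
chown postgres:postgres /var/lib/postgresql/9.0/main/archive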

Now the master needs to be restarted.
3. Ship the database from the master to the slave
We need something that can copy data over the network. I used rsync, though of course you can use any other tool. Stop PostgreSQL on the slave, then do the following on the master:
$ psql -c "SELECT pg_start_backup('label', true)"
$ rsync -a /var/lib/postgresql/9.0/main/ slave:/var/lib/postgresql/9.0/main/ --exclude postmaster.pid
$ psql -c "SELECT pg_stop_backup()"


4. Turn on hot_standby on the slave
Add to postgresql.conf:
hot_standby = on

5. Create the replication config on the slave
To do this, create the file recovery.conf in /var/lib/postgresql/9.0/main and put the following in it:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.1 port=5432 user=postgres'
trigger_file = '/var/lib/postgresql/9.0/main/trigger'
restore_command = 'cp /var/lib/postgresql/9.0/main/archive/%f "%p"'
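
The file should belong to the postgres user: the server has to be able to read it (and it will rename it to recovery.done after a failover). A short sketch, assuming the path above:
chown postgres:postgres /var/lib/postgresql/9.0/main/recovery.conf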

A separate note about trigger_file: by default this file should not exist. It is there so that, if the master fails, you can stop the replication process (by creating this file) and make the slave writable.

6. Done!
Start the slave. If on the slave the command
ps aux | grep receiver
shows something like
postgres 1953 0.0 0.0 101980 4156 ? Ss 19:19 0:00 postgres: wal receiver process streaming 2/B40001D0
(that is, there is a postgres process described as a WAL receiver), we can assume that everything works.
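
On the master, the other end of the connection can be checked the same way; in 9.0 the sending side should show up as a process whose title mentions a WAL sender (the exact wording of the title may differ):
ps aux | grep sender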

7. Monitoring
Now a little about the sad part: unlike Slony, native replication cannot show the lag in a human-readable form.
One possible way to monitor replication is to compute the difference between the log positions on the master and the slave:
psql -c "SELECT pg_current_xlog_location()" -h192.168.0.1
--------------------------
0/2000000
(1 row)

psql -c "select pg_last_xlog_replay_location()" -h192.168.0.2
pg_last_xlog_replay_location
------------------------------
0/2000000
(1 row)

From the results you need to cut off the slash and everything before it, then convert the rest from hex to a decimal number. The result is a fairly abstract value whose critical threshold will be different in each case.
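
As a rough illustration of that arithmetic, a small bash sketch (the IP addresses are the ones used throughout this article; like the description above, it deliberately ignores the part before the slash):
MASTER_POS=$(psql -t -A -c "SELECT pg_current_xlog_location()" -h192.168.0.1)
SLAVE_POS=$(psql -t -A -c "SELECT pg_last_xlog_replay_location()" -h192.168.0.2)
# strip everything up to and including the slash, then convert from hex to decimal
MASTER_OFF=$((16#${MASTER_POS#*/}))
SLAVE_OFF=$((16#${SLAVE_POS#*/}))
echo "lag: $((MASTER_OFF - SLAVE_OFF)) bytes"
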
A more correct approach, in my view, is to create a table with a single column of type timestamp, keep a single row in it, and update its value every n seconds (for example, 30). Then, by subtracting the contents of that table on the slave from the value on the master, we get the lag as a time interval.
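
A minimal sketch of that approach using psql; the table and column names here are purely illustrative:
# on the master, once:
psql testdb -c "CREATE TABLE replication_lag (ts timestamp NOT NULL)"
psql testdb -c "INSERT INTO replication_lag VALUES (now())"
# on the master, from cron or a simple loop, every 30 seconds or so:
psql testdb -c "UPDATE replication_lag SET ts = now()"
# on the slave: how far behind the replicated timestamp is
psql testdb -h192.168.0.2 -c "SELECT now() - ts AS lag FROM replication_lag"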

8. If something terrible happens
As a rule, replication is needed for two reasons: load distribution and insurance in case something happens to the master. So, if your master has failed, your actions will be roughly as follows:
a) Create the trigger file on the slave (described in step 5); a sketch is shown after this list. The slave will become writable and stop replicating, and you can switch clients over to it.
b) Later, when the machine that served as the master comes back, we reverse the process: make a replica out of the original master by performing step 5 on it. Once it has caught up, delete the trigger file on the slave, and everything should return to normal.
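
The switchover in (a) amounts to creating the trigger file from step 5 on the slave, for example:
touch /var/lib/postgresql/9.0/main/trigger
# once the slave finishes replaying the available WAL, it leaves recovery and starts accepting writes;
# the server renames recovery.conf to recovery.done at that point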

PS This text is a free translation of this wiki page.

Source: https://habr.com/ru/post/106872/

