
Failover Master-Slave Cluster on PostgreSQL

Greetings, Habr readers!
In this article I want to share my experience of deploying a Master-Slave cluster on the PostgreSQL DBMS. Fault tolerance is achieved with the pgpool-II features (failover, online recovery).
pgpool is an excellent tool for scaling and load balancing between servers, but I think few people know how to automatically fail over to the slave server when the master dies, and how to add new capacity to an already running cluster without shutting the whole cluster down.


Cluster layout and machine requirements

The figure shows a typical master-slave cluster scheme.
A cluster must contain one master server (Master), at least one slave (Slave), and one scaling node (Balancer).
Each of the servers must run a Linux distribution (I used Red Hat 6.1), and the gcc compiler must be installed on the scaling node.
I used PostgreSQL 9.0.1 and pgpool-II 3.0.5. You can use other versions of the DBMS and pgpool; in that case, consult the documentation.

Setting up a remote connection between cluster servers

Online recovery and failover require passwordless SSH connections between the cluster servers. To set this up, generate SSH keys for the postgres user and distribute them to the postgres user on each of the servers.
An important point: for online recovery, it must be possible to open another remote session from within a remote session (i.e., the following passwordless SSH chains must work: scaling node — master server — slave server, and scaling node — slave server — master server).
For failover, you also need to generate an SSH key for root on the scaling node and distribute it to the postgres users on the master and slave servers.
This step is essential, so before moving on make sure you can open a remote session from each server to every other one.
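A rough sketch of the key exchange (assuming default key paths; if ssh-copy-id is not available, append the public key to ~/.ssh/authorized_keys manually). The postgres part must be repeated as postgres on the master and on the slave so that the chained sessions described above also work without a password:

 $ ssh-keygen -t rsa                        # as postgres on the scaling node, empty passphrase
 $ ssh-copy-id postgres@192.168.100.2
 $ ssh-copy-id postgres@192.168.100.3
 # ssh-keygen -t rsa                        # as root on the scaling node (needed for failover)
 # ssh-copy-id postgres@192.168.100.2
 # ssh-copy-id postgres@192.168.100.3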
Configure Streaming Replication

First you need to open port 5432 (the standard PostgreSQL port) for incoming and outgoing traffic in iptables.
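On a RHEL-like system this might look as follows (a sketch; adapt the rule to your own firewall policy):

 # iptables -I INPUT -p tcp --dport 5432 -j ACCEPT
 # service iptables save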
Edit the master server's $PGDATA/postgresql.conf configuration file as follows:
 listen_addresses = '*'
 wal_level = hot_standby
 max_wal_senders = 2
 wal_keep_segments = 32
 #hot_standby = on

Note the importance of the last line: it is used by the slave-node recovery script, so it must stay commented out exactly as shown above.
Next, add lines for replication in $PGDATA/pg_hba.conf :
 host    replication    postgres    192.168.100.2/32    trust
 host    replication    postgres    192.168.100.3/32    trust

postgres is the database superuser that performs replication and other administrative tasks. With these lines we allow replication connections from both the slave and the master server.
Then restart the master server:
 # service postgresql restart 

Stop the slave server (if it was started earlier):
 # service postgresql stop 

Now you can begin to replicate.
On the master server, as the postgres user, take a base backup of the database and copy it to the slave server:
 $ psql -c "SELECT pg_start_backup('stream');" $ rsync -a /var/lib/pgsql/data/ 192.168.100.3:/var/lib/pgsql/data/ --exclude postmaster.pid $ psql -c "SELECT pg_stop_backup();" 

After that, on the slave we create the replication config $PGDATA/recovery.conf :
 standby_mode = 'on'
 primary_conninfo = 'host=192.168.100.2 port=5432 user=postgres'
 trigger_file = 'failover'

The trigger_file parameter specifies the path at which PostgreSQL looks for a file that tells it to switch into master mode. In this case PostgreSQL looks for the file at $PGDATA/failover .
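In other words, the standby can later be promoted by hand simply by creating that file (a sketch, assuming the default data directory):

 $ touch /var/lib/pgsql/data/failover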
Next, you need to enable the "hot spare" on the slave server:
 $ sed -i 's/#hot_standby = on/hot_standby = on/' /var/lib/pgsql/data/postgresql.conf 

Then you need to start the slave server:
 # service postgresql start 


Replication activity can be verified as follows:
Run the command on the master server
 $ ps aux | grep sender 

It should display approximately the following:
 2561 ? Ss 0:00 postgres: wal sender process postgres 192.168.100.3(33341) streaming 0/2031D28 

Similarly on the slave server:
 $ ps aux | grep receiver 

It should output something like:
 1524 ? Ss 0:00 postgres: wal receiver process streaming 0/2031D28 


General configuration of the scaling node

Change the configuration file /etc/pgpool-II/pgpool.conf :
 # Listen on all addresses
 listen_addresses = '*'
 # Slave server backend
 backend_hostname0 = '192.168.100.3'
 backend_port0 = 5432
 backend_weight0 = 1
 backend_data_directory0 = '/var/lib/pgsql/data'
 # Master server backend
 backend_hostname1 = '192.168.100.2'
 backend_port1 = 5432
 backend_weight1 = 1
 backend_data_directory1 = '/var/lib/pgsql/data'
 # Use pool_hba.conf for client authentication
 enable_pool_hba = true

Next, add the client authentication entries to /etc/pgpool-II/pool_hba.conf :
 host    all    all    127.0.0.1/32        trust
 host    all    all    192.168.100.2/32    trust
 host    all    all    192.168.100.3/32    trust

Restart pgpool:
 # service pgpool restart 
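You can check that the balancer accepts connections by connecting to pgpool itself (a sketch; 9999 is pgpool-II's default listening port, adjust it if you changed port in pgpool.conf):

 $ psql -h 192.168.100.4 -p 9999 -U postgres -c "SELECT 1;"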


Setting up automatic failover

The automatic failover mechanism works as follows:
  1. On the worker servers (master and slave), the pgpool-walrecrunning() procedure runs; it determines which server is the master and which is the slave.
  2. pgpool connects to the worker servers remotely and checks that the DBMS processes are alive. If the master server has failed, pgpool invokes a script that promotes the slave node to master.
  3. After that, pgpool detaches the failed node and resets all client connections to it.

Now the configuration.
On the scaling node, edit the pgpool config /etc/pgpool-II/pgpool.conf :
 # Script that is run when a node fails
 failover_command = '/etc/pgpool-II/failover.sh %d %H /var/lib/pgsql/data/failover'
 # User that performs the health checks
 health_check_user = 'postgres'
 # Enable master-slave mode
 master_slave_mode = true
 # Replication between the nodes is done by streaming replication
 master_slave_sub_mode = 'stream'
 # pgpool's own replication is not used, since streaming replication is already configured
 replication_mode = false
 # Distribute read queries across the nodes
 load_balance_mode = true

A few more words about the failover_command parameter. The values on this line are passed to the script as arguments: %d is the identifier of the failed node (according to backend_hostname* in pgpool.conf ), %H is the IP address of the new master server.
And the failover.sh script itself:
 #!/bin/bash
 # ID of the failed node
 FAILED_NODE=$1
 # IP address of the new master server
 NEW_MASTER=$2
 # Path to the trigger file
 TRIGGER_FILE=$3

 # If the failed node is node 1, do nothing
 if [ $FAILED_NODE = 1 ]; then
     echo "Node 1 failed, no failover is performed"
     exit 1
 fi
 # Otherwise promote the remaining node by creating the trigger file on it
 echo "Performing failover"
 echo "New master server: $NEW_MASTER"
 ssh -T postgres@$NEW_MASTER touch $TRIGGER_FILE
 exit 0

Create this script in the /etc/pgpool-II/ directory and give it permissions 755.
Now we need to compile the pgpool stored procedures. The pgpool source package contains the source code of the procedure we need in sql/pgpool-walrecrunning. To compile it you need the PostgreSQL header files; after that, running make produces pgpool-walrecrunning.so and pgpool-walrecrunning.sql, the SQL script that loads the procedure.
The compiled library must be copied to /usr/lib64/pgsql/ (the directory referred to as $libdir ) on each worker server, and the SQL file to /usr/share/pgsql/ .
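As a rough sketch, assuming the pgpool-II sources are unpacked in /usr/src/pgpool-II-3.0.5 and the postgresql-devel package is installed (you may also need to run ./configure in the top-level source directory first):

 $ cd /usr/src/pgpool-II-3.0.5/sql/pgpool-walrecrunning
 $ make
 $ scp pgpool-walrecrunning.so root@192.168.100.2:/usr/lib64/pgsql/
 $ scp pgpool-walrecrunning.so root@192.168.100.3:/usr/lib64/pgsql/
 $ scp pgpool-walrecrunning.sql root@192.168.100.2:/usr/share/pgsql/
 $ scp pgpool-walrecrunning.sql root@192.168.100.3:/usr/share/pgsql/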
Load the procedure into the database on the master server:
 psql -f /usr/share/pgsql/pgpool-walrecrunning.sql -d postgres 

There is no need to upload this procedure to the database on the slave server: it will be available due to the replication configured earlier.
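You can check that the procedure is in place on both servers; it should return false on the master and true on the slave (a quick check, assuming it was loaded into the postgres database as above):

 $ psql -d postgres -c "SELECT pgpool_walrecrunning();"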
That's all.

The status of the servers can be checked with the query
 SHOW pool_nodes; 
executed from the psql client connected to pgpool on the scaling node.
Sample query output:
 hostname      | port | status | lb_weight
 --------------+------+--------+-----------
 192.168.100.3 | 5432 | 2      | 0.500000
 192.168.100.2 | 5432 | 2      | 0.500000
 (2 rows)

Server status 2 means that the server is active and available for requests. If one of the servers fails, the status will change to 3.

You can test the automatic failover mechanism as follows (a possible command sequence is sketched after the list):
  1. Shut down the master server.
  2. Run the query SHOW pool_nodes; on the scaling node.
  3. Check the pgpool logs to see that the script was executed.
  4. Make sure the slave server accepts write requests after the script has run.
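A minimal sketch of such a test, assuming pgpool listens on its default port 9999 on the scaling node and logs to /var/log/pgpool.log (your log location may differ):

 # service postgresql stop                                                       # on the master, 192.168.100.2
 $ psql -h 192.168.100.4 -p 9999 -U postgres -c "SHOW pool_nodes;"               # status of node 1 changes to 3
 # tail /var/log/pgpool.log                                                      # look for the failover.sh output
 $ psql -h 192.168.100.3 -U postgres -c "CREATE TABLE failover_test(id int);"    # write on the promoted node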


Online recovery

This mechanism is probably the hardest to debug, but it is also a powerful administration tool. It addresses the following situation: the cluster is running, and we want to bring back a slave server that failed earlier, but the data stored on it no longer matches the data in the cluster. Online recovery lets us add another slave server in real time, without stopping the cluster and without any extra manual steps on the node being added.
Online recovery works as follows:
  1. The scaling node starts the slave server recovery procedure.
  2. This procedure runs a script on the master server that performs replication from the master to the slave server.
  3. After replication completes, the DBMS on the slave server is started remotely with the standard PostgreSQL pg_ctl utility.
  4. pgpool restarts, detects the slave server, and includes it in the cluster.

Now the configuration.
Add the following lines to /etc/pgpool-II/pgpool.conf :
 # User that performs the recovery
 recovery_user = 'postgres'
 # This user's password
 recovery_password = '123456'
 # Script that performs the first recovery stage; it must be located in $PGDATA
 recovery_1st_stage_command = 'basebackup.sh'

Add the postgres password hash to pcp.conf:
 # pg_md5 123456 >> /etc/pgpool-II/pcp.conf 

Here 123456 is the postgres password in plain text. Besides the password hash, you also need to specify the user name the hash belongs to: the file must contain a line of the form postgres:encrypted_password .
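Instead of appending the bare hash and editing the file by hand, the whole line can be written in one go (a sketch; note that this overwrites pcp.conf):

 # echo "postgres:$(pg_md5 123456)" > /etc/pgpool-II/pcp.conf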
On the master node, create a basebackup.sh script with the following content:
 #!/bin/bash
 # Path to $PGDATA on the master server
 PRIMARY_DATA=$1
 # IP address of the slave server being recovered
 SLAVE_IP=$2
 # Path to $PGDATA on the slave server
 SLAVE_DATA=$3
 # Master IP address, needed to build recovery.conf
 PRIMARY_IP=$(ifconfig eth0| sed -n '2 {s/^.*inet addr:\([0-9.]*\) .*/\1/;p}')
 # Temporary directory for preparing the configuration files
 TMP_DIR=/var/lib/pgsql/tmp

 # Remove old recovery and trigger files from the master
 # (the master may itself be a former slave)
 cd $PRIMARY_DATA
 rm -f recovery.* failover
 # Check whether hot_standby is commented out in postgresql.conf
 cat postgresql.conf | grep '#hot_standby = on'
 # If it is not, comment it out and restart the master
 if [ $? = 1 ]
 then
     sed -i 's/hot_standby = on/#hot_standby = on/' postgresql.conf
     # Restart the master server
     /usr/bin/pg_ctl restart -D $PRIMARY_DATA
 fi
 # Stop the slave server
 ssh -T postgres@$SLAVE_IP "/usr/bin/pg_ctl stop -D $SLAVE_DATA"
 # Start the backup on the master
 psql -c "SELECT pg_start_backup('Streaming Replication', true)" postgres
 # Copy the data directory to the slave
 rsync -a $PRIMARY_DATA/ $SLAVE_IP:$SLAVE_DATA/ --exclude postmaster.pid --exclude postmaster.opts
 # Prepare the slave configuration files in the temporary directory
 mkdir $TMP_DIR
 cd $TMP_DIR
 # Copy postgresql.conf and enable hot_standby in it
 cp $PRIMARY_DATA/postgresql.conf $TMP_DIR/
 sed -i 's/#hot_standby = on/hot_standby = on/' postgresql.conf
 # Build recovery.conf
 echo "standby_mode = 'on'" > recovery.conf
 echo "primary_conninfo = 'host=$PRIMARY_IP port=5432 user=postgres'" >> recovery.conf
 echo "trigger_file = 'failover'" >> recovery.conf
 # Remove old recovery files from the slave
 ssh -T postgres@$SLAVE_IP rm -f $SLAVE_DATA/recovery.*
 # Send the new configuration files to the slave
 scp postgresql.conf postgres@$SLAVE_IP:$SLAVE_DATA/postgresql.conf
 scp recovery.conf postgres@$SLAVE_IP:$SLAVE_DATA/recovery.conf
 # Finish the backup
 psql -c "SELECT pg_stop_backup()" postgres
 # Remove the temporary directory
 cd ..
 rm -fr $TMP_DIR

I emphasize that this script must be located in the $PGDATA directory. Give the script permissions 755.
On the slave and master servers, create a script named pgpool_remote_start in the $PGDATA directory ( exactly this name! ) with the following content:
 #!/bin/bash
 if [ $# -ne 2 ]
 then
     echo "Wrong number of arguments"
     exit 1
 fi
 # IP address of the node to start and its $PGDATA directory
 SLAVE_IP=$1
 SLAVE_DIR=$2
 PGCTL=/usr/bin/pg_ctl
 # Start the remote PostgreSQL instance in the background
 ssh -T $SLAVE_IP $PGCTL -w -D $SLAVE_DIR start 2>/dev/null 1>/dev/null < /dev/null &

It allows the DBMS processes to be started remotely.
Next, on the scaling node, compile the pgpool-recovery.so stored procedure, whose sources are located in sql/pgpool-recovery of the pgpool source package. As before, copy the files to the worker servers and load the procedure into the database:
 $ psql -f /usr/share/pgsql/pgpool-recovery.sql -d template1 
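The compile-and-copy step is analogous to the one for pgpool-walrecrunning (a sketch, assuming the same source location):

 $ cd /usr/src/pgpool-II-3.0.5/sql/pgpool-recovery
 $ make
 $ scp pgpool-recovery.so root@192.168.100.2:/usr/lib64/pgsql/
 $ scp pgpool-recovery.so root@192.168.100.3:/usr/lib64/pgsql/
 $ scp pgpool-recovery.sql root@192.168.100.2:/usr/share/pgsql/
 $ scp pgpool-recovery.sql root@192.168.100.3:/usr/share/pgsql/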

This completes the online recovery configuration.

To bring a new slave server into the cluster, perform the following steps:
  1. Make sure the database is running on the (new) master node.
  2. On the scaling node, execute the server recovery command: pcp_recovery_node 20 192.168.100.4 9898 postgres 123456 1

A little more about pcp_recovery_node . This command attaches a recovered server back into the cluster. 20 is the timeout in seconds for the operation, 192.168.100.4 is the IP address of the scaling node, 9898 is the port on which the scaling node accepts pcp commands, postgres is the recovery user name, 123456 is its password, and 1 is the ID of the node being recovered.

You can test these two mechanisms together according to the following plan (a possible command sequence is sketched after the list):
  1. Create a test database on the master server and make sure it is replicated to the slave.
  2. Simulate a master server failure by shutting it down.
  3. Make sure the failover has worked and the slave server has become the new master.
  4. Make some changes to the database on that server.
  5. Start the failed former master and turn it into a slave by performing online recovery.
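A rough sketch of such a test, using the node IDs and IP addresses from above (the prompt indicates which machine each command runs on; the database name failtest is just an example):

 $ psql -h 192.168.100.2 -U postgres -c "CREATE DATABASE failtest;"              # 1. on the current master
 $ psql -h 192.168.100.3 -U postgres -l | grep failtest                          #    verify it reached the slave
 # service postgresql stop                                                       # 2. on the master: simulate the failure
 $ psql -h 192.168.100.3 -U postgres -d failtest -c "CREATE TABLE t(id int);"    # 3-4. write on the new master
 $ pcp_recovery_node 20 192.168.100.4 9898 postgres 123456 1                     # 5. reattach the old master as a slave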


Thus, the mechanisms described above make the Master-Slave cluster fault tolerant and simplify the database administrator's work when it has to be restored.

P.S. I hope this post helps someone. Comments and additions are welcome! Thank you for your attention.

Source: https://habr.com/ru/post/188096/

