OpenSCG's HA failover solution “PGHA”. Master-Slave Solution for PostgreSQL 9.3

Below will be described the experience of setting up a Master-Slave fault-tolerant system using PostgresSQL's own resources, Asynchronous Replication + pgBouncer + PGHA, in a free translation format with comments from one post and installation guide.

pgHA

pgHA is a program written in Perl, the code of which is presented in the file failoverd.pl, which is the basis of the tool. Uses pgbouncer and PostgresSQL own replication in its work; monitors the leading database and in case of its inaccessibility opens the auxiliary database for recording, stopping replication. At least since version 9.3 is installed along with PostgresSQL, you can download it here . The installation process is described in the README file , which is in ./9.3/pgha/doc/.

On habrahabr, apparently, the most popular tool for the organization of resiliency is pgpool. Unfortunately, the existing individual experience does not allow to compare these two tools.
')

Customization

This system requires three machines: a master server (master), an auxiliary (slave) and a scaling node. You can do without the latter.

1. Configure replication

About setting up native replication has already been written in very many articles, under this pretext I just refer to some:

2. Configure pgBouncer

pgBouncer is an easy puller, connection manager for PostgreSQL. It is used as an input access point that redirects requests to the main database or auxiliary depending on the settings file. Installs with PostgresSQL 9.3.

You need to start the configuration by creating a bouncer user and a password to it on the server — the scaling node, or on any other machine that will monitor the status of the master server.
Next, prepare 2 files pgbouncer.ini.orig and pgbouncer.ini.failover. These are the pgbouncer configuration files and, in fact, differ only in the IP hosts of the database: the IP file and the port of the master database are specified in the .orig file, and the secondary one in the .failover file. Examples can be found here .
- you need to create an empty log file and specify the path to it
- still need to create a userlist.txt file with users under which requests will be made via pgbouncer and do not forget to specify the path to it (the auth_file field), an example can be found in ./9.3/share/doc/pgbouncer or here
- select from created users who will have admin rights in relation to the pgbouncer and specify them in the admin_users field
- you also need to specify the port on which pgbouncer (listen_port) will be located and further, after having familiarized yourself with other filled fields and having made any changes you want, you can refer to pgbouncer as usual DB ... after launch.
Create a pgbouncer.ini file and copy the contents of the pgbouncer.ini.orig file into it. This is a working file and pgbouncer.ini.orig or pgbouncer.ini.failover files will be copied to it later, depending on the situation. Create an empty pgbouncer.ini_old file, it will be used by pgha.
Run pgbouncer on behalf of bouncer:
pgbouncer -d //pgbouncer.ini

3. Configure pgHA

pgHA uses some Perl modules that need to be pre-installed. Before that, you need to install CPAN:

 yum install perl-CPAN perl -MCPAN -e 'install Module::Build::Compat' perl -MCPAN -e 'install Config::IniFiles' perl -MCPAN -e 'install DBI' perl -MCPAN -e 'install DBD::Pg' perl -MCPAN -e 'install Log::Log4perl' perl -MCPAN -e 'install Proc::Daemon' perl -MCPAN -e 'install Net::Ping' perl -MCPAN -e 'install Nagios::Plugin'

A harmless error may occur: yaml 'not installed persistent state. If annoying, you can do:

 cpan install Bundle::CPAN reload cpan exit

Next, you should prepare a configuration file for pgHA, examples of which can be found in the folder ./9.3/pgHA/cfg/ or here : the example.conf and scottmVm.conf files, it is better to focus on the second:
- pidFile can lie anywhere, you need to make sure that the selected path is available.
- logConfig is located in ./9.3/pgHA/cfg/log4perl.conf and contains logging settings
- for triggerFile, you need to specify the file specified in the recovery.conf
- the [failover] section is better to remove, just below it will be mentioned about it
- the [app01] section contains the pgbouncer settings, including the commands at the end of the file that, in case the master database fails, delete all connections to it and then restart the pgbouncer to work with the new main database:
```
 # Which databases should be part of failover fence_lock_dbs=postgres # pgBouncer command to pause / fence the db's fence_lock_command=kill # pgBouncer command to reload the config file fence_move_command=reload # pgBouncer command to unlock the db's fence_unlock_command=resume dbcheck="select 1" 
```
- in the [app01] section in the pgbouncer_db_user field there should be one of the admin_users specified in pgbouncer.ini
- should be familiar with the other parameters
Then create an empty log file pgHA.log in the / opt / pgHA / log / folder. This is the path that pgHA uses by default.
On the master server, create the table described in the file ./9.3/pgha/cfg/status_table.sql, or here in the database specified in the dbname field in the [master] section.
Generate ssh user key on behalf of which pgHA will be launched and send it to bouncer user:
- ssh-keygen -t rsa (create an empty password)
- copy the contents of ~ / .ssh / id_rsa.pub
- on the machine with the bouncer user, on its behalf, execute the commands:
mkdir ~/.ssh
chmod 0700 .ssh
vi authorized_keys (paste previously copied contents of id_rsa.pub, close with saving)
chmod 0600 .ssh/authorized_keys
Run pgHA.
./failoverd.pl -c //PgHA.conf --auto
The failoverd.pl file is in the ./9.3/pgHA/bin folder.
If step 6 passed without errors, stop pgHA
./failoverd.pl -c //PgHA.conf --stop
In this state, the pgHA fails automatically. You need to open the file failoverd.pl, comment out the line 443 and comment out the 442 according to the code presented here . It should work:
```
 # Start the failoverd monitoring process elsif ( $startWorker ) { print "\n\tInitializing... \n"; . . . # Once we have verified that the system is online, we need # to set the failover 'spring'. $logger->info("==Startup==: Arming Failover mechanism"); print "\t === Arming Failover mechanism === \n"; if ( SetSpring(%Config,$logger) != 1 ) # if ( 0 ) { $logger->info("Cannot set failover configuration, exiting..."); print "Cannot set failover configuration, exiting...\n"; exit(1); } . . . 
```
Now, if the master database fails, pgHA stops replication, opens the auxiliary database for writing, restarts pgbouncer with settings for robots with the new main database and completes its work. To return the system to its original state is required manually.
Run pgHA. To test.

PS

The [failover] section indicates the commands that are executed during the failover. Thus, it is possible, for example, not to change the code in failoverd.pl, but to create a script - which changes the pgbouncer settings file and reloads it - and point to it.

This solution does not reduce the speed of the programs using the database, and with good tuning, the pgbouncer can increase it. I also consider the openness of the pgHA code a plus - it can be supplemented to teach, for example, attempts to reload the master database before failover, resume replication, change the font in the terminal, etc.

The postgres 8.4 solution is presented in the aforementioned blog .

Source: https://habr.com/ru/post/223825/

All Articles