📜 ⬆️ ⬇️

Half-HA cluster PostgreSQL on Windows 2012

Thanks


Thank you, Habr, for what you are and for those wonderful hours that I spent in you! Thank you, brave habrifiers, for the articles of high quality , subtle humor and a broad outlook. Only thanks to you and Mar. Ivanovna (who taught me to read in the first class) I achieved serious success in IT, and now I want to share a story when non-standard thinking helped solve the idiomatic task from the customer.

Introduction


Some time ago I was working on a proprietary product. This software stored its data in an external DBMS. Initially, work was carried out with MS SQL, but later, specifically for users “I only put * nix-like systems”, they made compatibility with PostgreSQL . Then they took me to solve the problems of the class “the database fell - you need to recover from the transaction logs”.

One day, the team went to implement another project. Initially, it was agreed that the site already has two Windows 2012 servers on which the MS SQL failover cluster will be deployed. However, at the last moment, the customer did not have enough money for instances, and I was instructed to set up a roostge database, as well as somehow solve the fault tolerance task. Thanks to the community of postgres developers who have compiled the installer for Windows. Tasks:

  1. Install PostgreSQL on Windows;
  2. Provide data replication between databases;
  3. Resolve the issue of fault tolerance.

1. Installing PostgreSQL


Everything is simple here:
')

Note: Stack bilder at the end can not run - we will not need its elements.

2. Data Replication


By default, we will install Slony-I, however, asynchronous replication is already in the “box”, which we will use. Sometimes access to the linux command line and the IBs settings files were refused to be issued, and I had to configure the DBMS via psql, so mad skillz would go on. Here and later, all utilities are taken from the C: \ Program Files \ PostgreSQL \ 9.4 \ bin \ folder (it would be correct to specify the $ PGDATA variable, but for some reason it was not set by the installer by default). On the master:

Connect to the DBMS with the command:

psql.exe -U postgres 

Enable data duplication mode:

 alter system set wal_level=hot_standby; 

Disable archiving mode:

 alter system set archive_mode=off; 

Allow to create replication slots:

 alter system set max_replication_slots=1; 

We set the maximum number of data replication processes (the second process will serve the pgadmin utility):

 alter system set max_wal_senders=2; 

Execute the command "\!" and ... Haha! We are in the Windows command line. We edit the pg_hba.conf file (hereinafter, we mean the files from the C: \ Program Files \ PostgreSQL \ 9.4 \ data folder). He is responsible for access of hosts / users to the DBMS. Add lines:

 #        host all all <ip-> md5 #           host all all <ip-slave> md5 host replication postgres <ip-> trust #   postgres        host replication postgres <ip-slave> trust</i><br> 

On the slave:

1. Stop the postgresql service.

2. Delete the contents of the folder C: \ Program Files \ PostgreSQL \ 9.4 \ data

3. We make a backup from the wizard using the pg_basebackup command . As parameters we use:

 pg_basebackup -h <ip-> -D "C:\Program Files\PostgreSQL\9.4\data" -U postgres 

At the end of the utility, it is indignant that WAL archiving is not configured, but we don’t need this in principle because of the next step (as a bonus, saving disk space).

4. Go to the postgresql.conf file and set the parameters:

 #  " " hot_standby=on #     hot_standby_feedback=on #    WAL- wal_receiver_status_interval=0 

We return to the master:

1. Assign a host to the master:

 alter system set hot_standby=off; 

2. Turn off waiting for response from the slave:

 alter system set hot_standby_feedback=off; 

3. Turn off the interval in the transfer of WAL-log:

 alter system set wal_receiver_status_interval=0; 

4. Restart the DBMS service on the wizard (without leaving psql):

 \! pg_ctl restart -D "C:\Program Files\PostgreSQL\9.4\data" 

5. Create a replication slot:

 select pg_create_physical_replication_slot('slot_1'); 

3. Fault tolerance


Slony-I was not part of the project specification, and it was lazy for him to invent PowerShell scripts. The customer completely refused to install an additional linux server, so the option with pgpool-II or pgbouncer was dropped (and it’s not clear how to work with Windows). Therefore, there was a transition to the recording mode on the file-trigger. Let's finish the slave to go to master mode. To do this, you need to create a recovery.conf file and add lines to it:

 standby_mode=on #    primary_conninfo='host=<ip_> port=5432 user=postgres' #     primary_slot_name=slot_1 #     trigger_file=startmaster #  ,          

Note: if the slave enters master mode, the file will change its name to recovery.done .

We start the postgres service on the slave. In case of successful setup, it will start correctly and will drag the data from the wizard. You can check replication:

- On the master using the command:

 select (active) from pg_replication_slots; 

- On the slave, you can try to create an object and get a read-only transaction error. For the client, you can create a test table on the wizard, which is automatically duplicated on the slave.

In the event of a master failure, the potential sysadmin must perform the following sequence of actions:

a) Make sure that the write wizard is unavailable;
b) Check that the ip-address of the database cluster is free. For example, pinganut ip or cut down the network interface wizard;
c) Change the slave ip to the master;
d) Point the slave to work like a master. To do this, you can use pg_ctl promote (an error raised by service privileges has been thrown out on the object) or, since we configured the reconfiguration trigger file, create an empty startmaster file.

In addition (what was not included in the scope of the task): you can make a slave from a fallen master by following the slave configuration items from the second stage.

My boss joked: " Well, you know Sharpe - write! ". For two days the attempt in the field on my knee - I took and gash . About the basics of OOP and thread safety, please write all comments here ).

The program uses the Npgsql library to work with the DBMS. You need to unpack the archive on the slave and run the program for execution.


Next you need to enter:

- ip slave;
- ip wizard;
- ip cluster (the main ip in theory should coincide with the ip-master, but anything can happen);
- the name of the database to which the connection will be made (to create a test table testofcluster in it)

To perform a cluster check, click the Test button. The program will check that it is running on the slave, the wizard allows you to write to the test table and replication occurs in the normal mode. As a bonus , the recovery.conf, postgresql.conf, postgresql.auto.conf files on the slave will be saved, which will simplify the procedure for transferring the master to slave mode.

Start monitoring: the software cyclically checks the availability of the wizard and in the event of its failure - transfers the left to the recording mode and changes the ip from the left to the cluster one.

Specification:
- Must be installed dotnet 4.5
- Compatibility tested on Windows 7 x64, Windows 2012 with PostgreSQL 9.4 (x64).
- The response time to turn off the wizard is from 20 to 35 seconds (a phased check of the availability of the wizard and the main ip is performed).
- The transfer time from the left to the master is less than 5 seconds (restart of the service is not required).
- Note: on virtual machines (on VMware Workstation - 100%), the ip-address does not change.
- Windows should be Russified; In PostgreSQL, the Russian locale must be selected.

Download: download the archive with the compiled program and two libraries can be here .

Source: https://habr.com/ru/post/308950/


All Articles