Parable about synchronous replication and how to get rid of it

Introductory

Suppose we set up asynchronous replication on two hosts with PostgreSQL. What problem can potential users encounter? In my opinion the most obvious is the lag of the replica from the leading master. It doesn't matter if the user (or his client application) opened BEGIN. At this point, the connection may break, and all uncommitted changes will be rolled back.
But a situation is possible when the leading DBMS issues a COMMIT to the user, it drops sharply, and the transaction does not have time to reach the slave. Then, when the replica is restored, this data will fall under the ROLLBACK (since in fact, not the final data, but the entire sequence of actions are transmitted to the slave node), and the user will mistakenly assume that his data has been successfully saved.

Synchronous replication

Let's solve this problem in two ways! Go to PGDATA (the default for Windows is % ProgramFiles% \ PostgreSQL \ 9.4 \ bin ) and run the DBMS connection utility:

psql.exe -h <ip-> -U postgres

Specify the IP or DNS replicas, the record with which will be synchronized. You can use the "*" parameter, since in the case of using physical replication the number of slave hosts will be limited to the slots allocated for them:

 ALTER SYSTEM SET synchronous_standby_names = '10.10.1.2, 10.10.1.3';

Set the master synchronization level:
')

 ALTER SYSTEM SET synchronous_commit = 'on';

All changes will go to the % PGDATA% \ postgresql.auto.conf file and will be transferred to the main postgresql.conf after restarting the DBMS service. But we can not wait to activate the changes and we recalculate the configuration of the built-in function, which returns "t" if successful:

 SELECT pg_reload_conf(); #

Note: The above settings for running replicas cannot be made — thanks to the prudent postgres developer community for these checks. Not all parameters, such as max_replication_slot , can be activated without restarting the DBMS (some of them have to be changed at the source compilation stage)).

Check that replication continues to work on the slaves:

 SELECT (slot_name, active) from pg_replication_slots;

What we got in the end?

Technically, now the COMMIT transaction is first on the replica). There is an opportunity to read data from the slave, unloading the main server. As a bonus, user selects will only make changes on the node that was accessed.

At high loads, the speed of client work will decrease: the delay in the slave node will be affected when you confirm the transaction record. “Losses” are minimized by postgres tuning and competent planning of working with the database. For example, when writing data with low priority, it is possible to set the session parameter:

 SET synchronous_commit = 'local';

The parameter takes the specified value before the end of the connection. The value off , in my opinion, is used only for frank garbage.

Failure of the slave.

In case of failure of the slave node, the master server will wait indefinitely for confirmation of the record of the transaction. There are three ways out of this situation:

To teach users \ application in case of a transaction hang cancel it and put their sessions in local recording mode.
Connect the second slave node to the master by creating a second physical slot for it, since in PostgreSQL it is enough to confirm the record of transactions from a single slave host. If it is necessary to keep within the scheme of two physical machines with linux-like operating systems, then you can put the second slave on the master computer, specifying another port of the service. It will not work to install 2 postgres through the installer on Windows - the installer correctly suggests opening the 5433 port (5432 is already occupied by the first DBMS), creates a unique name for the second service, but starts it with an error.
Write your own program that will track the status of data transfer from the master to the slave host, which I will discuss in the second part of the article.

PostgreSAM

The program is a single-threaded utility that locally checks the status of data replication through the postgres user interface. Because of the simplification of the logic of work, it is enough to specify only the postgres account password before starting.

In case the master loses connection with the slave, the utility turns off the data synchronization option on the master server and recalculates the configuration. The screen displays a message about the changes. When the connection with the slave is restored, you can press the “OK” key - PostgreSAM reconfigures the master server to its original state. The utility also works on the slave: a message about the unavailability of the wizard will be displayed, but no changes will be made to the work of the node. At the work of users, restarting the configuration does not manifest itself in any way: the connection does not terminate, and the "hanging" transactions are successfully completed.
For communication with the DBMS, the Npgsql library is used. The .NET provider for PostgreSQL. For its correct operation, the Mono.Security library is also required - since the program does not require installation, both libraries with an executable file are in the archive . If you plan to use this library in your project, then you need to add it to the assembly and declare:

 using Npgsql;

Now we have the NpgsqlConnection class with a constructor that accepts a string parameter by default:

 "Server=<ip__dns>;User Id=<>;Password=<>;Database=<>;Timeout=<>;"

If you skip the Database, Timeout, the database will be used, identical to the user name, and a timeout of 20 seconds. After that, it becomes possible to call the Open function built into the class - the connection to the database we need will open. The following uses the NpgsqlCommand class:

 NpgsqlCommand <_> = new NpgsqlCommand(<_SQL>, <__NpgsqlConnection>);

In case of a connection establishment error or incorrect execution of the command transmitted to the DBMS, an exception is created, which I recommend to catch with the try-catch combination. The data of the executed queries is read using a bunch of objects of the NpgsqlCommand classes (when creating an object, you must specify the SQL query and connection string), as well as the NpgsqlDataReader:

 NpgsqlDataReader <_> = <__NpgsqlCommand>.ExecuteReader();

For the created variable, the Read () function is provided. The advantages of Npgsql I would consider the provided support for work on an SSL connection, as well as the ability to send requests and receive responses in plain text. Exeptions immediately fall into the minuses - although there is a CheckConnectionState in NpgsqlCommand, anyway, in case of a connection error, an InvalidOperationException is generated.

At the end of the execution of the DBMS commands, you must close the connection with the function of the NpgsqlConnection class - Close ().

Source: https://habr.com/ru/post/309464/

All Articles