📜 ⬆️ ⬇️

Logical replication in PostgreSQL 10

PG Day'17 continues to delight you with copyrighted articles. Today, our old friend and the undisputed author of provocative articles on web development varanio will talk about logical replication.


At first, I wanted to call the article "Harry Potter and the Philosopher's Stone", because for many years when comparing PostgreSQL with MySQL, someone always appeared and noticed that there was no logical replication in the Posgrace (you can only replicate the entire database, and the replica read only) and in MySQL there are two of them: statement based and row based.


And if the statement based is a time bomb with a laser sight in the foot, then row based really really lacked in PG. Those. the question of replication is like a philosopher's stone among lovers of bases.


More precisely, in the case of always it was possible to use slony in order, for example, to replicate only one or two necessary tables. But slony is a tricky part on triggers, which works according to the principle: work - do not touch. Those. for example, it is impossible to simply take and make ALTER TABLE ADD COLUMN, this should be done through special mechanisms. If, nevertheless, someone accidentally did this, and then, even worse, after some time in a panic, returned it as it was, then only a black magician 80lvl can quickly resolve this situation. In addition to slony, starting from 9.4, it became possible to write your extensions for logical replication through wal, like, an example of such an extension is pglogical.


But this is all wrong!


When I found out that the commit that allows you to logically replicate individual tables out of the box, without extensions and plug-ins, fell into the dev-branch of PostgreSQL 10, I decided to see how it works there.


We put PostgreSQL from source on ubuntu


It was not at all difficult. We put all sorts of semi-intelligible words necessary for assembly:


sudo apt-get install avada kedavra expelliarmus
sudo apt-get install -y build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev


Downloading source:


git clone git://git.postgresql.org/git/postgresql.git


We collect all this business:


 cd postgresql ./configure make make install 

I wrote make install, it irrevocably makes your / usr / local irreparable, so better do it in the virus or docker-container, or ask a real welder how to do it carefully. I am absolutely not admin, so sorry.


Run test daemons


Head over to postgres user. If you don’t have one yet, then create one. If there is, but not to log in under it, then probably just the password is not set, then you need to do sudo passwd postgres. So, let's log in under postgres user:


su - postgres


Create the master and slave folders somewhere and browse the database there:


 /usr/local/pgsql/bin/initdb -D ~/master /usr/local/pgsql/bin/initdb -D ~/slave 

Those. we will have two local pg daemon that will replicate individual tables to each other. Let one work on port 5433, the other on 5434.


To do this, enter the port = 5433 string in ~/master/postgresql.conf , and the port = 5434 string in ~/slave/postgresql.conf , respectively.


In both postgresql.conf configurations, you must specify:


 wal_level = logical 

In addition, in order for replication to work, you need to uncomment the line in pg_hba.conf :


 local replication postgres trust 

We launch both daemons:


 /usr/local/pgsql/bin/pg_ctl start -D ~/master -l ~/master.log /usr/local/pgsql/bin/pg_ctl start -D ~/slave -l ~/slave.log 

Configuring Replication


Unfortunately, there is no (generally no) documentation yet. So I had to go over the tests a bit to the source code in order to learn at least something about how to configure and use this type of replication.


We will do everything right in the built-in postgres database, so as not to clutter up the details. Go to the master:


 /usr/local/pgsql/bin/psql -p 5433 

We will create a table and a "publication":


 CREATE TABLE repl ( id int, name text, primary key(id) ); CREATE PUBLICATION testpub; 

Add to the publication all the necessary tables (in this case - one):


 ALTER PUBLICATION testpub ADD TABLE repl; 

Now on the side of the slave:


 /usr/local/pgsql/bin/psql -p 5434 

Also create a table:


 CREATE TABLE repl ( id int, name text, primary key(id) ); 

Now you need to create a subscription to the publication, in which we specify the connection string to another database and the name PUBLICATION:


 CREATE SUBSCRIPTION testsub CONNECTION 'port=5433 dbname=postgres' PUBLICATION testpub; 

Check


Paste on the master:


 INSERT INTO repl (id, name) VALUES (1, ''); 

We read on the replica:


 postgres=# select * from repl; id | name ----+------ 1 |  (1 row) 

It works!


Now stop the replica:


 /usr/local/pgsql/bin/pg_ctl stop -D ~/slave 

On the master we will do:


 delete from repl; insert into repl (id, name) values (10, 'test'); 

We start the slave and check:


 /usr/local/pgsql/bin/pg_ctl start -D ~/slave -l ~/slave.log 

 postgres=# select * from repl; id | name ----+------ 10 | test (1 row) 

Everything worked.


Moreover, I added a new column to the master and to the slave, inserted records, and this also worked.


If you want to learn more examples of use, look at the source file src/test/subscription/t/001_rep_changes.pl . He is on a pearl, but everything is clear there.


src / test / subscription / t / 001_rep_changes.pl
 # Basic logical replication test use strict; use warnings; use PostgresNode; use TestLib; use Test::More tests => 11; # Initialize publisher node my $node_publisher = get_new_node('publisher'); $node_publisher->init(allows_streaming => 'logical'); $node_publisher->start; # Create subscriber node my $node_subscriber = get_new_node('subscriber'); $node_subscriber->init(allows_streaming => 'logical'); $node_subscriber->start; # Create some preexisting content on publisher $node_publisher->safe_psql('postgres', "CREATE TABLE tab_notrep AS SELECT generate_series(1,10) AS a"); $node_publisher->safe_psql('postgres', "CREATE TABLE tab_ins (a int)"); $node_publisher->safe_psql('postgres', "CREATE TABLE tab_full AS SELECT generate_series(1,10) AS a"); $node_publisher->safe_psql('postgres', "CREATE TABLE tab_rep (a int primary key)"); # Setup structure on subscriber $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_notrep (a int)"); $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_ins (a int)"); $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_full (a int)"); $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_rep (a int primary key)"); # Setup logical replication my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres'; $node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub"); $node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub_ins_only WITH (nopublish delete, nopublish update)"); $node_publisher->safe_psql('postgres', "ALTER PUBLICATION tap_pub ADD TABLE tab_rep, tab_full"); $node_publisher->safe_psql('postgres', "ALTER PUBLICATION tap_pub_ins_only ADD TABLE tab_ins"); my $appname = 'tap_sub'; $node_subscriber->safe_psql('postgres', "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub, tap_pub_ins_only"); # Wait for subscriber to finish initialization my $caughtup_query = "SELECT pg_current_xlog_location() <= replay_location FROM pg_stat_replication WHERE application_name = '$appname';"; $node_publisher->poll_query_until('postgres', $caughtup_query) or die "Timed out while waiting for subscriber to catch up"; my $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tab_notrep"); is($result, qq(0), 'check non-replicated table is empty on subscriber'); $node_publisher->safe_psql('postgres', "INSERT INTO tab_ins SELECT generate_series(1,50)"); $node_publisher->safe_psql('postgres', "DELETE FROM tab_ins WHERE a > 20"); $node_publisher->safe_psql('postgres', "UPDATE tab_ins SET a = -a"); $node_publisher->safe_psql('postgres', "INSERT INTO tab_rep SELECT generate_series(1,50)"); $node_publisher->safe_psql('postgres', "DELETE FROM tab_rep WHERE a > 20"); $node_publisher->safe_psql('postgres', "UPDATE tab_rep SET a = -a"); $node_publisher->poll_query_until('postgres', $caughtup_query) or die "Timed out while waiting for subscriber to catch up"; $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_ins"); is($result, qq(50|1|50), 'check replicated inserts on subscriber'); $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_rep"); is($result, qq(20|-20|-1), 'check replicated changes on subscriber'); # insert some duplicate rows $node_publisher->safe_psql('postgres', "INSERT INTO tab_full SELECT generate_series(1,10)"); # add REPLICA IDENTITY FULL so we can update $node_publisher->safe_psql('postgres', "ALTER TABLE tab_full REPLICA IDENTITY FULL"); $node_subscriber->safe_psql('postgres', "ALTER TABLE tab_full REPLICA IDENTITY FULL"); $node_publisher->safe_psql('postgres', "ALTER TABLE tab_ins REPLICA IDENTITY FULL"); $node_subscriber->safe_psql('postgres', "ALTER TABLE tab_ins REPLICA IDENTITY FULL"); # and do the update $node_publisher->safe_psql('postgres', "UPDATE tab_full SET a = a * a"); # Wait for subscription to catch up $node_publisher->poll_query_until('postgres', $caughtup_query) or die "Timed out while waiting for subscriber to catch up"; $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_full"); is($result, qq(10|1|100), 'update works with REPLICA IDENTITY FULL and duplicate tuples'); # check that change of connection string and/or publication list causes # restart of subscription workers. Not all of these are registered as tests # as we need to poll for a change but the test suite will fail none the less # when something goes wrong. my $oldpid = $node_publisher->safe_psql('postgres', "SELECT pid FROM pg_stat_replication WHERE application_name = '$appname';"); $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub CONNECTION 'application_name=$appname $publisher_connstr'"); $node_publisher->poll_query_until('postgres', "SELECT pid != $oldpid FROM pg_stat_replication WHERE application_name = '$appname';") or die "Timed out while waiting for apply to restart"; $oldpid = $node_publisher->safe_psql('postgres', "SELECT pid FROM pg_stat_replication WHERE application_name = '$appname';"); $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub SET PUBLICATION tap_pub_ins_only"); $node_publisher->poll_query_until('postgres', "SELECT pid != $oldpid FROM pg_stat_replication WHERE application_name = '$appname';") or die "Timed out while waiting for apply to restart"; $node_publisher->safe_psql('postgres', "INSERT INTO tab_ins SELECT generate_series(1001,1100)"); $node_publisher->safe_psql('postgres', "DELETE FROM tab_rep"); $node_publisher->poll_query_until('postgres', $caughtup_query) or die "Timed out while waiting for subscriber to catch up"; $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_ins"); is($result, qq(150|1|1100), 'check replicated inserts after subscription publication change'); $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_rep"); is($result, qq(20|-20|-1), 'check changes skipped after subscription publication change'); # check alter publication (relcache invalidation etc) $node_publisher->safe_psql('postgres', "ALTER PUBLICATION tap_pub_ins_only WITH (publish delete)"); $node_publisher->safe_psql('postgres', "ALTER PUBLICATION tap_pub_ins_only ADD TABLE tab_full"); $node_publisher->safe_psql('postgres', "DELETE FROM tab_ins WHERE a > 0"); $node_publisher->safe_psql('postgres', "INSERT INTO tab_full VALUES(0)"); $node_publisher->poll_query_until('postgres', $caughtup_query) or die "Timed out while waiting for subscriber to catch up"; # note that data are different on provider and subscriber $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_ins"); is($result, qq(50|1|50), 'check replicated deletes after alter publication'); $result = $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_full"); is($result, qq(11|0|100), 'check replicated insert after alter publication'); # check all the cleanup $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub"); $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM pg_subscription"); is($result, qq(0), 'check subscription was dropped on subscriber'); $result = $node_publisher->safe_psql('postgres', "SELECT count(*) FROM pg_replication_slots"); is($result, qq(0), 'check replication slot was dropped on publisher'); $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM pg_replication_origin"); is($result, qq(0), 'check replication origin was dropped on subscriber'); $node_subscriber->stop('fast'); $node_publisher->stop('fast'); 

In particular, if you create a table without a primary key, then to remove values ​​from it, you need to write:


 ALTER TABLE tablename REPLICA IDENTITY FULL 

I do not know how it works, apparently some id on the fly. If someone has more information on logical replication, share pliz in comments.


findings


The conclusion is very simple: I'm really looking forward to PostgreSQL 10 in the production-ready state, since this will solve a whole layer of organizational problems (you can throw out slony). For someone, perhaps this will be the last straw for the transition from MySQL to Postgres.


On the other hand, how it will work in practice, so far no one knows. Will it be fast enough, convenient to maintain and so on. If someone has more information on the topic, share pliz in comments.


In the meantime, we are waiting for PostgreSQL 10, for sure, you have a lot of questions on the training of current replication methods. At PG Day'17 you will find a large number of interesting reports and workshops on PostgreSQL. For example, Ilya Kosmodemyansky will tell all about GHG tuning, transaction processing, auto-vacuum and, of course, will tell you how to avoid common mistakes. Hurry up to register !


')

Source: https://habr.com/ru/post/326996/


All Articles