
Bucardo: Multimaster replication

While struggling with the setup, I shoveled through a ton of articles and decided to write a detailed manual, because information on configuring multimaster replication, especially in Russian, is scarce and fragmentary.

A bit of an introduction. For Bucardo to work, we must:

1) Tell it which databases exist on which servers.

2) Tell it which tables are involved in replication.

Attention: if the developers add a new table to the application, Bucardo must be told about it. The same goes for schema changes to existing tables.

3) Tell it which groups of tables exist and which tables belong to which group. Groups are needed when different tables have to be replicated between different servers; it is more convenient to work with a group than to list each table separately (very similar to host groups in Nagios).

4) Tell it which database groups exist. The purpose is the same as for table groups.

Let's proceed to the installation. This variant is for Debian 7; it is assumed that the postgresql-9.1 and postgresql-client-9.1 packages are already installed.

Preliminary preparation

The servers will be called node1 and node2. Also make sure that all participating PostgreSQL servers listen on external interfaces:

 # netstat -plnt4 | grep 5432
 tcp        0      0 0.0.0.0:5432          0.0.0.0:*               LISTEN      12345/postgres
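
If a server only shows 127.0.0.1:5432 here, the usual fix is the listen_addresses parameter in postgresql.conf; a minimal sketch (the exact value depends on your network):

 # grep listen_addresses /etc/postgresql/9.1/main/postgresql.conf
 listen_addresses = '*'          # or a comma-separated list of specific addresses
 # pg_ctlcluster 9.1 main restart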

Install the Bucardo package and PL/Perl support for PostgreSQL on each of the servers:

 # apt install bucardo postgresql-plperl-9.1 

Activate on each of the servers:

 # sed -i 's/ENABLED=0/ENABLED=1/' /etc/default/bucardo 

For some reason, the package maintainers did not think to create a directory for the PID file, so we will create it ourselves on each of the servers:

 # mkdir /var/run/bucardo 

Make sure that we can connect via a TCP socket to the DBMS on each of the servers:

 # psql -U postgres -h 127.0.0.1 

If you do not remember the postgres password, reset it first; a minimal sketch is below.
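
One common recovery path (a sketch; 'new_password' is a placeholder): temporarily change the "local all postgres" line in /etc/postgresql/9.1/main/pg_hba.conf to trust, reload the cluster, set a new password, then revert the change and reload again:

 # pg_ctlcluster 9.1 main reload
 # psql -U postgres -c "ALTER USER postgres WITH PASSWORD 'new_password';"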

If PostgreSQL refuses connections from a particular address or user, adjust /etc/postgresql/9.1/main/pg_hba.conf.

Next comes initialization of the Bucardo database. It will be created by the postgres user but populated by the bucardo user, so you can run into a connectivity problem. To avoid it, we add a line for the bucardo user to /etc/postgresql/9.1/main/pg_hba.conf in advance. In addition, while working, Bucardo will connect not only to its own cluster node but also to the partner node, so do not forget that one either. If you have more servers in the cluster, do not forget about them as well. On each of the servers:

 host    all    bucardo    127.0.0.1/32                 trust
 host    all    bucardo    SECOND.NODE.IP.ADDRESS/32    password

After this, restarting the DBMS:

 # pg_ctlcluster 9.1 main restart 

Install Bucardo

The bucardo_ctl utility in recent versions of Debian has been replaced by bucardo, so we will use it.

Initialize the database:

 # bucardo install 

The dialogue looks like this:

 # bucardo install
 This will install the bucardo database into an existing Postgres cluster.
 Postgres must have been compiled with Perl support, and you must connect as a superuser

 Current connection settings:
 1. Host:           localhost
 2. Port:           5432
 3. User:           postgres
 4. Database:       postgres
 5. PID directory:  /var/run/bucardo
 Enter a number to change it, P to proceed, or Q to quit: P

 Password for user postgres:
 Postgres version is: 9.1
 Password for user postgres:
 Creating superuser 'bucardo'
 Password for user postgres:
 Attempting to create and populate the bucardo database and schema
 Password for user postgres:
 Database creation is complete

 Updated configuration setting "piddir"
 Installation is now complete.
 If you see errors or need help, please email bucardo-general@bucardo.org

 You may want to check over the configuration variables next, by running:
 bucardo show all
 Change any setting by using: bucardo set foo=bar

During initialization the database was created from the /usr/share/bucardo/bucardo.schema file, so there is no need to populate it by hand, as described in manuals for previous versions.

Bucardo is installed, you can run it:

 # bucardo start 
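
To make sure the daemon actually came up, a quick check might look like this (the exact output varies between Bucardo versions):

 # bucardo status
 # ls /var/run/bucardo/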

Replication setup

Before configuring replication, let's create test databases that will be replicated.
On each of the servers:

 # psql -U postgres -c "CREATE DATABASE mydb;"
 # psql -U postgres mydb -c "CREATE TABLE mytable ( num123 integer PRIMARY KEY, abc varchar(10) );"

Another important point concerning security. After the replicated database is added to the configuration, Bucardo will store the connecting user's password in its own database. Since it did not ask for one during installation, it made the bucardo password exactly the same as that of the postgres user. In other words, the superuser's password will be stored in clear text in the bucardo database, which is somewhat dangerous.

Therefore, we will give it a different password. On each of the servers:

 # psql -U postgres -c "ALTER USER bucardo WITH PASSWORD 'eiP4uSash5';" 

Next, we tell Bucardo how to connect to the database that we will replicate. I am not a fan of Unix sockets under high load (a separate topic of conversation), so even for the local connection we specify a TCP socket.

ATTENTION: we do this on the node1 server. From here on we keep working only on node1, unless it is explicitly stated that something should be done on both.

Add the local database (mydb_node1) and its remote copy (mydb_node2) located on the node2 server:

 # bucardo add database mydb_node1 dbname=mydb dbhost=127.0.0.1 dbuser=bucardo dbpass=eiP4uSash5
 Added database "mydb_node1"

 # bucardo add database mydb_node2 dbname=mydb dbhost=node2.example.com dbuser=bucardo dbpass=eiP4uSash5
 Added database "mydb_node2"

Here:

mydb_nodeX is an internal database designation; this is the name Bucardo uses for the database in its own bookkeeping.
dbname=mydb is the real database name in PostgreSQL to which mydb_nodeX refers.
dbuser=bucardo is the user under which Bucardo will connect to that database.
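
It may also be worth verifying by hand that these credentials actually work from node1, both locally and towards node2 (a quick check of my own, using the password set above):

 # psql -U bucardo -h 127.0.0.1 -d mydb -c "SELECT 1;"
 # psql -U bucardo -h node2.example.com -d mydb -c "SELECT 1;"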

We can see the result like this:

 # bucardo list database
 Database: mydb_node1  Status: active  Conn: psql -p -U bucardo -d mydb -h 127.0.0.1
 Database: mydb_node2  Status: active  Conn: psql -p -U bucardo -d mydb -h node2.example.com

These settings are stored in the db table of the bucardo database, where the password mentioned above sits:

 # psql -U postgres bucardo -c "SELECT name,dbname,dbhost,dbuser,dbpass,status FROM db;"
     name    | dbname |      dbhost       | dbuser  |   dbpass   | status
 ------------+--------+-------------------+---------+------------+--------
  mydb_node1 | mydb   | 127.0.0.1         | bucardo | eiP4uSash5 | active
  mydb_node2 | mydb   | node2.example.com | bucardo | eiP4uSash5 | active
 (2 rows)

Now we need to add the tables that we will replicate between them. In most cases people replicate the entire database, so we will add everything at once (a table group, called a herd, will be created automatically). If the developers come up with a new table, we will simply add it to the group later (see the sketch after the herd listing below) and everything will keep working, because all further settings apply to the whole group.

 # bucardo add table all --db=mydb_node1 --herd=mydb_herd
 Creating herd: mydb_herd
 Added table public.mytable to herd mydb_herd
 New tables added: 1

Here:

--herd=mydb_herd is the name of the table group, so that synchronization can be set up for the whole group rather than for each table separately.

And we can immediately see it:

 # bucardo list tables
 1. Table: public.mytable  DB: mydb_node1  PK: num123 (int4)

Note the PK: Bucardo cannot replicate tables that have no primary key or unique index, so take care of this in advance.

And the group is also visible:

 # bucardo list herd
 Herd: mydb_herd  DB: mydb_node1  Members: public.mytable
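
When the developers add a new table later, bringing it into replication might look roughly like this (a sketch; public.newtable is a hypothetical name, and the table must already exist with the same structure on all nodes):

 # bucardo add table public.newtable --db=mydb_node1 --herd=mydb_herd
 # bucardo restart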

The same goes for sequences. There are none in our example, but someone may well use them. We will not create a separate group for them, so as not to complicate things: the likelihood that tables are replicated in one direction and sequences in the other is extremely small. So let one group hold both tables and sequences:

 # bucardo add sequence all --db=mydb_node1 --herd=mydb_herd
 Sorry, no sequences were found
 New sequences added: 0

Our next task is to create a database group for replication. In this group we will say which database is the source of data and which is the recipient. First create the group itself, empty for now:

 # bucardo add dbgroup mydb_servers_group
 Created database group "mydb_servers_group"

We add both of our servers to the group, indicating which role each will play. This is the only point where the master-slave setup differs from the master-master one.

At first you might think that source is the sender and target the recipient. In fact this is not quite true: a source acts as both sender and recipient, whereas a target is only a recipient.

That is, for master-slave we declare one database as source and the other as target. For master-master both are declared as source, and there are no targets at all.

Option for MASTER -> SLAVE:

 # bucardo add dbgroup mydb_servers_group mydb_node1:source
 Added database "mydb_node1" to group "mydb_servers_group" as source
 # bucardo add dbgroup mydb_servers_group mydb_node2:target
 Added database "mydb_node2" to group "mydb_servers_group" as target

Option for MASTER <-> MASTER:

 # bucardo add dbgroup mydb_servers_group mydb_node1:source
 Added database "mydb_node1" to group "mydb_servers_group" as source
 # bucardo add dbgroup mydb_servers_group mydb_node2:source
 Added database "mydb_node2" to group "mydb_servers_group" as source

That's it! We have recorded which databases exist, which tables they contain, and who belongs to which group. The final touch is to say which group of tables will travel between the databases of which group. In other words, to create a "sync":

 # bucardo add sync mydb_sync herd=mydb_herd dbs=mydb_servers_group
 Added sync "mydb_sync"

We can see what we did:

 # bucardo list sync
 Sync: mydb_sync  Herd: mydb_herd [Active]
   DB group mydb_servers_group: mydb_node1 (source) mydb_node2 (source or target, depending on which option you chose)

After changing the settings, be sure to restart Bucardo:

 # bucardo restart 
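
After the restart it is worth checking that the sync is actually running; a couple of quick checks (the output format differs between Bucardo versions, and the per-sync form is an assumption worth verifying on yours):

 # bucardo status
 # bucardo status mydb_sync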

========

Check: on the first node, node1, we run:

 # psql -U postgres mydb -c "INSERT INTO mytable VALUES (1, 'a');" 

and on the second node, node2, we check:

 # psql -U postgres mydb -c "SELECT * FROM mytable;" 

Those who set up multimaster should also check the opposite direction: create a row on node2 and check it on node1.
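
For the multimaster case the reverse check could be, for example:

 node2# psql -U postgres mydb -c "INSERT INTO mytable VALUES (2, 'b');"
 node1# psql -U postgres mydb -c "SELECT * FROM mytable;"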

========

Questions that most people will have:

1) What will happen to the table on the target database if the table on the source database was changed while Bucardo was stopped or the network was unavailable?

Answer: everything will be fine. On startup, or as soon as the network is back, Bucardo will transfer the accumulated data to the target server, so the target server can go down however it likes. The only requirement is that it must have the same schema (table structure) as the first one.
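
If you do not want to wait for the next automatic run after such an outage, a sync can also be pushed by hand (a sketch; check the kick command's behaviour on your Bucardo version):

 # bucardo kick mydb_sync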

__

2) If the database is large (tens or hundreds of gigabytes), Bucardo "breaks off" and does not finish the initial synchronization. What to do?

Answer: put the sync into the inactive state, but keep Bucardo enabled on the source database so that changes continue to be logged.

 # bucardo update sync mydb_sync status=inactive
(for multimaster: on all nodes)

Then do the pg_dump / pg_restore by hand and return the sync to active mode (for multimaster, first on the node that started receiving new writes after the dump was taken).
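
A rough sketch of the manual copy step, assuming the databases from this article and running the dump on node1 (adjust hosts and pg_dump options to your data volume):

 # pg_dump -U postgres mydb | psql -U postgres -h node2.example.com mydb
 # bucardo update sync mydb_sync status=active
 # bucardo restart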

Source: https://habr.com/ru/post/327674/

