
In the previous article, I talked about a project for automating the deployment of Docker containers, whose development started earlier this year. Several months have passed, Fabricio has been significantly improved, and today I want to tell you about one of its latest additions: automatic deployment of master-slave configurations for PostgreSQL.
Running PostgreSQL in containers is not the most popular idea, and there is a rational explanation for this: there is no point in adding extra network latency to a service that is already heavily loaded. But there are cases where such a setup can still be appropriate. For example, when you trust Docker to run your database and the database is not under heavy load, but the ability to replicate the stored data to several servers is important. Or simply when you want to test and debug settings before applying them to production servers.
To avoid boring the reader (and user) with a wall of text, I decided it would be better to give “live” examples of using Fabricio on actually running containers. You will agree that it is better to see something once.
The master-slave configuration example for PostgreSQL runs on three virtual machines with Docker installed. Creating and provisioning these virtual machines is fully automated with Vagrant, so running the example should not be difficult.
Implemented scenarios
The most important scenario for a master-slave configuration is undoubtedly recovering the database after the master server fails. On production systems this is usually handled by continuously monitoring the existing servers and automatically switching over from, or excluding, failed hosts (failover). The popular pgpool-II tool, for example, can be used for this purpose. However, setting up such systems is not a trivial task (and not everyone is willing to rely completely on automation), so you can often find configurations that do without automatic recovery after a failure. If a failure does occur in such a system, it is usually resolved manually or semi-automatically, by restoring the database from a backup and/or pointing the application configs at the address of the new master.
Fabricio offers a semi-automatic way to recover from a master failure. It takes a single command, run after making sure that the failed server has been replaced with a new one or removed from the deployment configuration:
fab --parallel db
The scenario of adding a new slave server differs little from the one above: it is enough to add the server's address to Fabricio's list of hosts and re-run the deployment command. The initial state of the database will be copied from the current master server.
Idempotency test
[vagrant@192.168.1.85] Executing task 'update'
[vagrant@192.168.1.86] Executing task 'update'
[vagrant@192.168.1.87] Executing task 'update'
[vagrant@192.168.1.85] Found master: 192.168.1.85
[vagrant@192.168.1.85] download: <file obj> <- /data/postgresql.conf
[vagrant@192.168.1.85] /data/postgresql.conf not changed
[vagrant@192.168.1.85] download: <file obj> <- /data/pg_hba.conf
[vagrant@192.168.1.86] Waiting for master info (10 seconds)...
[vagrant@192.168.1.85] /data/pg_hba.conf not changed
[vagrant@192.168.1.85] run: docker inspect --type container postgres
[vagrant@192.168.1.87] Waiting for master info (10 seconds)...
[vagrant@192.168.1.85] run: docker inspect --type image postgres:9.6
[vagrant@192.168.1.85] run: docker start postgres
[vagrant@192.168.1.86] download: <file obj> <- /data/recovery.conf
[vagrant@192.168.1.87] download: <file obj> <- /data/recovery.conf
[vagrant@192.168.1.87] /data/recovery.conf not changed
[vagrant@192.168.1.86] /data/recovery.conf not changed
[vagrant@192.168.1.87] download: <file obj> <- /data/postgresql.conf
[vagrant@192.168.1.86] download: <file obj> <- /data/postgresql.conf
[vagrant@192.168.1.87] /data/postgresql.conf not changed
[vagrant@192.168.1.86] /data/postgresql.conf not changed
[vagrant@192.168.1.86] download: <file obj> <- /data/pg_hba.conf
[vagrant@192.168.1.87] download: <file obj> <- /data/pg_hba.conf
[vagrant@192.168.1.87] /data/pg_hba.conf not changed
[vagrant@192.168.1.86] /data/pg_hba.conf not changed
[vagrant@192.168.1.87] run: docker inspect --type container postgres
[vagrant@192.168.1.86] run: docker inspect --type container postgres
[vagrant@192.168.1.87] run: docker inspect --type image postgres:9.6
[vagrant@192.168.1.86] run: docker inspect --type image postgres:9.6
[vagrant@192.168.1.87] run: docker start postgres
[vagrant@192.168.1.86] run: docker start postgres
[vagrant@192.168.1.87] No changes detected, update skipped.
[vagrant@192.168.1.85] No changes detected, update skipped.
[vagrant@192.168.1.86] No changes detected, update skipped.
Done.
Disconnecting from vagrant@127.0.0.1:2222... done.
Disconnecting from vagrant@127.0.0.1:2200... done.
Disconnecting from vagrant@127.0.0.1:2201... done.
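The log above shows each config file being compared with its remote copy and the update being skipped when nothing differs. A minimal sketch of that idea follows. It is not Fabricio's actual code: the `sync_configs` function, its callbacks, and the config contents are hypothetical, and a plain dictionary stands in for the files on a remote host.

```python
from typing import Callable, Dict, Optional


def sync_configs(
    desired: Dict[str, str],
    read_remote: Callable[[str], Optional[str]],
    write_remote: Callable[[str, str], None],
) -> bool:
    """Upload only the files whose content differs; return True if any changed."""
    changed = False
    for path, content in desired.items():
        if read_remote(path) == content:
            # Same content already on the host: nothing to do for this file.
            print(f"{path} not changed")
            continue
        write_remote(path, content)
        print(f"{path} updated")
        changed = True
    return changed


# In-memory stand-in for the files on a remote host (hypothetical content).
remote_files: Dict[str, str] = {}
desired = {
    "/data/postgresql.conf": "listen_addresses = '*'\n",
    "/data/pg_hba.conf": "host replication all 0.0.0.0/0 md5\n",
}

first_run = sync_configs(desired, remote_files.get, remote_files.__setitem__)
second_run = sync_configs(desired, remote_files.get, remote_files.__setitem__)
print(first_run, second_run)  # True False
```

Running the same deployment twice changes the host only the first time, which is the property the log demonstrates.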
For testing purposes, Fabricio also supports deploying the configuration from scratch, choosing the master randomly from the list of available hosts.
Features of the implementation
Automatic master detection requires running Fabricio in parallel execution mode, which is disabled by default. This is why the --parallel option is used in the deployment command.
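One way to see why parallel execution matters: in the log, the slave hosts print "Waiting for master info" while the master host is still being processed. If hosts were handled strictly one after another, a slave processed first would wait for an announcement that could not arrive. The sketch below illustrates this with threads and a shared event. It is an illustration only: the class and function names are hypothetical, and the article does not show how Fabricio itself shares this state between parallel tasks.

```python
import threading
from typing import Dict, Optional


class MasterInfo:
    """Shared slot that the master fills in and the slaves wait on."""

    def __init__(self) -> None:
        self._ready = threading.Event()
        self.address: Optional[str] = None

    def announce(self, address: str) -> None:
        self.address = address
        self._ready.set()

    def wait(self, timeout: float) -> str:
        # Event.wait returns False if the timeout expires first.
        if not self._ready.wait(timeout):
            raise TimeoutError("master was not announced in time")
        assert self.address is not None
        return self.address


def update_host(address: str, is_master: bool, info: MasterInfo,
                results: Dict[str, str], timeout: float) -> None:
    if is_master:
        info.announce(address)
        results[address] = "master"
        return
    try:
        results[address] = f"slave of {info.wait(timeout)}"
    except TimeoutError:
        results[address] = "failed: no master announced"


def deploy(hosts: Dict[str, bool], timeout: float = 10.0) -> Dict[str, str]:
    """Run one task per host concurrently; hosts maps address -> is_master."""
    info = MasterInfo()
    results: Dict[str, str] = {}
    threads = [
        threading.Thread(target=update_host,
                         args=(address, is_master, info, results, timeout))
        for address, is_master in hosts.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(deploy({"192.168.1.85": True, "192.168.1.86": False, "192.168.1.87": False}))
```

The 10-second default mirrors the wait shown in the log; everything else is an assumption made for the sake of the example.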
If at least one of the slaves has its own non-empty data folder (determined by the presence of the PG_VERSION file), automatic selection (promotion) of a new master is not performed by default, and the script ends with a corresponding error. Although the procedure is quite safe, it is still recommended to familiarize yourself with the algorithm for selecting a new master before enabling this option. When it is enabled, the master is chosen randomly among the hosts that have at least some data, and the data on the “empty” hosts (those without their own database) is copied from the new master.
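The selection rules described in this section can be sketched as a small function. This is a hedged illustration, not Fabricio's implementation: `HostState`, `choose_master`, and the `allow_promotion` flag are hypothetical names, and treating "has data but no recovery.conf" as the sign of an existing master is an assumption based on the files that appear in the log.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HostState:
    has_data: bool           # PG_VERSION exists in the data folder
    has_recovery_conf: bool  # host is configured as a slave (assumption)


class MasterPromotionError(RuntimeError):
    """Raised when a new master is needed but promotion is not enabled."""


def choose_master(hosts: Dict[str, HostState],
                  allow_promotion: bool = False,
                  rng: Optional[random.Random] = None) -> str:
    rng = rng or random.Random()
    # 1. An existing master is kept as is.
    masters = sorted(a for a, s in hosts.items()
                     if s.has_data and not s.has_recovery_conf)
    if masters:
        return masters[0]
    with_data = sorted(a for a, s in hosts.items() if s.has_data)
    # 2. Deployment from scratch: any host may become the master.
    if not with_data:
        return rng.choice(sorted(hosts))
    # 3. Slaves with data exist: promote only when explicitly allowed.
    if not allow_promotion:
        raise MasterPromotionError("master not found and promotion is disabled")
    return rng.choice(with_data)


hosts = {
    "192.168.1.86": HostState(has_data=True, has_recovery_conf=True),
    "192.168.1.87": HostState(has_data=True, has_recovery_conf=True),
    "192.168.1.88": HostState(has_data=False, has_recovery_conf=False),
}
print(choose_master(hosts, allow_promotion=True))  # one of the two hosts with data
```

The empty host is never chosen here, which matches the rule that hosts without a database receive their data from the new master.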
Rollback to the previous state also does not work without an explicit instruction, because the master-slave configuration does not really provide for such a scenario: a master cannot be fixed automatically, and if an error has occurred, it is better to repair it manually than to rely on automation. If the rollback option is enabled anyway, the rollback logic inherited from the parent class (the single-server PostgreSQL configuration) is used: the previous configs and/or the previous container (or containers) are restored, depending on what was updated during the last successful deployment.
Future plans
In the very near future, most likely by the end of the year, support for Swarm mode in Docker 1.12 and higher will be implemented. This will make it possible to use Fabricio to deploy not only individual containers but entire services with automatic scaling and fault tolerance.
After implementing the solution for Swarm, the logical next step will be support for Kubernetes and/or Mesos. There is no separate task for this yet, though, and everything will depend on the complexity of the implementation.