Good day.
Some time ago, our company's server and website monitoring project moved from the “built for ourselves” category to one aimed at a mass audience. This was partly due to the project receiving seed investment from the IIDF and, of course, to our desire to share with the world the technology that we also use to monitor our own servers. This is not an advertising post but a practical one, so more about the project later.
Our service is a SaaS platform for collecting metrics from servers, and as the load grew, the number of write requests to our database (more than 1000 servers now each send about 20 metrics every 4 minutes) began to overload the database and make the service unstable. The failures were often caused by exceeding the configured maximum number of MySQL connections and by high server load. Unfortunately, none of our attempts to optimize MySQL, add server resources, or tune parameters such as max_connections and the query cache were successful.
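As a rough sanity check, the figures mentioned above (1000 servers, about 20 metrics each, one batch every 4 minutes) can be turned into an average write rate. This is only a back-of-the-envelope sketch: it assumes one write per metric and perfectly even spacing, and the function name is made up for illustration. Real traffic is bursty, so peaks will be higher than the average.

```python
def average_writes_per_second(servers: int, metrics_per_server: int,
                              interval_seconds: int) -> float:
    """Average write rate, assuming one write per metric (an assumption)."""
    return servers * metrics_per_server / interval_seconds


# Figures taken from the text: 1000 servers, ~20 metrics, every 4 minutes.
rate = average_writes_per_second(1000, 20, 4 * 60)
print(f"{rate:.1f} writes/s on average")  # 83.3 writes/s on average
```

The average itself is modest; the trouble described above came from connection limits and load spikes rather than from the raw number alone.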
Since we do not have a dedicated database administrator, and our programmers and system administrators cannot spend a lot of time every day keeping MySQL stable and reacting to every crash, we decided to switch to a MariaDB Galera cluster with master-master replication and HAProxy load balancing. We had no prior experience of running a database cluster in production, so we had to learn every lesson the hard way.
Fortunately, Habr had many useful articles on configuring Percona XtraDB, HAProxy and Zabbix for Percona, as well as the "Perfect Cluster" series of articles, which helped us a lot with the initial installation.

Unfortunately, whether because of our lack of experience with clusters or a lack of time (testing was carried out mostly at night, when the number of users in the system is minimal), we could write data to the cluster only through the main master node; writing in roundrobin or leastconn mode crashed the entire system. We spent a lot of time trying to fix this and were ready to give up on the cluster, having decided that the problem lay in our lack of cluster configuration skills.
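For readers unfamiliar with the two balancing modes mentioned above, the difference lies in how the next backend node is chosen. The sketch below is a deliberately simplified illustration in Python, not HAProxy's actual implementation (which also takes weights and health checks into account); the node names and connection counts are hypothetical.

```python
from itertools import cycle


def make_roundrobin(nodes):
    """Return a picker that hands out nodes in a fixed rotation."""
    rotation = cycle(nodes)
    return lambda: next(rotation)


def pick_leastconn(active_connections):
    """Return the node with the fewest active connections (ties: first listed)."""
    return min(active_connections, key=active_connections.get)


nodes = ["node1", "node2", "node3"]  # hypothetical node names
pick = make_roundrobin(nodes)
print([pick() for _ in range(5)])
# ['node1', 'node2', 'node3', 'node1', 'node2']

# Hypothetical connection counts per node.
print(pick_leastconn({"node1": 12, "node2": 3, "node3": 7}))  # node2
```

In both modes writes end up on every node of the cluster, which is exactly what did not work for us at first.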
However, further searching led us to ClusterControl, monitoring software from Severalnines for Galera, Percona and MariaDB clusters. Although the product has quite expensive paid service packages (from $1000 per year for each server), Severalnines offers both a free version and a 14-day trial of the full version.
I will say right away that we also spent several hours installing this solution before we learned about the convenient configuration generators on the ClusterControl website, which let you fully automate the installation of the cluster nodes, the monitoring system itself, and even HAProxy.
Below we present one of the configuration options that we used to prepare the MariaDB cluster.

After generating the configuration file, all you need to do is download it to the server from which the cluster will be managed, unpack the archive, and run deploy.sh (installation instructions are shown once the installation script has been generated).
When installing a cluster, it is important to consider the following:
- All servers in the cluster, including the monitoring server, must run the same operating system.
- Installing the cluster on CentOS 6 and 7 did not succeed for us and constantly required rework that took several hours (possibly due to shortcomings in the ClusterControl installation script). Installation on Debian 6 went smoothly the first time, so we recommend using Debian.
- You can always open a ticket with Severalnines technical support, who will help with the installation free of charge.
- A cluster must have at least 3 nodes (servers) plus a ClusterControl monitoring server to rule out split-brain situations.
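The three-node minimum follows from majority voting: after a network split, Galera keeps a group of nodes writable only if that group can still see more than half of the cluster. Here is a minimal sketch of that rule, assuming all nodes have equal weight; the function name is hypothetical and is not part of any Galera API.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if the reachable nodes form a strict majority of the cluster."""
    return reachable_nodes * 2 > cluster_size


# Two-node cluster, link between the nodes is lost: each side sees 1 of 2.
print(has_quorum(1, 2))  # False: neither side has a majority

# Three-node cluster, one node is cut off: the sides see 2 of 3 and 1 of 3.
print(has_quorum(2, 3))  # True: the pair keeps accepting writes
print(has_quorum(1, 3))  # False: the isolated node stops accepting writes
```

With two nodes a split leaves no side with a majority, while with three nodes exactly one side can stay writable, so the two halves cannot diverge.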
After installation, you can open the ClusterControl control panel at: your-server-ip-address/clustercontrol/
Next, go to “Manage >> Load Balancers” and install HAProxy on one of your nodes. We recommend doing this through the control panel rather than manually, so that you can be sure you have not forgotten to add anything to the cluster configuration files.
I note that thanks to this solution (we have no affiliation with Severalnines), we were able to ensure the uninterrupted operation of our service. In the near future we will load-test both the front end and the database, and we will certainly share the results.
In conclusion, we hope that this article will be useful to those who are facing the need to scale their application's database for the first time, and that it will save you some of the time and effort of installing a cluster. In the meantime, our team continues to work in the 6th IIDF intake, and in the near future we will also write an article about our experience of migrating our infrastructure to the Microsoft Azure cloud, of which we are now a partner.