11 "recipes for cooking" MySQL in Bitrix24

While designing, developing and launching our new big project, Bitrix24 , we not only wanted to make a really cool service for teamwork (also free for up to 12 users), but also to gather and accumulate experience in using cloud computing. web services, to “pump” their competence in the development of high-load fault-tolerant projects and - most importantly - to share this knowledge with both our partners and all web developers who are familiar with the topic of “highway traffic”. :)

Of course, in one article (and even not in one) it is impossible to describe a universal “recipe” that would fit absolutely for all projects: performance is more important for someone (sometimes even at the expense of reliability), for someone it is, on the contrary, fault tolerance above all, somewhere many small tables, somewhere - a large amount of data ...
')
We tried to describe those "raisins" that have often helped us in our work in solving various practical problems. We hope they will be useful for you. :)

Let's start in order.

We have repeatedly said and wrote that "Bitrix24" deployed in Amazon. And since we love and actively use various cloud services, the first question ...

1. Why do not we use RDS?

Amazon Relational Database Service (Amazon RDS) - a cloud database. There is support for MS SQL, Oracle and - what was interesting to us - MySQL.

We watched for a long time, wanted to use it in our project ... As a result, we abandoned this idea. A few key reasons:

Firstly, the system is not flexible enough and transparent. For example, you do not get full root access to the database. And this means that some specific (specifically for your project) settings may not be able to be done.
There is a risk of long downtime in case of any accident. And practice shows that this risk is quite real. Yes, more often those bases are located that are located in only one zone, and so-called Multi-AZ DB Instances can be used to ensure fault tolerance. But - about them further ...
When using Multi-AZ DB Instances for a database, a constant standby backup is constantly written to another AZ (Availability Zone), and in the event of an accident in the first zone, an automatic switch to the other occurs. But in this case, one instance with the base is exactly twice as expensive. At the same time, the user effectively uses the resources of only one machine (in contrast to the scheme with “master-master” replication, which we wrote about recently ).

All this does not mean that RDS is a bad service and should never be used. This is not true. He did not fit specifically for us. And, perhaps, it will be much easier for someone to provide scaling and fault tolerance by means of Amazon.

2. Master-Slave? No, Master-Master!

The standard replication scheme in MySQL “Master-Slave” has been successfully used on many projects for a long time and solves several tasks: load scaling (read only) - redistributing queries (SELECTs) into slaves, fault tolerance.

But decides - not completely.

1. I want to scale and write.
2. I want to have a reliable failover and continue to work automatically in case of any accidents (in the master-slave in the event of an accident on the master, you need to switch one of the slaves in manual or semi-automatic mode to the role of master).

To solve these problems, we use "master-master" replication. I will not repeat now, we have recently devoted a separate post on Habré to this technique.

3. MySQL? No, Percona Server!

The first few months (on prototypes, in the development process, at the beginning of the closed beta testing of the service), we worked on standard MySQL. And the longer they worked, the more they looked at the various forks. The most interesting, in our opinion, were Percona Server and MariaDB .

As a result, we chose Perkonu - of course, because of a similar “inverted” logo. ;)

... and a few features that turned out to be extremely important to us:

Percona Server is optimized to work with slow disks. And this is very important for the "cloud" - the disks there are almost always traditionally slower (unfortunately) than regular disks in the "iron" server. The organization of a software raid (we use RAID-10) solves the problem well, but optimization from the server software is only a plus.
Restarting the database (with a large amount of data) is an expensive long operation. In Perkone there are a number of features that allow you to do a restart much faster - Fast Shut-Down, Buffer Pool Pre-Load. The latter allows you to save the state of the buffer-pool during restart and due to this get a “warmed” base immediately after the start (and not spend long hours on it).
Many counters and advanced reports (by users, by tables, advanced output SHOW ENGINE INNODB STATUS, etc.), which is very helpful in finding, for example, the most "weighted" clients system.
XtraDB Storage Engine - backward compatible with standard InnoDB, while much better optimized for highload projects.

A complete list can be found on the website in the Percona Server Feature Comparison section .

The important point is that the transition from standard MySQL to Percona Server did not require changing any application code or logic at all.

And, behold, the process of "moving" was quite interesting. And thanks to the use of the scheme with "master-master" replication , it went completely unnoticed by our users. Downtime just was not.

The move was as follows:

“Zero” step - they raised a test stand of their machine snapshot with a base with real data. And it began to debug the configuration for stable operation. And, of course, select Perkone-specific options in the configuration file.
Having received a stable configuration, they switched all the traffic to one DC, and turned off the replication base without load.
On this base, they made the transition to Percona Server.
Included base in replication (standard MySQL and Percona Server are compatible).
They waited for data synchronization.
After that, we did the same procedure with second base.

4. MyISAM? InnoDB?

It's simple.

On long queries in MyISAM the whole table is locked. InnoDB - only those lines we work with. Accordingly, in InnoDB, long requests have less impact on other requests and do not affect user experience.
On large amounts of data and under high load, MyISAM tables break down. This is a well-known fact. In our service, this is unacceptable.
Well, since we have an “improved” InnoDB - XtraDB, of course, we will use this particular storage engine. :)

* * *

Moving from architectural to more practical issues. :)

5. Do all data need to be replicated? No, not all.

Almost any project has non-critical for loss or recoverable data. Including - in the database.

In our case, such data was the session. What was wrong with replicating everything?

Along with SELECTs, there are many write operations in the session tables (INSERT, UPDATE, DELETE). This means that we give an extra load on the slave base. The load that can be avoided.
In addition, with a sufficiently large value of query_cache_size, we are faced with the fact that active work with these tables and their participation in replication lead to the fact that many threads “hang” in the state “waiting for query cache lock” (seen in SHOW PROCESSLIST). Further, it is fraught with increased load on the CPU and overall performance degradation.

Excluding this data from replication completely solved the problem.

How to exclude? There are different ways.

1. At the application level, in the connection where we work with the tables that we want to exclude from replication, we perform:

SET sql_log_bin = 0;

2. A simpler and more understandable way is to specify an exception in the MySQL configuration file.

 replicate-wild-ignore-table = %.b_sec_session

Such a construction excludes from replication the b_sec_session table in all databases.

Everything is a little more complicated if you need more complex logic. For example, do not replicate table tables in all databases except the db database.

In this case, you will have to draw a bit of schematics like the ones MySQL gives you to make the right combination of filter options.

6. Type of replication.

There is a lot of controversy about whether to use STATEMENT-based or ROW-based replication. And the one and the other option have both pluses and minuses .

By default, MySQL (Percona) 5.5 uses STATEMENT-based replication.

On our application in this configuration, we regularly saw in the line logs: "Statement may be safe . "

In addition, a comparison of the two databases in the master replication master showed that data discrepancies might appear. This, of course, was unacceptable.

MySQL has an interesting solution that completely suits us - to use the binlog MIXED format :

 binlog-format = mixed

In this case, by default replication goes in STATEMENT mode and switches to ROW just in case of such unsafe operations.

7. Replication broke down. What to do?

Replication sometimes still breaks. Especially scary (at first :)) it sounds when working with a "master-master."

In fact, there is nothing terrible. True. :)

First of all, it should be remembered that the described master replication scheme is actually just two ordinary master slaves. Yes, with some nuances, but most of the practices used in the standard scheme work here.

The simplest (and most frequently occurring) problem is the error “1062 Error 'Duplicate entry'” .

The reasons may be different. For example, in the event of an accident with a base, we switch traffic to another DC. If the request has already been executed in DC 1, but did not manage to replicate in DC 2 and was executed there again - yes, we will receive exactly such an error.

It is treated by executing the following commands on the slave:

 STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; START SLAVE;

Thus, we miss the extra request. Next, look at the status of replication:

 SHOW SLAVE STATUS\G

If required, repeat the procedure.

* * *

Yes, we are now considering the simplest option in detail. Everything happens much worse - the file system crumbles, files are beating, etc.

There is no universal recipe for “how to fix everything”. But it is always important to remember the following:

Do not panic.
See slave status with SHOW SLAVE STATUS \ G.
See the master status with SHOW MASTER STATUS .
Always log ( log-error = /var/log/mysqld.log ) - there is a lot of useful information in it. For example, data on the position to which the slave read the binlog master. It helps with accidents.
If everything is completely broken - go up from the backup.

8. How to raise from a backup one of the servers in the master of replication?

What to do if something went wrong in the scheme with two masters (for example, during the accident in Amazon a few days ago we had irreversibly damaged file systems on several servers)?

The “head on” solution is to transfer data from one server to another and start replication from scratch — too long.

In Amazon, we use disk snapshot mechanisms and image creation (AMI) of entire machines . This allows you to quickly deploy a full copy of the desired server - for example, as of several hours ago.

If we just deploy the machine from the backup, we will get an interesting effect: we will start reading the data from the live wizard binlogs (since the backup was created), but read only half of them, since the default records from the server with the same server- id (from the “future” relative to the backup time) will not be replicated. This is done in order to avoid "looping" in the "master-master".

We act like this:

1. All traffic goes to the "live" DC. There is no load on the server that we restore.
2. On the server, raised from the backup, we immediately stop mysqld and enter it into the config:

 skip-slave-start replicate-same-server-id #log-slave-updates = 1 ; !

3. We start mysqld and we start replication.
4. After the data is synchronized, we return the config to its original state:

 #skip-slave-start #replicate-same-server-id log-slave-updates = 1

5. Since we have a “master-master”, we need to start replication in the opposite direction. Stop replication on the server that we restored, and execute:

 SHOW MASTER STATUS;

If replication is not stopped, the data will change.
6. We start from the right position replication on the first (live) server:

 STOP SLAVE; CHANGE MASTER TO MASTER_LOG_FILE='...', MASTER_LOG_POS = ...; START SLAVE;

Enter the data obtained in paragraph 5.
7. We start replication and on the second server.

9. Where is the balance between performance and replication reliability?

There are many options in MySQL / Percona that can dramatically increase the reliability and security of data in the case of, for example, sudden reboots. Practically all of them just as drastically reduce the speed of the system.

We found a balance for ourselves in this combination of options:

 sync_binlog = 1 sync_master_info = 0 sync_relay_log = 0 sync_relay_log_info = 0 innodb-flush-log-at-trx-commit = 2

Binlog is crucial for us, therefore sync_binlog = 1 . But at the same time binlogs are stored on a separate disk in the system, therefore writing to this disk does not reduce the performance of the system as a whole.

10. How to evaluate the performance of the system?

If we have big "heavy" requests, then, of course, we simply focus on the time of their execution.

More often (and in our case, exactly), the system handles many, many small requests.

Of course, you can use various synthetic tests to evaluate system performance. And they will give some assessment. But you want to have some real indicators (preferably in figures :)), which could be used “in battle”.

Percona Server has a great tool:

 SELECT * FROM INFORMATION_SCHEMA.QUERY_RESPONSE_TIME;

 +----------------+-------+----------------+ | time | count | total | +----------------+-------+----------------+ | 0.000001 | 0 | 0.000000 | | 0.000010 | 6555 | 0.024024 | | 0.000100 | 56132 | 2.326873 | | 0.001000 | 23165 | 6.686421 | | 0.010000 | 9755 | 39.737027 | | 0.100000 | 1437 | 40.831493 | | 1.000000 | 141 | 31.785571 | | 10.000000 | 9 | 17.891514 | | 100.000000 | 0 | 0.000000 | | 1000.000000 | 0 | 0.000000 | | 10000.000000 | 0 | 0.000000 | | 100000.000000 | 0 | 0.000000 | | 1000000.000000 | 0 | 0.000000 | | TOO LONG | 0 | TOO LONG | +----------------+-------+----------------+ 14 rows in set (0.00 sec)

Such a histogram of the distribution of query execution time very well helps to assess the overall state of the system.

For example, we have defined for ourselves a certain critical threshold - no more than 5% of requests (of the total number) with a runtime of more than 0.01 sec.

To keep track of this state in dynamics, we wrote a simple plugin for Munin, which just draws a graph for this ratio. Very convenient, and - most importantly - it is a lively understandable metric.

11. Memory balance.

MySQL settings must be such that memory consumption is balanced!

It seems to be a simple and understandable rule, but it is often forgotten. I repent myself a couple of times (at the beginning, on the prototype :)) received OOM (Out of memory) and - as a result - the process killed by the operating system mysqld.

Ideally, the mysqld process should work in such a way that it fits completely in RAM and does not operate with a swap.

Required - all processes of the system should be placed in memory + swap.

Often, counting how much memory mysqld can consume is not obvious to many.

The formula is approximately as follows:

Global buffers size: key_buffer_size + tmp_table_size + innodb_buffer_pool_size + innodb_additional_mem_pool_size + innodb_log_buffer_size + query_cache_size
Buffer size for one connection: read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size
Maximum possible memory usage: global buffers + connection buffers * maximum number of connections

If you do not really want to count :), you can use the mysqltuner.pl script, which in addition to this information will show a lot of other data on the system, security, performance, etc.

 # wget mysqltuner.pl # perl mysqltuner.pl

* * *

Thank you for reading this place! :)

We have considered only some of the practical issues and techniques that we use in the work of Bitrix24 . Including thanks to them, the service grows and develops.

We hope that our experience will help you in creating and developing your projects.

And now it is clear that the volume of one article is completely inadequate. In the near future we will try to continue the topic of using MySQL in large projects, share new recipes and describe the most interesting and popular topics in more detail.

Source: https://habr.com/ru/post/147098/

All Articles

11 "recipes for cooking" MySQL in Bitrix24

More articles: