We compiled the MySQL fork from Facebook with the RocksDB engine instead of InnoDB and tested it with real applications: Drupal, Wordpress, Redmine.
This is an awesome thing. With a low load gain is small, tens of percent. But with a high load gain at times. When RocksDB is added to a stable release in MariaDB, I’m sure that within half a year, half the people will switch from InnoDB to RocksDB. Especially small sites on cloud / VPS and dedicated servers.
What is so good about MyRocks? Line recording instead of random and reducing the number of disk operations in general. That is, database transactions generate fewer disk operations, take up less disk space, and are written much faster.
I gathered in the article the results of testing real-world Redmine scenarios, added analysis of the results and conclusions. Redmine on MySQL with RocksDB was faster than with InnoDB - from 20% with a minimum load up to 3 times at maximum. Later I will prepare materials on Drupal and other PHP applications.
You can check the work of MyRocks and yourself - at the end of the article there are links to installers and virtual machines with LAMP / LEMP / Ruby stacks, collected from MyRocks instead of MySQL.
Facebook, based on its storage, RocksDB has created a storage engine for MySQL - MyRocks . The working implementation of MyRocks in the form of MySQL 5.6 with patches from Facebook is transferred to open source and is hosted on GitHub - https://github.com/facebook/mysql-5.6 . MyRocks is an alternative to InnoDB and offers several advantages:
All this in total significantly increases the speed of transactions on the HDD and reduces the wear and tear of the SSD, as well as accelerates replication.
MariaDB and Percona are already working on integrating MyRocks into their MySQL fork : Facebook MyRocks at MariaDB , Announcing MyRocks in Percona Server for MySQL . MariaDB has announced that MyRocks will be available in release candidate 10.2 this winter. On Jetware , the original MyRocks implementation from Facebook based on MySQL 5.6 has been added to the MySQL alternatives.
Performance testing on synthetic tests shows impressive results. Depending on the type of storage device, the gain in speed ranges from 20% to 10x. LinkBench test results can be viewed in the Yoshinori Matsunobu MyRocks publication: A space- and write-optimized MySQL database and on the Mark Callaghan blog , for example, MyRocks: IO for reads . These tests are largely focused on large volumes of databases (tens and hundreds of gigabytes) and powerful machines.
In addition to synthetic tests and tests on large amounts of data, we decided to test and evaluate the performance gains for typical web applications and small sites.
First we test Redmine. We know how it works, we actively use it in development, and therefore testing has also practical value for us - if the result is good, we turn to MyRocks.
We use Redmine 3.3.1 with Ruby 2.3.1, in the default configuration, without additional plug-ins.
As database servers we use:
All binaries are compiled by one GCC 4.9.3 compiler, with recommended build and optimization options.
The operating system is Ubuntu 14.04 x86_64, Linux kernel 3.13.0. The file system is ext4.
Before executing the tests, the database is filled with pre-generated projects, users and tasks. There are three such options too:
Most real-world uses of Redmine come in sizes between Small and Big. Volumes of the Giant level are much less common.
Space occupied
Columns show the amount of used space for different databases and different data sets (less is better)
We imitate the work of Redmine in conditions close to working with a cloud provider or on an office server. To do this, we do not allocate the entire physical server, but place it in a virtual machine with a much smaller amount of resources and imitate a different load on the disk system from neighbors. As a virtualization platform, Xen 4.6 is used, in the dom0 - Linux kernel 3.16.7. The storage device is broken using LVM, normal linear, without thin provision and snapshots. The volume is located in the middle of the HDD.
Three virtual machine configurations were used:
We check the speed of work of the most frequently used operations in Redmine - creating tasks, adding comments to tasks, changing the status and person in charge of a task. From these operations, we created two tests, two task work cycles:
Task creation
Creating a task in a project by one project participant, assigning it to another project participant and adding 10 comments to it on behalf of other project participants.
Handling 10 tasks
The user receives 10 tasks assigned to him, converting them all to In Progress, and then sequentially converting the task to Resolved and assigning it to the creator of the task.
Tests are conducted on 1, 2 and 4 parallel Redmine processes.
The load is created using the fio
utility, which reads and writes 50/50 random blocks to the rest of the disk. We simulated several levels of disk load, which are typical for typical cases of virtual machines - from providers of public clouds and VPS, or when running multiple virtual machines running VMWare, Hyper-V, KVM or XenServer on our own server.
To simulate incomplete booting, we run fio with IOPS restriction using the --rate_iops
key and measure disk utilization. At 100% single-threaded load, this is about 80 IOPS. The disposal of 25% is generated by a load of 14 IOPS. A large load is simulated by increasing the number of threads with the --iodepth
.
Depending on the number of neighboring virtual machines, the nature of their work and the peaks of the load, the disk load may be significantly different from cloud providers and VPS, as well as on their own server. Therefore, we conducted testing in the absence of third-party load, with an insignificant single-threaded load (14 IOPS, 25%), and with full third-party loads in 1, 2, and 4 threads.
We measure the total execution time of each Redmine operation on a large number of operations and compare the average execution time. The first 10% of the results are ignored - we heat the system on them. The last 10% of the results are ignored in order to eliminate tail distortion due to different completion times of parallel processes.
Measurements are carried out in different combinations of conditions:
Runtime is measured for all three databases - MyRocks MySQL, MySQL and MariaDB. We also calculate the speed difference of MyRocks MySQL in relation to MySQL and MariaDB. The collected data is presented in the form of graphs.
Timelines for operations
1) creating a task; 2) processing 10 tasks
The columns show the time of the operation (less is better). The lines show how many times the MySQL or MariaDB server was slower than the MyRocks MySQL server.
The minimum load is one Redmine process and there is no third-party load. Maximum load - 4 Redmine processes and 4 full threads of third-party disk load.
We see that the task creation time for MyRocks with increasing load to the maximum changes slightly, and increases from 0.018 sec to 0.023 sec, by 23%. For MySQL and MariaDB, the minimum task creation time is 0.022 seconds and increases tenfold to 0.23 seconds at maximum load. At minimum load, MySQL and MariaDB are 24% slower than MyRocks; at maximum load, they are 9.5 times slower.
Task processing time for MyRocks rises from 0.245 sec with a minimum load to 0.327 sec with a maximum of 33%. For MySQL and MariaDB, the minimum task processing time increases by about 7 times - from 0.283 sec with a minimum load up to 2.245 sec with a maximum.
The amount of RAM is not enough to effectively read caching, and this greatly affects the speed of InnoDB.
Timelines for operations
1) creating a task; 2) processing 10 tasks
The columns show the time of the operation (less is better). The lines show how many times the MySQL or MariaDB server was slower than the MyRocks MySQL server.
The minimum load is one Redmine process and there is no third-party load. Maximum load - 4 Redmine processes and 4 full threads of third-party disk load.
In this configuration, virtual machine resources are better matched to the amount of data and load. For MyRocks, the task creation time remains the same - from 0.018 sec to 0.023 sec, growing by 23%. For MySQL and MariaDB, the minimum time becomes slightly longer - 0.023 sec and grows only twice - to 0.056 sec at maximum load. They are slower than MyRocks, by 30% at minimum load, and 2.3 times at maximum.
For handling tasks the situation is similar. The runtime for MyRocks increases slightly from 0.248 sec to 0.331 sec with increasing load. For MySQL and MariaDB, the minimum time is already 10% longer than for the Small dataset and is 0.296 seconds. At maximum load time is almost doubled - to 0.595 seconds. MySQL and MariaDB are slower than MyRocks, by 18% at minimum load and 80% at maximum.
Timelines for operations
1) creating a task; 2) processing 10 tasks
The columns show the time of the operation (less is better). The lines show how many times the MySQL or MariaDB server was slower than the MyRocks MySQL server.
The minimum load is one Redmine process and there is no third-party load. Maximum load - 4 Redmine processes and 4 full threads of third-party disk load.
A tenfold increase in data volume slightly increased the task creation time for all databases: 0.020 seconds for MyRocks, 0.026-0.029 for MySQL and MariaDB. Increasing the load slows MyRocks by 35% to 0.027 sec. For MySQL and MariaDB, the load increase affects the speed more - at maximum load time increases by 3 times - up to 0.088 sec, and they turn out to be slower than MyRocks, by 3.2 times.
When processing tasks, the execution time of MyRocks is increased by 32%, from 0.255 to 0.33 seconds. MySQL and MariaDB time increases 4 times - from 0.309 to 1.242 seconds. And they lag behind MyRocks by 3.8 times.
The volume of data has already grown to such a size that delays in random writing begin to affect the updating of InnoDB indexes and the difference in speed between RocksDB and InnoDB at maximum loads has increased again.
For the operation of Redmine, the volume of 1 Gb is the minimum recommended. For efficient data caching in page cache, the memory size is already insufficient, so the speed is very sensitive to the load on the disk. Delays already occur on SELECT queries, since they have to read data from a disk for them. RocksDB's smaller amount of data storage has led to more efficient read caching than for InnoDB. Therefore, even with a large load, the speed of operations at MyRocks changed only slightly.
When increasing the memory to 2 Gb, the main data used already fit into the page cache and the database server no longer needs to constantly read it from the disk. In this case, the disk is a narrow neck only with changes in the base. Transactions are written to the disk without a writeback cache, and the intensive disk load increases the time to wait for the write to complete.
The organization of data storage in RocksDB, contributing to linear recording, and the reduced amount of recorded data, reduce the number of write operations. Therefore, we observe that even with a high disk load, the transaction speed in RocksDB only slightly decreases and significantly exceeds the speed when using InnoDB.
Based on the principle of operation of RocksDB, we expected acceleration in the execution of transactions. On synthetic performance tests, developers received a 10-fold increase in speed in the work of the DBMS. For applications such as Redmine, the execution time of the operation consists of the execution time of the Ruby script and the execution time of the query in the database. Of course, replacing the storage engine with a RocksDB will not increase the speed of Ruby, and this component remains unchanged. But even with this in mind, the increase in speed due to the acceleration of the database turned out to be impressive.
Here we present the marginal test results for the 2 Gb virtual machine and the Big dataset, and for the 8 Gb virtual machine and the Giant dataset. We do not consider high load testing for a 1 Gb virtual machine here, since this is a case of extreme resource shortages.
Timelines for operations
1) creating a task; 2) processing 10 tasks
Columns show the time to complete the operation (less is better)
Minimum (1 Redmine process without third-party disk load) and maximum load (4 Redmine processes and 4 full third-party disk load)
With a low load, Redmine on MyRocks turned out to be 15% -25% faster than MySQL and MariaDB. The size of the stored data has little effect on this speed for both RocksDB and InnoDB - an increase in the number of Redmine tasks increased the execution time by about 10% by 10 times.
With a high load (increasing the number of parallel processes and increasing third-party disk load) behavior changes completely. The gap MyRocks has become more - from 2-fold to almost 4-fold . The size of the stored data also began to significantly affect the speed - a 10-fold increase in the number of Redmine tasks noticeably (1.5–2 times) slowed down the execution speed on servers with InnoDB, and slowed down less noticeably on RocksDB (0-15%).
The simultaneous increase in data volume and high load slowed down the work of Redmine with MyRocks by 1.5 times, while Redmine on MySQL and MariaDB became 4 times slower.
During testing, we discovered a nuance of the behavior of one of the SQL-queries of Redmine when searching for considering parent issues. Because of him, some kinds of search turned out to be slower in MyRocks. But this small omission from Redmine - parent_id
did not have an index in the table. We also encountered a minor bug that caused the CPU to consume after some conflicting transactions in MyRocks.
We did not face other problems. According to the developers, Facebook has been using MyRocks in production for a long time.
You can use MyRocks now or wait for more extensive testing after the appearance of MyRocks in the MariaDB release candidate 10.2 or in Percona Server for MySQL. The MyRocks package is available in the Jetware repository as one of the alternatives to mysqld
in stack constructors , for example, PHP LAMP / LEMP), Ruby RAMP / REMP, or applications, for example, Redmine.
A few weeks ago, we transferred our internal Redmine-server to MyRocks and are working successfully there.
In this test, we checked the performance of MyRocks for the Redmine application. In the following tests we are going to check the performance of MyRocks with PHP applications. Most likely, the first will be Drupal.
Source: https://habr.com/ru/post/319500/
All Articles