
I recently attended the wonderful Percona Live 2016 conference in Santa Clara. I could write many words of praise for the organizers: the Wi-Fi worked well, the food was good, the schedule was followed precisely, and the halls were well prepared. But this is an article for a technical site, not a travel one, so I will just tell you about the most interesting talks I attended.
Surprisingly for such a narrowly focused conference, the range of talks was not limited to MySQL alone, as one might expect, but covered tools for working with data in general. There was room for Hadoop and its ecosystem, for columnar databases, and for the cloud (where would we be without it these days).
Deploy GTID Replication
Starting with version 5.6, MySQL supports a wonderful thing called GTID replication. The manual devotes a lot of text to how this replication works, but says practically nothing about why it is needed.
Imagine that someone needs to set up cascading replication. This may be required if part of the replica cascade is located in another data center. To save traffic between data centers, only one replica pulls a copy of the data from the master, and the local slaves are updated from that replica's logs. On the whole, this is a perfectly workable scheme. But it has one small problem.
You cannot simply switch a slave to a different master. For example, if one of the nodes fails, every node in its branch has to be initialized again, i.e. you have to deploy a copy of the new master and start replication from it. That is expensive and slow.
To avoid this, a new replication log format was introduced in which every transaction carries a global identifier, so a slave can continue replicating from other slaves. In other words, the replication log written on a slave fully duplicates the log from the master.
You can learn more about how this mechanism works, and about how to enable it on a large project (remember, the speakers are from Dropbox), directly from the talk.
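To make the idea more concrete, here is a minimal sketch in Python. It is an illustration only, not how MySQL or Dropbox implement it. It assumes a GTID is a pair of the originating server's UUID and a transaction sequence number, and shows how a slave could work out what it is missing from any server that holds the same log. The names `GtidSet` and `missing_transactions` and the UUID value are hypothetical.

```python
from typing import Dict, Set

# A GTID set, modelled as {originating server UUID: applied sequence numbers}.
GtidSet = Dict[str, Set[int]]


def missing_transactions(executed_on_slave: GtidSet, available_on_source: GtidSet) -> GtidSet:
    """Return the transactions the source holds that the slave has not applied."""
    missing: GtidSet = {}
    for uuid, seqs in available_on_source.items():
        diff = seqs - executed_on_slave.get(uuid, set())
        if diff:
            missing[uuid] = diff
    return missing


# Hypothetical example: every transaction originated on one master.
master_uuid = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
intermediate = {master_uuid: set(range(1, 101))}  # has applied transactions 1..100
leaf = {master_uuid: set(range(1, 91))}           # has applied transactions 1..90

# The leaf can catch up from ANY server that has transactions 91..100,
# because the identifiers mean the same thing on every node.
to_fetch = missing_transactions(leaf, intermediate)
print(sorted(to_fetch[master_uuid]))
```

Because the identifiers are global, the comparison does not depend on which server the slave asks, which is what makes re-pointing a slave cheap.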
Rolling out Global Transaction IDs at Dropbox
Maslow's Pyramid for DB
In a very satirical yet practical way, Charity Majors described what actually makes up the pyramid of needs and survival when choosing a database for a project. Each point is backed by an excellent illustration, such as:
Maslow's Hierarchy of Needs for Databases
MySQL Partitioning
MySQL has supported partitioned tables for many, many years. I hope there is no need to explain what they are for. However, because of some implementation quirks, developers often avoid this mechanism. As the saying goes, if you fear the axe, you will never chop any wood. Rick James proposes to figure out how this tool can be used for its intended purpose and under what restrictions partitioning works well.
Here are some examples of tasks that can benefit from partitioning:
- Sliding time window;
- 2D index;
- Import/export.
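As an illustration of the first item, here is a rough Python sketch that builds the two MySQL statements for sliding a daily window forward by one day: drop the oldest partition and split a new one off the catch-all. It is a sketch under assumptions, not code from the talk: the table name `events`, the `pYYYYMMDD` naming scheme, the catch-all partition called `future`, and RANGE partitioning on `TO_DAYS()` of a date column are all hypothetical choices.

```python
from datetime import date, timedelta
from typing import List


def sliding_window_ddl(table: str, oldest: date, newest: date) -> List[str]:
    """Build the statements that slide a daily partition window by one day.

    Assumes RANGE partitioning on TO_DAYS(<date column>), one partition per
    day named pYYYYMMDD, and a final catch-all partition named `future`.
    """
    next_day = newest + timedelta(days=1)
    upper_bound = next_day + timedelta(days=1)  # exclusive bound of the new partition
    drop_old = f"ALTER TABLE {table} DROP PARTITION p{oldest:%Y%m%d}"
    add_new = (
        f"ALTER TABLE {table} REORGANIZE PARTITION future INTO ("
        f"PARTITION p{next_day:%Y%m%d} "
        f"VALUES LESS THAN (TO_DAYS('{upper_bound.isoformat()}')), "
        f"PARTITION future VALUES LESS THAN MAXVALUE)"
    )
    return [drop_old, add_new]


# Hypothetical window covering April 2016, moved forward by one day.
statements = sliding_window_ddl("events", date(2016, 4, 1), date(2016, 4, 30))
for statement in statements:
    print(statement + ";")
```

Dropping a whole partition is what makes the pattern attractive compared with a large `DELETE` of old rows.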
PARTITIONing - How-To vs. Don't-bother
Replication of user shards on Facebook
Daren Seagrave from Facebook revealed some details of how their system of user shards is organized. He talked about how shards are moved between servers and data centers, and about the path the team took to get uniform and efficient use of the server pool. The most unusual decision, in my opinion, was that they first determine where data should be moved, and only then what should be moved. Although the shards themselves are technically MySQL, almost the entire talk applies to any database.
Everyday We're Shuffling - Online Shard Migration at Facebook
Facebook Database Backup
Shlomo Priymak and Dan Reif from Facebook talked about how they organized a system for storing backups of user shards. Because all shards are small, backing up a single shard is fast. As you know, a backup cannot be considered a backup until we are sure that we can restore from it. That is why Facebook built a system that constantly checks that the backups taken can actually be used to deploy a database.
The second technical feature of the system was an interesting idea for organizing incremental backups. Honestly, I had never even heard of anyone taking incremental backups of a database. The idea behind the implementation turned out to be fantastically simple.
They also described how they implemented and organized storage in Hadoop, and how they moved away from computing this "diff" in Hadoop. Overall, I consider this talk the most useful and interesting one 8).
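The post does not spell out the mechanism, so the following Python sketch is only an assumption about what such a scheme could look like: the increment is a diff between two consecutive logical dumps of a shard, and a backup counts as good only if restoring from it reproduces the expected dump. The function names and the sample dump lines are hypothetical, and nothing here is Facebook's actual code.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# One change: replace previous[start:end] with new_lines.
Change = Tuple[int, int, List[str]]


def make_increment(previous: List[str], current: List[str]) -> List[Change]:
    """Store only the parts of the new dump that differ from the previous one."""
    matcher = SequenceMatcher(a=previous, b=current, autojunk=False)
    return [
        (i1, i2, current[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_increment(previous: List[str], changes: List[Change]) -> List[str]:
    """Rebuild the newer dump from the older dump plus the stored changes."""
    restored = list(previous)
    # Apply from the end so that earlier indexes stay valid.
    for start, end, new_lines in reversed(changes):
        restored[start:end] = new_lines
    return restored


# Hypothetical dumps of one small shard on two consecutive days.
monday = ["INSERT (1, 'alice')", "INSERT (2, 'bob')", "INSERT (3, 'carol')"]
tuesday = ["INSERT (1, 'alice')", "INSERT (3, 'caroline')", "INSERT (4, 'dave')"]

increment = make_increment(monday, tuesday)
# The verification step: a backup is only a backup if the restore matches.
restored = apply_increment(monday, increment)
print(restored == tuesday)
```

The final comparison mirrors the point made above: the increment is trusted only after a restore from it has been checked.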
Massively Distributed Backup at Facebook Scale
Linux performance
Netflix's Brendan Gregg gave an excellent whirlwind overview of the subsystems Linux is made of, and of the utilities you can use to get information about each subsystem in order to understand whether it is the bottleneck.
He also presented his list of commands for collecting the necessary information about the state of a server in 60 seconds. I believe every devops engineer will find something new and useful in this talk.
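For reference, here is a small Python sketch that prints such a checklist. The ten commands are the ones from Brendan Gregg's published write-up "Linux Performance Analysis in 60,000 Milliseconds"; the list shown in the talk itself may differ, and the one-line descriptions are my own summaries. The names `FIRST_60_SECONDS` and `checklist` are hypothetical.

```python
# Commands from the published 60-second checklist, with short summaries.
FIRST_60_SECONDS = [
    ("uptime", "load averages"),
    ("dmesg | tail", "recent kernel messages and errors"),
    ("vmstat 1", "run queue, memory, swap and CPU, once per second"),
    ("mpstat -P ALL 1", "CPU time broken down per CPU"),
    ("pidstat 1", "CPU usage per process"),
    ("iostat -xz 1", "block device load and latency"),
    ("free -m", "memory usage in megabytes"),
    ("sar -n DEV 1", "network interface throughput"),
    ("sar -n TCP,ETCP 1", "TCP connection rates and retransmits"),
    ("top", "overall check of processes and load"),
]


def checklist() -> str:
    """Format the commands as a numbered list, one per line."""
    return "\n".join(
        f"{number:2d}. {command:<20} # {purpose}"
        for number, (command, purpose) in enumerate(FIRST_60_SECONDS, start=1)
    )


print(checklist())
```

The script only prints the commands. Running them is left to the reader, since several need the `sysstat` package and some run until interrupted.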
Linux Systems Performance
Retrospective BI Development in Badoo
And of course, I cannot keep silent about Badoo's own talk at Percona Live 2016. In it we described how our business intelligence system evolved: what difficulties we faced and how we solved them, which technologies we used and what data volumes we worked with, and how we chose a database for analytics.
At the end of the talk, we said that the most pressing problem for us is data complexity (hundreds of tables) and explained how we are going to solve it.
BI at Badoo - historical retrospective
Fixing MySQL Bug #2: now MySQL makes toast!
The incredible happened! After 14 years, bug number 2 was finally fixed. Right before our eyes, MySQL made toast!
Summary
By and large, this whole post is one big thank you to the organizers for the conference program. I also want to mention the various little things that let us stay focused on the conference itself: food almost without queues, excellent Wi-Fi, water in the coolers, free seats and places to charge gadgets in the session rooms, a fun quest to collect stamps in the exhibition area, and, of course, the Percona Game Night.
Alexey Eremikhin (@alexxz), BI Development Team Leader