“If you want your bugs to be fixed, write the most accurate test cases” - Sveta Smirnova about searching for MySQL bugs

The main character of our interview today rarely needs to be introduced. Sveta Smirnova , Percona Technical Support Engineer, is an expert in diagnosing problems and optimizing performance of MySQL, an active member of the Russian and international Open Source community, a regular speaker at major industry events, the author of one of the most relevant books on MySQL, “MySQL Troubleshooting”. At the upcoming summer day PG Day'17 Russia, Sveta will conduct an intensive training course on debugging MySQL performance and give a lecture on debugging replication .

On the eve of PG Day, we talked with Sveta about the intricacies of the reporting process and eliminating bugs in MySQL, the latest trends and trends in the world of popular forks, history of JSON support functionality, and prepared a selection of materials, blogs and books that are useful to all MySQL professionals.

Exclusively for PG Day, a special section of the training course will be dedicated to Performance Schema . With its help, you can debug stored procedures, track where MySQL server memory is used, view current variables separately for each connection, debug queries, diagnose locks and much more. Sveta will tell you how to configure the Performance Schema and choose the right tools for it for your tasks .
')

PG Day: Tell me a little about yourself. Who are you, what are you doing? How long have you been working in the profession, what qualifications do you have?

Sveta: I have been working as a MySQL technical support engineer for more than ten years. I started at MySQL AB, which Sun then bought, then Oracle. Now doing the same thing at Percona. Before becoming a MySQL technical support engineer, I worked as a web developer. In recent years, I have had projects related to the development of CRM systems for closed customers. These were private, invisible things from PHP and MySQL. As a web developer, I used different databases. When I had the opportunity, I chose MySQL.

It is often thought that technical support is when a person is sitting on the phone and answering stupid questions. In fact, in the case of MySQL it works a little differently . If we measure in levels, then we are L3 (I don’t know if there are levels higher - I don’t care about numbers). We do completely different things: we often look at the code , sometimes even write corrections . We monitor all sorts of situations - from the wrong behavior to performance problems and MySQL crashes. More specifically, I am working on the priorities of bugs . In Oracle, I worked closely with the MySQL Enterprise Backup Team . I continue to be interested in such things as backup solutions (out of old habit).

PG Day: Since we are talking about bugs, ask a provocative question. Very often, people complain that the bugs in MySQL do not rule for years. How true is this? How correctly to make the report on an error that in Oracle a bug corrected faster?

Sveta: I'm not at Oracle now. She made a special five-minute report on how to report problems so that they would finally be fixed. In the world of MySQL (in Percona, Oracle), the support group deals with bugs at the first stage. Why support, not a developer? Because when you post a bug, you need to make sure that it is reproducible, that this is really a mistake , and not some kind of expected behavior, just not documented. That this bug is relevant in the latest version, and you have provided enough information to reproduce it. It takes a lot of time. The better you get from the outset to provide a reproducible test case, the faster you answer these questions, the sooner your bug will either be in the Verified status in the MySQL Bugs Database, or become Confirmed in the Percona Bugs Database - and go to the next stage.

At the next stage, there are two ways. All bugs go through the so-called Triage process : each bug is assigned a priority in accordance with the plans of the developers. Critical errors fix quickly. If the developer realized that the error could create problems for a large number of users, it would be fixed in a short time. But not all bugs are like that. I have had cases in practice where, even despite the large number of customer complaints, the bug looked like a bullshit to the developer - they say there is a workaround. Nobody thought that the client has a thousand servers, and in order to use this workaround on each of them, you need to spend a lot of time and effort. It is not always clear.

For such cases, both in Percona and in Oracle there are committees on escalating bugs for customer support . If you have a support contract, you can write: “This bug affects me, I want to be fixed.” Percona has fewer such bugs. But the development team is smaller. This may be something decisive factor, in some - no. For example, there is a bug 989 . He was fixed for ten years. To fix it, I had to implement three worklog'a, completely rewrite part of the runtime engine . It cannot be called a feature request, but they made new features, implemented metadata locks to fix it. It also happens.

The general rule: if you want your bugs to be fixed, write as neat as possible test cases. In this case, from the moment you filed the bug, to the moment when it became confirmed / verified and got into development, the prioritization process went through, the time will be minimal.

PG Day: A question about various MySQL options. What are the current development trends, and in what direction is it moving? Can you comment on this?

Sveta: I am a conservative person. There are three main MySQL options. The first is upstream, which makes Oracle: Oracle MySQL Server . The second is Percona Server , which is 100% compatible with the previous one. You can take data from Percona Server data dir and use it in Oracle MySQL Server, and vice versa. There are still some additional features and optimizations when it works faster than Oracle MySQL Server. There are features such as verbose slow log , or audit log , which exists only in the enterprise version of Oracle MySQL Server. There is XtraBackup - by the way, it works with Oracle, Percona, and MariaDB. MariaDB - the third option. I don’t even know if he can be called a fork now, because they have gone their own way. For example, a separate team deals with the optimizer. You can write the same queries as in Oracle MySQL, but the optimizer will make plans differently, and they may differ.

As for replication , for example, their GTIDs are solved completely differently than in Oracle MySQL Server. They will not be able to read GTID from Oracle MySQL Server or from Percona Server. You cannot take the data dir MariaDB and use their GTID in Percona or MySQL servers. Strictly speaking, MariaDB will not give an error if it sees the GTID in another format: it simply ignores it. The reverse is not true. They go a little to the side, but still migrate to them, in principle, easy. But it is already more difficult to migrate back , because they have features that are not found in other servers, or the same features are implemented differently.

There were projects like Drizzle , such a minimalistic MySQL Server, of which only the core remained. But he is not developing. There are fork companies, for example, fork Facebook or fork Alibaba . They are used by these companies. Alibaba, for example, has physical replication . I still do not know whether they opened the source code or not, or still promise. But some things that are not in other servers, you can try. However, they are more focused on these companies.

There is a storage subsystem MyRocks , the development of Facebook. She already has both Percona Server and MariaDB. Percona Server has a trial build, is available in the testing area, technical support for MyRocks will be provided by Percona and MariaDB. There are some more forks like Amazon - they have their own Aurora . But their sources are closed. MySQL sources are licensed under the GPL, and as long as you do not want to distribute the product, you can use it. They use this moment in the GPL license. But in Amazon Aurora it’s impossible to see the source code, it’s quite difficult to find out things from outside.

PG Day: Proprietary engines are ideologically the development of some established engines like InnoDB, or are these radically new developments (MyRocks, for example)?

Sveta: MyRocks is not a proprietary engine, its source code is open, it has recently become a GPL. How did it start? InnoDB indexes are stored in B-tree . This is great for reading. But it is necessary to rewrite the "tree" every time. To improve this, LSM-tree was invented in RocksDB. In it, data is first written to memtable , then to write-ahead log . In general, there are two implementations of RocksDB: for MongoDB and for MySQL. MyRocks is a subsystem for MySQL, it is available in open-source for a long time, on Facebook's GitHub. It could be used with the Facebook tree. Now it is available in the repositories MariaDB, Percona. Features are being made on the Facebook side, but it will be supported by either MariaDB or Percona. By the way, we too will do features.

PG Day: Is InnoDB still the standard MySQL Server engine?

Sveta : Yes.

PG Day: Has it achieved stability, or is there something to improve?

Sveta: There are certainly moments to improve. But he is the most stable of these engines. As a support engineer, it's easiest for me to work with InnoDB. Perhaps due to the fact that I worked the longest with him, and not with new engines. Perhaps simply because the user base is larger. Suppose it is known that compression in InnoDB is worse than in the alternative engines TokuDB and MyRocks. In version 8.0, they promised to correct the situation when, under a certain load, read committed starts to produce worse performance than repeatable read, and noticeably worse. They promised to fix it in 8.0, but for now it is not available. Showed benchmarks, but the code is not yet available. Of course, there are still some things that can be improved in InnoDB.

PG Day: You mentioned that you as a support engineer often work for InnoDB. Are there any features of the new versions that you regularly use?

Sveta: My favorite feature I like to talk about is the Performance Schema . Unfortunately, there is such a thing: when you are a developer, you can install any version and say: “We need to upgrade.” And when you are a support engineer, different stories happen with old versions. Recently, there was a client with version 5.1, which has not been supported for ten years, and which has already reached the end of life.

I really want everyone to upgrade to 5.7 , because in 5.7 there is a new Performance Schema toolkit. In earlier versions, it was not clear where the memory is being spent — whether in replication storage threads, or in InnoDB, or anywhere else. This information is now available. Now you can get more information about locks . Absolutely new, which was not previously available - it is blocking metadata . There are tools for viewing status variables in the session. When the server does some work, you can see how many rows of data the engine writes, or how many temporary tables are created. But if you have a thousand connections, then you know that the whole thousand is just the sum. If it is interesting to see one of the most active compounds, then before 5.7 this was impossible, and in 5.7 it was possible. In 5.7, there are many Performance Schema tools. Now this is my favorite tool. I really want everyone to switch to the new version.

PG Day: Speaking of the transition. There are still people who have old versions. Is this banal laziness and irresponsibility, or are there any objective difficulties with the transition from the old versions to the newer ones?

Sveta: Difficulties with the transition are mainly such. If you have a company that should work 24 hours, 7 days a week, then the question arises: how to minimize downtime ? Second moment. When upgrading from major to major, there are incompatibilities . There are two types of incompatibilities: the data storage format and those associated with the optimizer building the query execution plans.

Incompatibilities of the first type do not apply to very serious cases. When upgrading from version 5.1 to something newer, you have to do dump / reload . It is very slow. While a simple upgrade from branch to branch is simple: take the old data dir, start the new version and start mysql_upgrade. In recent versions, Oracle is making this process happen with the least effort. But sometimes there are still problems. Either you have to rebuild the table, write ALTER TABLE , ENGINE = InnoDB , which is also, in general, not fast.

Another moment. Requests performed by the application are optimized for the version for which they were originally written. Interestingly, bugs are fixed, the optimizer works better, and some queries work slower after the update. I had a classic case. One client writes: “We were updated, and our request began to work slowly.” I say: "Send your log." He sends - and they have all requests written using F ORCE INDEX . I say: "Try removing FORCE INDEX." He removed and says: "Yes, now they are faster than the old version." All this will have to do.

PG Day: You are the author of the “MySQL Troubleshooting” book on debugging and solving MySQL problems. Who can she be useful for?

Sveta: I worked in MySQL support, worked with bugs. People asked the same questions. I thought that I needed to gather information in one place about what to do if there were problems with MySQL. I wanted to solve more complex problems, and not to answer the same questions. The idea of the book is as follows. You know how to use MySQL, write queries, start. But you have difficulties: incorrect data is returned, it works slowly. What to do? I wanted the book to answer this question.

PG Day: This book is for a wide range of people. Anyone who deals with MySQL should read it to stop asking stupid questions?

Sveta: Basically, yes. Not necessarily stupid, but just being able to figure out what to do. Unfortunately, the book was released in 2012, and there are no recent features in it. But the principle is still relevant - the book should be opened to understand what to do if you have some kind of problem with MySQL.

PG Day: Is the book being planned to be re-edited, the second edition is about problems with new features?

Sveta: I'm not doing this right now, but I am planning.

PG Day: By the way, about new features. As far as I know, you contributed to the issue of implementing JSON support in MySQL . Tell me about this experience. Now it is a very fashionable trend in the world of databases.

Sveta: Actually, this is all very interesting. JSON support - worklog, which appeared before I became a technical support engineer for MySQL (this was until 2006). It described what features MySQL should support. But someone had to move it, and did a project inside the support. I thought, why not tackle this issue? MySQL did not want to repeat the experience with XML functions . Functions were implemented, but did not work exactly as the community wanted. We decided that I would design prototype as UDF functions . This design prototype was available as a MySQL Labs project. People used, sent bugs. Subsequently, the Optimizer Team should, based on design prototype, implement functions at the server level so that they were optimized and run quickly.

In 2012, I made the first version, she went to the internal review. Then they told me that everything had to be redone. I redid it and in 2013 finally showed it to the public. At this time, requests for new features came: functions json_depth and json_length (at the request of the community) appeared. It was not some kind of my invention or company. I was busy fixing bugs, filling the new functionality. The design was "polished". Finally, the design prototype of the SQL standard for JSON appeared. I don’t know if it is approved now or not - it stopped following this. The syntax has been implemented.

UDFs initially had a somewhat weird syntax. But I didn’t work specifically on syntax, because then it wasn’t clear what would happen to the standard. If MySQL was going to inject this into the server code, then I wanted it to turn out not to be very specific for MySQL - so that the syntax was just enough to use. When everything is tested, the optimizer team made the data type. It is not pluggable . I could not do it without their help.

PG Day: JSON development continues, are features added?

Sveta: Yes. In version 8.0.2, which has not yet been released, they want to make it so that for the JSON_SET and JSON_REPLACE functions there is an in-place update. What is going on with JSON right now? As a rule, JSON is a large, long document (say, 1 kilobyte). Suppose you want to change some element - for example, the name of the book. Now we have to completely rewrite this 1 kilobyte. It is not very effective. Rewrite 1 kilobyte - not scary. But if you have to rewrite many such lines, it becomes inefficient. In 8.0.2 they announced that they had made an in-place update - that only a small part of the JSON document would be updated. This suggests that JSON is evolving, and there will be something interesting.

PG Day: Thank you so much for such a detailed and thorough story. I want to ask you to give a small announcement of the master class, which you will give in the summer on PG Day. It will focus on debugging MySQL performance. The master class is scheduled large and surround: eight hours. What questions are you going to pay attention to, what profile of specialists will it be useful for?

Sveta: It will be useful to those who have a problem with the performance of MySQL , and who want to do something to speed up MySQL. At the master class, starting with the basics, I will tell you what you need to know before debugging, how to figure out what works slowly. What is a test server , how to use it to debug performance problems. There will be four main parts: slow queries and their optimization ; blocking . , MySQL.

-, , DBA . - . , , - . – , , . , MySQL Command-Line , Linux. , , .

? - : SQL- , . . .

PG Day: . , , , , . , ?

: , MySQL . — « MySQL », Oracle MySQL. . — Percona Database Performance Blog . Percona. , Percona . Oracle, Percona Blog, .

, “MySQL Troubleshooting” . “High Performance MySQL” , . , — “MySQL High Availabilityˮ . — MySQL replication team : , Mats Kindahl, Lars Thalmann. Mats MySQL. MySQL utilities . — , Replication Team. , : , «», . . , MySQL Fabric .

“SQL Antipatternsˮ , — Bill Karwin , , MySQL. . , , .

Source: https://habr.com/ru/post/330520/

All Articles

“If you want your bugs to be fixed, write the most accurate test cases” - Sveta Smirnova about searching for MySQL bugs

More articles: