
Clustered index in InnoDB and query optimization

Clustered indexes in InnoDB and MySQL tables are written about quite often these days, yet this knowledge is rarely applied in practice.
In this article we will show two real-world examples of how we optimized fairly complex Badoo systems based on an understanding of how the clustered index works.

A clustered index is the way a table is organized in its data file. In InnoDB, data is stored in a tree, the same kind of B-TREE in which ordinary keys live; in fact, an InnoDB table is itself one big B-TREE, and the clustered index values serve as its keys. According to the documentation, the PRIMARY KEY is chosen as the clustered index. If there is no PRIMARY KEY, the first UNIQUE key is chosen. If there is no suitable UNIQUE key either, an internal 6-byte identifier is used.
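As a minimal illustration (the table and column names below are made up), here are the three cases:

  -- A PRIMARY KEY is defined: it becomes the clustered index.
  CREATE TABLE t1 (
      id INT NOT NULL AUTO_INCREMENT,
      payload VARCHAR(255),
      PRIMARY KEY (id)
  ) ENGINE = InnoDB;

  -- No PRIMARY KEY: the first UNIQUE key on NOT NULL columns is used as the clustered index.
  CREATE TABLE t2 (
      code INT NOT NULL,
      payload VARCHAR(255),
      UNIQUE KEY (code)
  ) ENGINE = InnoDB;

  -- Neither: InnoDB generates a hidden 6-byte row id and clusters the data by it.
  CREATE TABLE t3 (
      code INT,
      payload VARCHAR(255)
  ) ENGINE = InnoDB;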

What follows from this organization of data on disk?
  1. Inserting into the middle of the table can be slow because the key has to be rebuilt.
  2. Updating the clustered index value of a row forces the data to be physically moved on disk or leads to its fragmentation.
  3. For fast inserts, the clustered index value should be ever-increasing; an auto-increment field is optimal.
  4. Every row has a unique identifier: its clustered index value.
  5. Secondary keys simply refer to those unique identifiers.
  6. In effect, a secondary key of the form KEY `key` (a, b, c) has the structure KEY `key` (a, b, c, clustered_index).
  7. The data on disk is ordered by the clustered index (we leave SSDs aside for now).
Read more about this in the MySQL manual.
We will describe two types of optimization that helped significantly speed up our scripts.

Test environment


To reduce the impact of caching on the results, we add SQL_NO_CACHE to the queries and drop the file system cache before each query. And since we are interested in the worst case, when the data actually has to be pulled from disk, we restart MySQL before each query as well.
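For example, the test queries were run roughly like this (just a sketch; the actual benchmark scripts are linked below):

  SELECT SQL_NO_CACHE * FROM messages WHERE user1 = 1 AND user2 = 2 ORDER BY ts LIMIT 20 OFFSET 10000;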

Equipment:

The scripts we used can be found on GitHub.

Optimization of deep offset


For example, take the abstract table of messages, which contains user correspondence.

  CREATE TABLE messages ( 
             message_id int not null auto_increment, 
             user1 int not null, 
             user2 int not null, 
             ts timestamp not null default current_timestamp, 
             body longtext not null, 
             PRIMARY KEY (message_id), 
             KEY (user1, user2, ts) 
         ) ENGINE = InnoDB 

Consider this table in light of the listed features of InnoDB.
The clustered index here coincides with the PRIMARY KEY and is an auto-increment field. Every row has a 4-byte identifier. Inserting new rows into the table is optimal. The secondary key is effectively KEY (user1, user2, ts, message_id), and that is what we will use.

Let's add 100 million messages to the table; that is quite enough to expose the InnoDB behavior we are interested in. There are only 10 users in our test system, so each pair of interlocutors ends up with about a million messages on average.
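A minimal sketch of one way to generate that volume of data (this is not the script from our repository, just an illustration: after seeding a few rows, each run of this statement doubles the table):

  INSERT INTO messages (user1, user2, ts, body)
  SELECT
      FLOOR(1 + RAND() * 10),                    -- random user ids in 1..10
      FLOOR(1 + RAND() * 10),
      NOW() - INTERVAL FLOOR(RAND() * 365) DAY,  -- a random moment within the last year
      REPEAT('x', 100)                           -- dummy message body
  FROM messages;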

Suppose these 10 test users have exchanged a lot of messages and often re-read old correspondence: the interface lets them jump to a page with very old messages. Behind that interface is a simple query:

SELECT * FROM messages WHERE user1=1 and user2=2 order by ts limit 20 offset PageNumber*20

A perfectly ordinary query. Let's look at its execution time depending on the offset depth:
offset    execution time (ms)
100       311
1000      907
5000      3372
10000     6176
20000     11901
30000     17057
40000     21997
50000     28268
60000     32805


Surely many expected to see linear growth. But 33 seconds for 60 thousand records is too much! It is quite easy to explain what takes so long; it is enough to mention one peculiarity of the MySQL implementation: for this query MySQL reads offset + limit rows from disk and returns only limit of them. Now the picture is clear: all this time MySQL has been reading 60 thousand rows from disk that we do not need. What can be done in such a situation? Many different solutions suggest themselves; here, by the way, is an interesting article about the options.

We found a very simple solution: the first query selects only the clustered index values, and a second query then fetches the rows by those values. We know that the clustered index value sits at the end of the secondary key, so if we replace * with message_id in the query, we get a query that works off the key alone, and such a query is fast.

It was:
 mysql> explain select * from messages where user1 = 1 and user2 = 2 order by ts limit 20 offset 20000;
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+-------------+
 | id | select_type | table    | type | possible_keys | key   | key_len | ref         | rows   | Extra       |
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+-------------+
 |  1 | SIMPLE      | messages | ref  | user1         | user1 | 8       | const,const | 210122 | Using where |
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+-------------+
 1 row in set (0.00 sec)


It became:
 mysql> explain select message_id from messages where user1 = 1 and user2 = 2 order by ts limit 20 offset 20000;
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+--------------------------+
 | id | select_type | table    | type | possible_keys | key   | key_len | ref         | rows   | Extra                    |
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+--------------------------+
 |  1 | SIMPLE      | messages | ref  | user1         | user1 | 8       | const,const | 210122 | Using where; Using index |
 +----+-------------+----------+------+---------------+-------+---------+-------------+--------+--------------------------+
 1 row in set (0.00 sec)

Using index here means that MySQL can get all the required data from the secondary key alone and will not touch the clustered index at all. You can read more about this here.

Now all that remains is to fetch the row values themselves:
SELECT * FROM messages WHERE message_id IN (....)
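For reference, the same two-step idea can also be expressed as a single statement with a derived table (just a sketch; the measurements below were taken with the two separate queries described above):

  SELECT m.*
  FROM messages m
  JOIN (
      SELECT message_id
      FROM messages
      WHERE user1 = 1 AND user2 = 2
      ORDER BY ts
      LIMIT 20 OFFSET 20000
  ) AS ids USING (message_id)
  ORDER BY m.ts;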

Let's see how productive this solution is:
offset    execution time (ms)
100       243
1000      164
5000      213
10000     337
20000     618
30000     756
40000     971
50000     1225
60000     1477


The result suited everyone, so we decided not to look any further. Besides, it is not clear whether this data can be accessed in a fundamentally faster way without changing the way message history is handled. Note once again that the task was to optimize a specific query, not the data structure itself.

Optimization of large table update procedure


The second need for optimization arose when we had to collect up-to-date data about our users once a day into one large table. By then we had 130 million users. The script that walks over all our databases and gathers the new data finishes in half an hour and selects 30 million changed rows. Its output is tens of thousands of text files with serialized new values on disk, each containing information about hundreds of users.

Next we transfer the information from these text files into the database. We read the files sequentially, group the rows into batches of several thousand and run updates. The import script takes anywhere from 3 to 20 hours to run. Such behavior is, of course, unacceptable, and the process obviously needs to be optimized.
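Schematically, one batch looked something like this (a rough sketch with invented table and column names; the real script is more involved):

  BEGIN;
  UPDATE user_info SET data = '<serialized values>' WHERE user_id = 123456;
  UPDATE user_info SET data = '<serialized values>' WHERE user_id = 789012;
  -- ...a few thousand more rows from the same batch...
  COMMIT;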

The first suspect was "parasitic" load on the database server's disks, but numerous observations found no evidence for that hypothesis. We concluded that the problem was hidden in the depths of the database and we had to think about how to fix it. How does the data lie on disk? What do the OS, MySQL and the hardware have to do to update that data? While we were answering these questions, we noticed that the data is updated in the same order in which it was collected. That means each query updates a completely random place in this large table, which entails time lost on disk head positioning, file system cache misses and database cache misses.

Note that updating each row in MySQL consists of three stages: reading the values, comparing the old and new values, and writing the values. You can even see this from the fact that for every query MySQL reports how many rows matched and how many were actually changed.
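This is easy to observe in the mysql client: if the new value is equal to the old one, the row is matched but not changed (an illustrative example, not taken from our data):

  mysql> UPDATE messages SET body = body WHERE message_id = 42;
  Query OK, 0 rows affected (0.00 sec)
  Rows matched: 1  Changed: 0  Warnings: 0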

Then we looked at how many rows actually change in the table. Of the 30 million rows, only 3 million had changed (which is logical, since the table holds rather limited information about a user). This means that 90% of the time MySQL spends reading rather than updating. The solution suggested itself: we should check how much random access to the clustered index loses to sequential access. The result can then be generalized to updating the table (before a row is updated it is read and compared anyway).

The technique is extremely simple: measure the difference in speed of the query
SELECT * FROM messages where message_id in ($values)
passing an array of 10K elements as the values. To test random access, the element values are made completely random; to test sequential access, we take 10K consecutive values starting from some random offset.

  function getValuesForRandomAccess() {
      // 10K completely random ids from the 100M-row messages table
      $arr = array();
      foreach (range(1, 10000) as $i) {
          $arr[] = rand(1, 100000000);
      }
      return $arr;
  }

  function getValuesForSequentialAccess() {
      // 10K consecutive ids starting at a random offset
      $r = rand(1, 100000000 - 10000);
      return range($r, $r + 10000);
  }

Query execution time with random access and sequential:
N     random (ms)    sequential (ms)
1     38494          171
2     40409          141
3     40868          147
4     37161          138
5     38189          137
6     36930          134
7     37398          176
8     38035          144
9     39722          140
10    40720          146

As you can see, the difference in runtime is about 200x, so it is well worth fighting for. To optimize execution, we need to sort the source data by primary key. Can we quickly sort 30 million values spread over files? We certainly can!

After this optimization, the script's running time dropped to 2.5 hours. Pre-sorting the 30 million rows takes 30 minutes (and most of that time goes to gzip).

Same optimization, but on SSD


After this article was written we got hold of a spare SSD, on which we ran the same tests.

Query with a deep offset:
offset    execution time (ms)
100       117
1000      406
5000      1681
10000     3322
20000     6561
30000     9754
40000     13039
50000     16293
60000     19472

Optimized query with a deep offset:
offset    execution time (ms)
100       101
1000      21
5000      24
10000     32
20000     47
30000     94
40000     84
50000     95
60000     120

Comparison of random and sequential access:
Nrandomsequential
one5321118
25583118
35881117
four6167117
five6349120
66402126
76516125
eight6342124
96092118
ten5986120

These figures show that SSD, of course, has an advantage over a regular drive, but its use does not eliminate the need for optimization.

In conclusion: we managed to make data selection 20 times faster and the actual table update up to 10 times faster (the synthetic test sped up 200 times). Recall that the experiments were run on a system with caching disabled; the gain on the production system turned out to be less impressive (the cache smooths things over quite well).

The conclusion is plain: it is not enough to know the strengths and weaknesses of the software you work with; you have to be able to apply that knowledge in practice. Knowing the internals of MySQL sometimes lets you speed up queries tenfold.

Alexey alexxz Eremihin, Badoo developer

Source: https://habr.com/ru/post/135966/

