📜 ⬆️ ⬇️

Optimization ORDER BY - what many people forget

Much has been written on the topic of optimizing MySQL queries, everyone knows how to optimize SELECT, INSERT, what needs to be jointed by key, etc. etc.

But there is one moment, also repeatedly described in all manuals, but for some reason everyone forgets about it.

Optimization of ORDER BY in queries with joins.


Justification: I used the search, I did not find it!

Most believe that if ORDER BY occurs by index, then there are no problems, but this is not always the case. Recently, I dealt with a single query that wildly braked the base, although it seemed like all the indices were in the right places. ORDER BY was the last place I poked, and the problem was there.
')
A small excerpt from the optimization manuals:

===
How MySQL Optimizes ORDER BY
The following are some cases where MySQL can not use indexes to perform ORDER BY
...
Several tables are linked, and columns are made
sorting ORDER BY, refer not only to the first non-constant
(const) table used to fetch rows (this is the first table
in the EXPLAIN output, which does not use the constant, const, string fetching method).
...
===

For ORDER BY it is important that the table according to which the sorting will be performed comes first. However, by default, in whatever order you would not join the tables, the optimizer built into mysql will rearrange them in the order that it considers necessary. That is, if you put the desired table first in the query, this does not mean that it will actually be the first.

Fortunately, the mysql optimizer can be told to join the tables in the order we specified, for this you need to add the command STRAIGHT_JOIN to the SELECT:

SELECT STRAIGHT_JOIN ... FROM table JOIN ... ... ORDER BY table.row

Checking on the mysql database of the PHPBB3 forum containing about 300,000 posts:

SELECT t.*, p.*, u.username FROM phpbb3_topics as t, phpbb3_posts as p, phpbb3_users as u WHERE t.topic_replies>0 AND p.poster_id=u.user_id AND topic_first_post_id<>p.post_id AND topic_approved=1 AND p.topic_id=t.topic_id AND t.forum_id='16' AND p.post_id<'244103' ORDER by post_id desc LIMIT 40 


Query took 12.2571 sec

in explain we see the terrible: Using where; Using temporary; Using filesort

Changing the order of the tables (the cache of the muscle is reset by the reboot):

 SELECT STRAIGHT_JOIN t.*, p.*, u.username FROM phpbb3_posts as p, phpbb3_topics as t, phpbb3_users as u WHERE t.topic_replies>0 AND p.poster_id=u.user_id AND topic_first_post_id<>p.post_id AND topic_approved=1 AND p.topic_id=t.topic_id AND t.forum_id='13' AND p.post_id<'234103' ORDER by post_id desc LIMIT 40 


Query took 0.0447 sec

in explain: Using where;

With such a forced rearrangement of the tables, we accelerated the execution of the query 300 times !

This does not mean that you should always use STRAIGHT_JOIN and follow the order of the tables yourself. But in some cases it is necessary.

PS This request is used by Yandex for indexing phpbb forums. Before optimization, the Yandex bot put the server php.ru every night for several hours (the server is not very powerful). Yandex’s blog had a discussion on this topic, but it was closed a couple of years ago and the decision was not voiced there.

Source: https://habr.com/ru/post/138163/


All Articles