As an extension of the index in InnoDB tables surprisingly reduces performance

One type of optimization that we often use is an extension of the index. It is suitable if there are queries using other parts of the composite key. And it's usually safe, but there are some cases where performance drops dramatically.

Consider an example.

We have a table

 CREATE TABLE `idxitest` (
 `id` int (10) UNSIGNED NOT NULL AUTO_INCREMENT,
 `a` int (11) NOT NULL,
 `b` int (11) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `a` (` a`)
 ) ENGINE = InnoDB AUTO_INCREMENT = 6029313 DEFAULT CHARSET = latin1

 mysql> SELECT count (*) FROM idxitest WHERE a = 5 AND b = 5;
 + ---------- +
 |  count (*) |
 + ---------- +
 |  60434 |
 + ---------- +
 1 row IN SET (0.69 sec)
 
 mysql> EXPLAIN SELECT count (*) FROM idxitest WHERE a = 5 AND b = 5;
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra |
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 |  1 |  SIMPLE |  idxitest |  ref |  a |  a |  4 |  const |  707820 |  USING WHERE |
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 1 row IN SET (0.00 sec)

')
The most obvious solution is to expand the right key from column “a” to two columns (a, b). It seems that this should not adversely affect other requests.

 mysql> ALTER TABLE idxitest DROP KEY a, ADD KEY (a, b);
 Query OK, 0 rows affected (24.84 sec)
 Records: 0 Duplicates: 0 Warnings: 0

 mysql> SELECT count (*) FROM idxitest WHERE a = 5 AND b = 5;
 + ---------- +
  |  count (*) |
 + ---------- +
  |  60434 |
 + ---------- +
 1 row IN SET (0.02 sec)
       
 mysql> EXPLAIN SELECT count (*) FROM idxitest WHERE a = 5 AND b = 5;
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------------- + -------- + ------ ------- +
  |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra |
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------------- + -------- + ------ ------- +
  |  1 |  SIMPLE |  idxitest |  ref |  a |  a |  8 |  const, const |  120640 |  USING INDEX |
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------------- + -------- + ------ ------- +
 1 row IN SET (0.00 sec)

Cool. The request is 30 times faster. First, it scans fewer rows, and second, it uses a covering index, that is, it extracts all the necessary data directly from the index.

However, early to rejoice. There is another request in the application that was so fast before that it was difficult to notice. Now it has become much slower.

It was

 mysql> SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + --------- + ----- + --- +
 |  id |  a |  b 
 + --------- + ----- + --- +
 |  3000000 |  100 |  7 
 + --------- + ----- + --- +
 1 row IN SET (0.00 sec)
 mysql> EXPLAIN SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra	   
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 |  1 |  SIMPLE |  idxitest |  ref |  a |  a |  4 |  const |  126074 |  USING WHERE 
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + -------- + ------------ - +
 1 row IN SET (0.00 sec)

It became

 mysql> SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + --------- + ----- + --- +
 |  id |  a |  b 
 + --------- + ----- + --- +
 |  3000000 |  100 |  7 
 + --------- + ----- + --- +
 1 row IN SET (1.01 sec)
 mysql> EXPLAIN SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra	   
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 |  1 |  SIMPLE |  idxitest |  INDEX |  a |  PRIMARY |  4 |  NULL |  36 |  USING WHERE 
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 1 row IN SET (0.00 sec)
 # The plan also can look something like this:
 mysql> EXPLAIN SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + ------ + -------------- ---------------------------- +
 |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + ------ + -------------- ---------------------------- +
 |  1 |  SIMPLE |  idxitest |  ref |  a |  a |  4 |  const |  1 |  USING WHERE;  USING INDEX;  USING filesort 
 + ---- + ------------- + ---------- + ------ + ------------ --- + ------ + --------- + ------- + ------ + -------------- ---------------------------- +
 1 row IN SET (0.01 sec)

Why is this query slower? The reason is that his execution plan benefited from one InnoDB feature — all index entries are always sorted by primary key. That is, when you have an index (a) and primary key id, then you have the real key (a, id). When we expand the key to (a, b), we actually get (a, b, id). Our query used both parts of the source key “a” and “id” (for sorting). Now he can not fully use the new index.

What is the solution to the problem? We'll have to create a redundant index and keep (a) and (a, b) at the same time. Usually this should work, although not always.

 CREATE TABLE `idxitest` (
   `id` int (10) UNSIGNED NOT NULL AUTO_INCREMENT,
   `a` int (11) NOT NULL,
   `b` int (11) NOT NULL,
   PRIMARY KEY (`id`),
   KEY `a` (` a`),
   KEY `a_2` (` a`, `b`)
 ) ENGINE = InnoDB AUTO_INCREMENT = 6029313 DEFAULT CHARSET = latin1
 mysql> SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + --------- + ----- + --- +
 |  id |  a |  b 
 + --------- + ----- + --- +
 |  3000000 |  100 |  7 
 + --------- + ----- + --- +
 1 row IN SET (1.03 sec)
 mysql> EXPLAIN SELECT * FROM idxitest WHERE a = 100 ORDER BY id DESC LIMIT 1;
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 |  id |  select_type |  TABLE |  type |  possible_keys |  KEY |  key_len |  ref |  rows |  Extra	   
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 |  1 |  SIMPLE |  idxitest |  INDEX |  a, a_2 |  PRIMARY |  4 |  NULL |  2247 |  USING WHERE 
 + ---- + ------------- + ---------- + ------- + ----------- ---- + --------- + --------- + ------ + ------ + ----------- - +
 1 row IN SET (0.00 sec)

The query optimizer considers the possibility of using both indices and eventually rejects both, doing a full index scan before finding a = 100. It looks like the optimizer is buggy. It is going to scan 2247 lines, whereas one line is enough using the index (a).

Therefore, you will have to force the optimizer to choose the correct query execution plan by adding FORCE INDEX (a).

As this example showed, you have to be very careful when playing with indices. If you have queries whose execution plan benefits from sorting InnoDB by the primary key, then they may be significantly affected.

The optimizer may behave differently in different versions of MySQL. We checked in version 5.1.45, although the same behavior was found in 5.0.

Source: https://habr.com/ru/post/99310/

All Articles

As an extension of the index in InnoDB tables surprisingly reduces performance

More articles: