📜 ⬆️ ⬇️

Restoring damaged Innodb tables

Suppose you are working with MySQL Innodb tables, and one fine not the best moment is failing hardware, a driver, a kernel crash, electricity is turned off, or one of the rare errors in MySQL happens. At the exit we get damage to some pages in the Innodb tablespace.

In one of the previous articles, in the comments, we were asked what could be done in such a situation. We will try to answer the most concise and to the point.

So, now we are talking about a situation like this:
InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 7. InnoDB: You may have to recover from a backup. 080703 23:46:16 InnoDB: Page dump in ascii and hex (16384 bytes): … A LOT OF HEX AND BINARY DATA… 080703 23:46:16 InnoDB: Page checksum 587461377, prior-to-4.0.14-form checksum 772331632 InnoDB: stored checksum 2287785129, prior-to-4.0.14-form stored checksum 772331632 InnoDB: Page lsn 24 1487506025, low 4 bytes of lsn at page end 1487506025 InnoDB: Page number (if stored to page already) 7, InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 6353 InnoDB: Page may be an index page where index id is 0 25556 InnoDB: (index “PRIMARY” of table “test”.”test”) InnoDB: Database page corruption on disk or a failed 


What to do to restore the table? In principle, there may be several types of damage, below we will consider one of the most common points. Namely - when the page is damaged in a clustered primary key.
')
In the example, the test.idb file is considered, where several bytes are replaced, so the damage is fairly moderate.

However, the CHECK TABLE operation in INNODB is practically useless. For the current damaged file, we get:

 mysql> check table test; ERROR 2013 (HY000): Lost connection to MySQL server during query mysql> check table test; +-----------+-------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-----------+-------+----------+----------+ | test.test | check | status | OK | +-----------+-------+----------+----------+ 1 row in set (0.69 sec) 


The first run is to check the table in the normal mode, in this case innodb just crashes if there is an error in the checksum (even if we perform CHECK). In the second case, run innodb_force_recovery = 1. And even here we get a record in the logs about the checksum mismatch, while CHECK TABLE tells us that everything is ok with the table. As you can see, CHECK TABLE can not always be trusted.

In the example, “damage” is quite small, so if we run innodb_force_recovery = 1, we get the following:

 mysql> CREATE TABLE `test2` ( -> `c` char(255) DEFAULT NULL, -> `id` int(10) unsigned NOT NULL AUTO_INCREMENT, -> PRIMARY KEY (`id`) -> ) ENGINE=MYISAM; Query OK, 0 rows affected (0.03 sec) mysql> insert into test2 select * from test; Query OK, 229376 rows affected (0.91 sec) Records: 229376 Duplicates: 0 Warnings: 0 


Now we have all the data in the MyISAM table, so all that remains is to drop the old table and convert the new one to innodb after a restart without the option innodb_force_recovery. If the old table is needed later, you can simply rename it. The second alternative is to dump MySQLDump and load the table back. In principle, it is almost the same. MyISAM is used for the reason described below.

Why not just take advantage of OPTIMIZE TABLE? This is because the work in the innodb_force_recovery mode is performed in read mode for operations with data, so you cannot insert or erase data (you can create or delete Innodb tables):

 mysql> optimize table test; +-----------+----------+----------+----------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------+----------+----------+----------------------------------+ | test.test | optimize | error | Got error -1 from storage engine | | test.test | optimize | status | Operation failed | +-----------+----------+----------+----------------------------------+ 2 rows in set, 2 warnings (0.09 sec) 


It was easy, right?

After that, you can go even further and edit our test.ibd file, completely removing one of the page titles. Now CHECK TABLE will fall even when using innodb_force_recovery = 1

 080704 0:22:53 InnoDB: Assertion failure in thread 1158060352 in file btr/btr0btr.c line 3235 InnoDB: Failing assertion: page_get_n_recs(page) > 0 || (level == 0 && page_get_page_no(page) == dict_index_get_page(index)) InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even 


If we see something similar, then innodb_force_recovery will not help us, because you can only work with this in cases of data corruption in various system locations. But in our case it does not help.

We get this error:

 mysql> insert into test2 select * from test; ERROR 2013 (HY000): Lost connection to MySQL server during query 


Attempts to use automatic data recovery processes do not lead to a positive result. Therefore it is worth using a series of commands with LIMIT for manual recovery mode:

 mysql> insert ignore into test2 select * from test limit 10; Query OK, 10 rows affected (0.00 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> insert ignore into test2 select * from test limit 20; Query OK, 10 rows affected (0.00 sec) Records: 20 Duplicates: 10 Warnings: 0 mysql> insert ignore into test2 select * from test limit 100; Query OK, 80 rows affected (0.00 sec) Records: 100 Duplicates: 20 Warnings: 0 mysql> insert ignore into test2 select * from test limit 200; Query OK, 100 rows affected (1.47 sec) Records: 200 Duplicates: 100 Warnings: 0 mysql> insert ignore into test2 select * from test limit 300; ERROR 2013 (HY000): Lost connection to MySQL server during query 


Here, the rows from the table are translated into a new table, until we get to the row, which causes MySQL to drop. We can expect this on a line between 200 and 300, so it’s worth using a series of similar actions to solve a problem.

Now we have found the corrupted data in the table, while it is worth using max PK, and check other values:
 mysql> select max(id) from test2; +---------+ | max(id) | +---------+ | 220 | +---------+ 1 row in set (0.00 sec) mysql> insert ignore into test2 select * from test where id>250; ERROR 2013 (HY000): Lost connection to MySQL server during query mysql> insert ignore into test2 select * from test where id>300; Query OK, 573140 rows affected (7.79 sec) Records: 573140 Duplicates: 0 Warnings: 0 


So, we try to skip 30 lines, but this is not enough. We skip 80 lines, and now everything is fine. Using “binary search” we can figure out how many lines to skip, in order to recover the maximum amount of corrupted data. The size of the string can help. So, we have 280 bytes per line, so we get about 50 lines per page. And here 30 lines are not enough - if the page table is damaged, you need to skip at least the entire page. If the page is damaged at a higher BTREE level, you need to skip more pages to use this recovery method.

In some cases, for example, when the root page for a clustered index is damaged, this method will not work normally. In this case, it is worth using Innodb Recovery Toolkit .

PS We accept applications for articles :)

Source: https://habr.com/ru/post/257627/


All Articles