A few days ago, one of the hard drives in a budget array of 16x1TB drives failed. Array level: RAID 6. The situation was complicated by the fact that (as it turned out) the video card in the same server had a cooler problem that had gone unnoticed before, and after the HDD was replaced, the change in the case airflow made it show itself during the resynchronization, which in itself is very unpleasant. As a result, the array stopped auto-assembling, several more disks got marked as failed, and I had to dig into it seriously, poring over wikis, manuals and forums (forums are the most useful, because they describe the experience of specific people in specific situations).
The structure of my array:
16x1TB HDD
partitions:
md0 - /root: 8x1GB, RAID 6
md1 - /data: 16x999GB, RAID 6
At first, all the assembly experiments were carried out on md0, i.e. on the root partition, which by itself is not worth much, apart from being an already configured system.
So, I booted from Debian Disc 1 in Rescue mode.
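Before touching mdadm, it is worth checking that all 16 drives, and both partitions on each of them, are actually visible from the rescue environment. Just a small preliminary sketch with standard tools, not part of the recovery recipe itself:
cat /proc/partitions         # all 16 drives and their sdX1/sdX2 partitions should be listed here
fdisk -l /dev/sda            # example for one drive: prints its partition table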
An attempt to auto-assemble the array with
mdadm --assemble --scan
only printed the error "not enough disks to build the array".
Proceeding by the book:
1. First, save the metadata of the arrays, which records which disk has which number in each array, in case you later have to resort to "dangerous methods" of assembly:
mdadm --examine /dev/sd[abcdefghijklmnop]1 > /mnt/raid_layout1
mdadm --examine /dev/sd[abcdefghijklmnop]2 > /mnt/raid_layout2
These files contain something like the listing below for every HDD that has a superblock on its sdX1 partition (in my case, only 8 of the 16 drives have a superblock on sdX1, the ones belonging to md0).
Below is an example of the output for one of the partitions:
/dev/sdp1:
          Version : 0.90.00
       Raid Level : raid6
    Used Dev Size : 975360 (952.66 MiB 998.77 MB)
     Raid Devices : 8
    Total Devices : 8

      Number   Major   Minor   RaidDevice State
this     4       8      177        4      active sync   /dev/sdl1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8      113        1      active sync   /dev/sdh1
   2     2       8      129        2      active sync   /dev/sdi1
   3     3       8      145        3      active sync   /dev/sdj1
   4     4       8      177        4      active sync   /dev/sdl1
   5     5       8      193        5      active sync   /dev/sdm1
   6     6       8      209        6      active sync   /dev/sdn1
   7     7       8      225        7      active sync   /dev/sdo1
Briefly about what this means:
/dev/sdp1 - the partition whose superblock is being examined
Version 0.90.00 - the superblock version
You will also see plenty of other useful information: the array UUID, RAID level, array size, number of devices, and so on.
But the most important thing for us right now is the table at the bottom of the listing; its first line shows which slot this HDD occupies in the array:
this     4       8      177        4      active sync   /dev/sdl1
Also pay close attention to the superblock version! In my case, it is 0.90.00.
Here we see its number in the array, i.e. 4; you will find the same numbering in the output for all the other devices in the list. Note that the disk letter in this line is different (sdl1): it means the disk was initialized while sitting on another SATA port and was later moved. The device name itself is the non-critical piece of information here (it changes whenever a drive is moved from port to port); what is critical is the device's number in the array.
We save the resulting raid_layout files (for example, to a flash drive) so they do not get lost, and proceed to the next step:
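Roughly like this (a rough sketch; /dev/sdq1 is just a placeholder for whatever name the flash drive gets, check dmesg after plugging it in):
mkdir -p /mnt/usb
mount /dev/sdq1 /mnt/usb                      # placeholder device name for the flash drive
cp /mnt/raid_layout1 /mnt/raid_layout2 /mnt/usb/
umount /mnt/usb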
2. Trying to assemble the array
An array can be assembled in two ways: automatically or manually.
Automatically:
mdadm --assemble --scan -v
If it assembles automatically, consider yourself lucky: you only need to check whether all the HDDs are in the array and, if not, add the missing ones, and you can stop reading here. But in my case, the automatic assembly failed with an error about not enough working devices:
mdadm: /dev/md2 assembled from 4 drives - not enough to start the array.
and the array was assembled from only 4 of the 8 disks. Of course it will not start, since RAID 6 can survive losing at most 2 disks.
Checking the array status:
cat /proc/mdstat
md2 : inactive sdn1[3](S) sdk1[7](S) sdj1[6](S) sdp1[4](S)
      3907264 blocks
There is a subtlety here: if the list of HDDs contains a drive that is uninitialized or flagged as failed, the assembly stops immediately, so the "-v" flag is useful for seeing on which HDD the assembly got stuck.
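One more practical point worth knowing: if a previous attempt has already left the array half-assembled in the inactive state, as in the mdstat output above, its member devices remain busy, and the array may need to be stopped before assembly can be retried:
mdadm --stop /dev/md2        # release the members of the inactive array before reassembling
cat /proc/mdstat             # md2 should no longer be listed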
Manual assembly:
mdadm --assemble /dev/md0 /dev/sd[abcdefgh]1 -v
The same thing, but here we explicitly tell mdadm which array to assemble and from which HDDs.
Most likely the array will not assemble this way either, just as with the automatic assembly. But by assembling it by hand, you begin to understand much better what is actually going on.
The array will also refuse to assemble if a disk is marked as "faulty" in the partition metadata.
Here I am jumping ahead to how I started the array with the data, because the /root array I lost; why and how is described below. To assemble an array while ignoring the "faulty" status, you can add the "-f" (force) flag. In my case this is what solved the problem of assembling the main data partition, i.e. the partition was successfully assembled with the following command:
mdadm --assemble /dev/md3 /dev/sd[abcdefghijklmnop]2 -f
Quite possibly, a simpler way to assemble it would have been:
mdadm --assemble --scan -f -v
But since I only arrived at the "-f" flag the hard way, that is obvious to me only in hindsight.
That is, partitions marked as failed or stale were added to the array rather than ignored. A partition can easily end up marked as failed or stale because of a bad or loosely seated SATA cable, which is not uncommon.
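A quick way to see which partitions mdadm considers stale is to compare the event counters in the superblocks: the dropped partitions lag behind the rest. This check is not part of the recipe above, just a small sketch using the same --examine output as earlier:
mdadm --examine /dev/sd[abcdefghijklmnop]2 | grep -E '/dev/sd|Events'
# partitions whose Events value is lower than the others are the ones that were kicked out of the array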
Nevertheless, I got an array in degraded mode, on 14 disks out of 16.
Now, to bring the array back to full health and stop worrying about it, you need to add the 2 missing disks to it:
mdadm --add /dev/md3 /dev/sdX2
where X is the letter of the new HDD's partition.
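After that, the rebuild onto the added disks starts on its own; its progress can be watched with the usual status commands:
cat /proc/mdstat             # shows a "recovery = ..." progress line while the array rebuilds
mdadm --detail /dev/md3      # shows the array state and the status of each member device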
Below are the difficulties I ran into, to save others from stepping on the same rake:
I used the recommendations of the Linux RAID wiki - RAID Recovery (
raid.wiki.kernel.org/index.php/RAID_Recovery ). I advise you to be careful with them, because the page describes the process very briefly, and thanks to these recommendations I destroyed the /root (md0) part of my array.
Everything there is very useful, right up until this line at the very bottom of the wiki article:
mdadm --create --assume-clean --level=6 --raid-devices=10 /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 missing /dev/sdl1 /dev/sdk1 /dev/sdj1
This line shows how to re-create an array when you know which devices it consists of and in what order. It is very important to take the version of your superblock into account: recent versions of mdadm create a version 1.2 superblock by default, which is located at the beginning of the partition, while 0.90 is located at the end. Therefore, you need to add the flag "--metadata=0.90".
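So, before re-creating anything, check which superblock version the original array used (it is right there in the raid_layout files saved in step 1), and only then write out the --create line. A rough sketch for my md0, using the device order recorded in the --examine table shown earlier; keep in mind that this is exactly the step that destroyed my /root, so treat --create as a last resort:
grep -i version /mnt/raid_layout1                    # confirm the superblock version, e.g. 0.90.00
mdadm --create --assume-clean --metadata=0.90 --level=6 --raid-devices=8 /dev/md0 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1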
After I re-created the array with "--create", the file system on it turned out to be destroyed: neither the main ext4 superblock nor any of the backup superblocks could be found. At first I noticed that the new RAID superblock was version 1.2 rather than 0.90, which might have been the cause of the destroyed partition, but that turned out not to be it: re-creating the array with superblock version 0.90 and searching for a backup ext4 superblock were both unsuccessful.
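For reference, such a search can be done roughly like this (standard ext4 tooling; these are not necessarily the exact commands I used, and /dev/md2 here simply stands for the re-created root array):
mke2fs -n /dev/md2           # -n is a dry run: it prints where the backup superblocks would be for this geometry and writes nothing
e2fsck -b 32768 /dev/md2     # try fsck against one of the backup superblock locations printed above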
Since the /root partition is not the most important part, I decided to experiment: the array was reformatted and then stopped:
mdadm --stop /dev/md2
and immediately created again with "--create". The result: the file system was destroyed once again, although this time it should not have happened; I am sure I did not mix up the order of the devices the first time, and certainly not the second.
Perhaps someone has successfully restored partitions with "--create"; I will gladly add to this article an explanation of what exactly I did wrong and why the FS was destroyed. It is also possible that the array had originally been created with different parameters, such as a different block (chunk) size.
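One thing worth checking in that direction: the chunk size of the original array is recorded in the superblock dumps saved in step 1, and if it differs from mdadm's default, it has to be passed to "--create" explicitly, otherwise the data layout will not match. A quick way to look it up (again, just a sketch):
grep -i chunk /mnt/raid_layout1      # shows the Chunk Size the array was originally created with
# if it differs from the default, add e.g. --chunk=64 (size in KB) to the mdadm --create command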
Obviously, any recommendation from this article should be used at your own risk; no one guarantees that in your case everything will work exactly as it did in mine.