
shared hotspare for mdadm

(I wasn't sure whether to post this here or under System Administration)

I found that there is very little information online (and little of it clear) explaining how mdadm works with shared (global) hot-spare drives. In this note I will describe what a shared hot-spare is and explain why shared hot-spares are not marked as shared in /proc/mdstat, but instead look like perfectly ordinary local ones.

What is a hot-spare?


(I'm not writing for newbies, so this will be a whirlwind tour.)
If an array is redundant and one of its disks fails, the redundant information can be rebuilt onto a spare disk. If that disk is added to the array by hand (the administrator gets an e-mail about the failure, reads it, wakes up, gets dressed, comes to work, pulls the failed disk, inserts the spare, adds it to the array, and issues the command to restore redundancy), then it is called a cold-spare. Simply a "spare disk".
If the server has an idle disk onto which redundancy is rebuilt immediately after any disk in the array fails, that disk is called a hot-spare. The main advantage is that the rebuild (restoration of redundancy) happens even if the notification e-mail to the admin gets lost or does not arrive in time.
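From the command line the two approaches look roughly like this (a minimal sketch; /dev/md0 and the /dev/sdX names are placeholders, not taken from the article):

 # cold-spare: the administrator reacts to the failure by hand
 mdadm /dev/md0 --fail /dev/sdb     # if the kernel has not already marked it failed
 mdadm /dev/md0 --remove /dev/sdb
 mdadm /dev/md0 --add /dev/sdc      # the rebuild onto the replacement starts now

 # hot-spare: the spare is added in advance, the rebuild starts automatically on failure
 mdadm /dev/md0 --add /dev/sdd      # on a healthy array this disk becomes a spare, shown as (S) in /proc/mdstat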

Local hot-spare

Usually a spare disk is added to a specific array: if a disk in that array fails, that array's spare is used. If the failure happens in a neighbouring array, the hot-spare of the "foreign" array is not touched.

This is actually logical: if we get to choose whether a hot-spare should restore the redundancy of the system partition or of the data partition, we want it to go to the data partition. If the system partition has already "grabbed" the hot-spare, that is unfortunate. Moreover, some vendors offer 1EE hot-spare, where the spare disk is also used for data (the free space is "smeared" across the array's disks, which speeds up rebuilds and improves performance in normal operation).

Global (shared) hot-spare

However, it often happens that there are many data arrays, and they all need hot-spare disks, but disks are too precious to waste. Hence the desire to have one "shared" disk that can serve any of the arrays (or better yet, 2-3 such disks).

That was the introduction. Now to the heart of the matter.

linux md

The md kernel module does not support shared hot-spares. A disk can be added as a hot-spare only to one specific array.

But mdadm does support it!

Exactly: mdadm supports it, the kernel module does not. mdadm implements shared hot-spares by the method of "move a hot-spare from one array to another, degraded one".

In other words, for this to work mdadm must be running in -F (follow/monitor) mode. This is the same mode in which it normally sends e-mail about RAID problems. Most modern distributions start it automatically (if there are arrays), but it is important to understand that it serves only those arrays that were assembled from mdadm.conf, not those assembled by hand. (Yes, some configuration awaits us here.)
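If your distribution does not start the monitor for you, it can be launched by hand; a minimal sketch (the e-mail address is just an example):

 # arrays must be listed in /etc/mdadm/mdadm.conf, otherwise the monitor will not watch them
 mdadm --monitor --scan --daemonise --syslog --mail=root@example.com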

spare-group

To allow disks to be shifted between different arrays there is the notion of a spare-group: a group of arrays within which disks may be moved. There can be several such groups, and hot-spares are moved only between arrays that belong to the same group.

As is easy to deduce from the above about mdadm / linux md, there is not and cannot be anything about spare-groups in /proc/mdstat. These are mdadm's own private notions, and the kernel knows nothing about them (files in /proc are created by kernel modules...).
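For illustration, a shared spare in /proc/mdstat looks exactly like a local one, roughly like this (device names match the config example further below; sizes are made up):

 md1 : active raid1 sdc[2](S) sdb[1] sda[0]
       10485696 blocks super 1.2 [2/2] [UU]

 md2 : active raid1 sde[1] sdd[0]
       10485696 blocks super 1.2 [2/2] [UU]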

Thus shared hot-spares can only be provided by mdadm itself. There are two options: if the spare-group is specified for an array that is assembled at boot (/etc/mdadm/mdadm.conf), then the hot-spare can be declared right there, like this:

ARRAY /dev/md1 level=raid1 num-devices=2 metadata=1.2 spares=1 spare-group=myhostparegroupname name=server:1 UUID=18219495:03fda335:3f1ad1ee:a5f5cd44
   devices=/dev/sda,/dev/sdb,/dev/sdc
ARRAY /dev/md2 level=raid1 num-devices=2 metadata=1.2 spare-group=myhostparegroupname name=server:2 UUID=18219495:03fda335:3f1ad1ee:a5f5cd45
   devices=/dev/sdd,/dev/sde

(To preempt the question of where to get all these clever words: mdadm --detail --scan --verbose)

Compared with the output of mdadm, the only thing added here is spare-group. Note that the second array has NO hot-spare of its own; but because the group is specified, a disk from another array with the same group will be used if a failure occurs. In our case that is the spare from /dev/md1.
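A quick way to check that the migration actually works (assuming the device names from the config above) is to simulate a failure in md2 and watch the spare from md1 move over; the monitor polls every 60 seconds by default, so it may not be instantaneous:

 mdadm /dev/md2 --fail /dev/sde    # simulate a disk failure in the array that has no spare of its own
 cat /proc/mdstat                  # shortly afterwards the spare from md1 should show up in md2, rebuilding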

Of course, all of this happens only if mdadm is running in -F mode. On Debian it looks like this in the ps output:
 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog
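On Debian this daemon is normally controlled via /etc/default/mdadm; if the process is missing, check that the file contains something along these lines (a sketch, not the full file):

 # /etc/default/mdadm
 START_DAEMON=true
 DAEMON_OPTIONS="--syslog"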


There can be several such groups on one system.

By the way, there is a gotcha here: the output of mdadm --detail contains no mention of spare-group, so you will have to add it to the config yourself.
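In practice that means dumping the config and then appending spare-group= to the relevant ARRAY lines by hand, for example:

 mdadm --detail --scan --verbose >> /etc/mdadm/mdadm.conf
 # then edit /etc/mdadm/mdadm.conf and append "spare-group=myhostparegroupname"
 # to every ARRAY line that should share its spares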

Local && global hotspare


And here, alas, things are bleak. As far as I know, mdadm does not support having both local hot-spares (belonging to only one array) and shared ones at the same time. If two arrays share one spare-group, then all hot-spares of one array may be used for the benefit of the other.

The scenario is not as rare as it may seem. Here is a simple topology:

SYS_ARRAY
DATA_ARRAY
2 hot-spare

It would be logical to make one of the hot-spares belong only to DATA_ARRAY, and make the second one shared, so that it can serve both as a spare for SYS_ARRAY and as a "second-level spare" for DATA_ARRAY.

Alas, alas, alas, this cannot be done (if someone proves me wrong in the comments, I will be very glad).

Source: https://habr.com/ru/post/122379/

