
Adding a disk to MDADM RAID 5/6 on the fly

Today I want to share a short guide on adding a disk to an existing RAID 5/6 array without rebuilding it from scratch (backing up 4-10 TB of data is often simply unrealistic) and without taking the server down for a day or so. The guide covers Linux software RAID managed with mdadm; with hardware controllers or Windows it is a separate conversation, and often a very short one (it cannot be done at all).

So let's get started (the system is Ubuntu 10.04, but this should work almost everywhere).

First of all, connect the disk and run it under load for a day or two (or at least for one complete overwrite), writing and reading files. This exposes problems with the disk, controller, or cable right away. If you skip this step, adding a faulty disk to the array can get quite entertaining (this actually happened to me: a SATA6 disk turned out to be poorly compatible with a SATA1 controller; thanks to mdadm's safeguards, data loss was avoided).
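
One pass of such a write/read check could look roughly like this. It is only a sketch under assumptions that are not in the original text: the function name burn_in_pass and the file name burnin.dat are made up, the new disk is assumed to carry a temporary file system mounted at the directory you pass in, and GNU dd and sha256sum are assumed to be available.

```shell
# burn_in_pass DIR [SIZE_MB]
# Hypothetical helper: writes SIZE_MB of random data to DIR/burnin.dat,
# reads it back and compares SHA-256 checksums. Prints OK or MISMATCH.
burn_in_pass() {
    dir=$1
    size_mb=${2:-1024}
    file="$dir/burnin.dat"

    # Checksum the stream while it is being written to the disk under test.
    written=$(dd if=/dev/urandom bs=1M count="$size_mb" 2>/dev/null \
        | tee "$file" | sha256sum | cut -d' ' -f1)
    sync

    # Read the file back. Note: with a small SIZE_MB this may be served from
    # the page cache; use a size well above the amount of RAM for a real test.
    readback=$(sha256sum < "$file" | cut -d' ' -f1)
    rm -f "$file"

    if [ "$written" = "$readback" ]; then
        echo OK
    else
        echo MISMATCH
        return 1
    fi
}
```

For example, burn_in_pass /mnt/newdisk 100000 would write and verify roughly 100 GB (/mnt/newdisk is a placeholder mount point). Repeating the call in a loop gives the sustained load described above.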

0. Check the mdadm version; ideally you should have the latest stable release. (In particular, while writing this guide I found a bug when resuming after a failure: stripe_cache_size was not being set. It should already be fixed.)
1. Disable the write-intent bitmap if you have one (it is used to speed up resynchronization of the array after a failure):

mdadm --grow --bitmap=none /dev/md0

2. Add the disks to the array as hot spares. At this stage nothing is written to them yet; the disks will be used in case of a failure or when the array is later resized.

mdadm /dev/md0 -a /dev/sda1
mdadm /dev/md0 -a /dev/sdb1

Run cat /proc/mdstat and check that the disks were added as hot spares, marked (S):

Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdb1[7](S) sda1[6](S) sdd1[2] sdc1[4] sdh1[0] sdg1[5] sdf1[1]
2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [5/5] [UUUUU]
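
If you prefer not to eyeball the output, the spares can be picked out with a small filter. This is a minimal sketch; the function name list_spares is made up for illustration, and it assumes the mdstat layout shown above, where spare members carry an (S) suffix.

```shell
# list_spares: read /proc/mdstat-style text on stdin and print every array
# followed by its member devices that are flagged (S), i.e. hot spares.
list_spares() {
    awk '
        /^md[^ ]* :/ {
            spares = ""
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(S\)$/) {
                    dev = $i
                    # Strip the "[slot](S)" suffix, leaving the device name.
                    sub(/\[[0-9]+\]\(S\)$/, "", dev)
                    spares = spares " " dev
                }
            }
            print $1 ":" spares
        }
    '
}

# Usage on a live system: list_spares < /proc/mdstat
```

For the output shown above this prints "md0: sdb1 sda1"; an array without spares is printed with an empty list.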


3. Test the UPS by pulling its plug out of the wall outlet (pulling the outlet out of the wall as well is optional). Batteries tend to sulfate over time without the UPS noticing, and can then suddenly die within a couple of seconds.

A UPS is not strictly required for a safe resize, but it gives peace of mind. The RAID driver in the kernel continuously records the resize progress, so after a failure at any point it can pick up where it left off.

4. The most important command:

mdadm --grow /dev/md0 --raid-devices=8 --backup-file=/var/backup

--backup-file is required to keep a backup of the array data in case of a failure during the very first stage of the resize. Needless to say, the file must not be placed on the array itself. The device count given in the command is the total number of active disks the array should have after the resize. In the same way, a RAID-5 can be converted to RAID-6 by specifying --level=6 (recall that RAID-6 survives the death of any 2 drives, which matters a lot, since rebuilding a large RAID-5 takes up to 10-20 hours and anything can happen in that time...).

The resize runs, on average, at the speed of the slowest disk. That is, if the slowest disk manages 60 MB/s, mdadm needs one pass over the array at that speed. For terabyte drives this is about 5 hours if the processor keeps up, and longer if it does not. If the speed is too low, raise the upper limit (the value is in KiB/s):

echo "200000" > /sys/block/md0/md/sync_speed_max
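
The "about 5 hours" figure is just disk size divided by throughput. A small helper makes it easy to redo the arithmetic for other disks; the function name estimate_hours is made up for this sketch, and it assumes decimal units (1 GB = 1000 MB) and a constant speed, which a real resize will not hold exactly.

```shell
# estimate_hours SIZE_GB SPEED_MB_S
# Hours needed for one pass over a disk of SIZE_GB gigabytes at a constant
# SPEED_MB_S megabytes per second (decimal units, rough estimate only).
estimate_hours() {
    LC_ALL=C awk -v gb="$1" -v mbs="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / mbs / 3600 }'
}

estimate_hours 1000 60   # prints 4.6
```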

The data remains available while the array is being expanded, and you can watch the progress in /proc/mdstat.
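
To pull just the progress figures out of that file, something like the following can be used. It is a sketch: the function name reshape_progress is made up, and it assumes the usual status line of the form "reshape = N% (...) finish=Mmin speed=...", which may differ between kernel versions.

```shell
# reshape_progress: read /proc/mdstat-style text on stdin and print the
# completed percentage and the estimated minutes left for a running reshape.
# Prints nothing when no reshape line is present.
reshape_progress() {
    sed -n 's/.*reshape *= *\([0-9.]*\)%.*finish=\([0-9.]*\)min.*/\1% done, \2 min left/p'
}

# Usage on a live system: reshape_progress < /proc/mdstat
```

Wrapping the call in watch, or in a loop with sleep, gives a simple progress monitor.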

5. After the resize has finished, expand the file system. First check it (e2fsck expects the file system to be unmounted):

e2fsck -f /dev/md0

Then the actual extension:

resize2fs /dev/md0

6. Add the write-intent bitmap back:

mdadm -G /dev/md0 --force -b /var/md0_intent --bitmap-chunk=65536

7. Regenerate the mdadm.conf config. Run

mdadm --detail --scan --verbose


Then paste the output into /etc/mdadm/mdadm.conf, replacing the old ARRAY lines.
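
The copy-and-paste can also be scripted. The helper below is a sketch, not part of mdadm: the name replace_array_lines is made up, and it assumes that array definitions in the config start with ARRAY at the beginning of a line, optionally followed by indented continuation lines.

```shell
# replace_array_lines CONF
# Hypothetical helper: removes existing ARRAY entries (and their indented
# continuation lines) from CONF, then appends the new entries read from
# stdin. The previous version of the file is kept as CONF.bak.
replace_array_lines() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    {
        awk '
            /^ARRAY/             { skip = 1; next }   # old array entry
            skip && /^[ \t]/     { next }             # its continuation lines
                                 { skip = 0; print }  # everything else stays
        ' "$conf.bak"
        cat   # new ARRAY lines from stdin
    } > "$conf"
}

# Intended use (as root):
#   mdadm --detail --scan --verbose | replace_array_lines /etc/mdadm/mdadm.conf
```

Review the resulting file before rebooting; the .bak copy makes it easy to roll back.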

8. Regenerate the initramfs so that the correct mdadm config is used after a reboot:

update-initramfs -k all -u


Done.


Now you have a bigger RAID, and you did not have to pay a lot of money for it. Do not forget about monitoring: the more disks, the higher the risk that one of them overheats.

P.S. You can do other fun things with mdadm. For example, if your array is built from 1 TB disks and you happen to find two 500 GB ones, you can combine them into a RAID-0 and add that to the main array. And if you set aside 100 MB on each of those two disks as a separate partition, you can merge the partitions into a RAID-1 and mount /boot there; then the whole system can be moved onto RAID and booted without any aids (flash drives, old hard drives and so on).

P.P.S. Looking at the number of disks in my home file server (on the right), I am starting to think that I must have had too small a hard drive as a child... Then again, it could be worse.

Comments and questions are welcome.

Source: https://habr.com/ru/post/117377/

