Good day.
There is a server running CentOS 6 with a high iowait percentage. The task is to migrate the system from md RAID1 to md RAID10 with no downtime, or at most 10 minutes of it, preferably at night. In my case it really was possible to fit into 10 minutes or even less, because the server has hot-swap bays; however, the engineers of the data center where the server is rented had initially connected the disks to the second and third bays rather than the first and second. Because of this we had to power off the server, move the old disks and install the two new ones.
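For context, the iowait that prompted all this can be checked with iostat from the sysstat package; the figures below are a hypothetical reading for illustration, not taken from the server described here:
# iostat -x 5 3
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.20    0.00    1.10   41.50    0.00   54.20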
Before starting, I will mention one detail without which it is impossible to migrate by following this article: the entire system, except /boot, must be on LVM. The /boot mount point lives on the md0 array, because grub 0.97 does not know how to boot from LVM.
So, the server is up and it has four disks. The /dev/md0 array is /boot, the /dev/md1 array is an LVM physical volume and the system lives on it.
# ls -l /dev/vd*
brw-rw----. 1 root disk 252,  0 Mar 23 22:34 /dev/vda
brw-rw----. 1 root disk 252,  1 Mar 23 22:34 /dev/vda1
brw-rw----. 1 root disk 252,  2 Mar 23 22:34 /dev/vda2
brw-rw----. 1 root disk 252, 16 Mar 23 22:34 /dev/vdb
brw-rw----. 1 root disk 252, 17 Mar 23 22:34 /dev/vdb1
brw-rw----. 1 root disk 252, 18 Mar 23 22:34 /dev/vdb2
brw-rw----. 1 root disk 252, 32 Mar 23 22:34 /dev/vdc
brw-rw----. 1 root disk 252, 48 Mar 23 22:34 /dev/vdd
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 vda1[0] vdb1[1]
      204788 blocks super 1.0 [2/2] [UU]
md1 : active raid1 vdb2[1] vda2[0]
      5937144 blocks super 1.1 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk
# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/vg_vmraid10-root  2.0G  591M  1.3G  32% /
tmpfs                         376M     0  376M   0% /dev/shm
/dev/md0                      194M   33M  151M  18% /boot
/dev/mapper/vg_vmraid10-part 1008M   34M  924M   4% /mnt/part
In this example the disks appear as /dev/vdX because the walkthrough was done on a virtual machine with para-virtualized VirtIO disk drivers.
First, remove /dev/vdb2 from /dev/md1:
# mdadm /dev/md1 -f /dev/vdb2
mdadm: set /dev/vdb2 faulty in /dev/md1
# mdadm /dev/md1 -r /dev/vdb2
mdadm: hot removed /dev/vdb2 from /dev/md1
Next we need to wipe the partition's superblock and zero out a small part of it. If this is not done, the data on /dev/md1 and on the partition destined for RAID10 will, as I understand it, remain consistent; because of this there will be problems creating the RAID10 array (mdadm will complain that /dev/vdb2 is already used in /dev/md1, even though we removed it earlier), and after a reboot the system will try to boot not from /dev/md1 but from /dev/md2 and end up with a kernel panic.
# dd if=/dev/zero of=/dev/vdb2 bs=512 count=1
# dd if=/dev/zero of=/dev/vdb2 bs=1M count=100
I know that the second command alone would be enough, but this is how I started out originally, so I did not dare to experiment on the real machine.
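To make sure the old metadata is really gone before the partition is reused, mdadm can be asked to examine it; this check is my own addition for illustration:
# mdadm --examine /dev/vdb2
mdadm: No md superblock detected on /dev/vdb2.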
Next we need to copy the partition table from /dev/vdb to /dev/vdc and /dev/vdd. We'll use the sfdisk utility. sfdisk will complain and refuse to do anything because the partition does not start on a cylinder boundary, so add the -f switch.
# sfdisk -d /dev/vdb | sfdisk -f /dev/vdc
# sfdisk -d /dev/vdb | sfdisk -f /dev/vdd
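If you want to be sure the layout copied cleanly, you can dump the new tables and compare the partition sizes and types with those of /dev/vdb by eye; this is a purely optional check, not part of the original procedure:
# sfdisk -d /dev/vdc
# sfdisk -d /dev/vdd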
The partitions are ready; it's time to create the new RAID10 array in degraded mode. I also specify my own UUID just to be safe, since there were problems with it during my experiments.
# mdadm --create /dev/md2 --uuid=3846bee5:d9317441:f8fb6391:4c024445 --level=10 --raid-devices=4 --chunk=2048 missing /dev/vd[bcd]2
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md2 started.
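At this point it does not hurt to glance at the new array; the output below is abridged and only illustrative of what to expect:
# mdadm --detail /dev/md2 | grep -E 'Raid Level|State|UUID'
     Raid Level : raid10
          State : clean, degraded
           UUID : 3846bee5:d9317441:f8fb6391:4c024445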
Add a line for the new array to /etc/mdadm.conf:
# cat /etc/mdadm.conf
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7872830a:c480f8c4:ac316f53:c6ea2b52
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=3846bee5:d9317441:f8fb6391:c4024454
ARRAY /dev/md2 level=raid10 num-devices=4 UUID=3846bee5:d9317441:f8fb6391:4c024445
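Instead of typing the UUID by hand, the ARRAY line can also be generated by mdadm itself and appended to the file; note that the generated line may look slightly different (it uses metadata= rather than level=):
# mdadm --detail --brief /dev/md2 >> /etc/mdadm.conf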
Reboot; during boot we can see that the /dev/md2 array comes up in degraded mode:
md/raid10: md2: active with 3 out of 4 devices
We create a physical volume on the newly created array, again specifying my own UUID to avoid a duplicate UUID.
# pvcreate --uuid I0OAVm-27U4-KFWZ-4lMB-F3r9-X2kx-LnWADB --norestorefile /dev/md2
Writing physical volume data to disk "/dev/md2"
Physical volume "/dev/md2" successfully created
Now we extend the volume group vg_vmraid10:
# vgextend vg_vmraid10 /dev/md2
Volume group "vg_vmraid10" successfully extended
In LVM, data on a physical volume is stored in chunks called physical extents (PEs); these PEs can be moved between physical volumes, which is exactly what we need to do now.
# pvmove /dev/md1 /dev/md2
...
/dev/md1: Moved: 100.0%
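After the move it is easy to confirm that all extents have left the old array: pvs should show /dev/md1 completely free. The numbers below are only indicative for this example, not exact output:
# pvs
  PV         VG          Fmt  Attr PSize  PFree
  /dev/md1   vg_vmraid10 lvm2 a--   5.66g  5.66g
  /dev/md2   vg_vmraid10 lvm2 a--  11.32g  8.26g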
Now we need to open /boot/grub/menu.lst and change the rd_MD_UUID parameter on the kernel command line to the UUID of the /dev/md2 array, in my case 3846bee5:d9317441:f8fb6391:4c024445. If we don't, the system will try to find the root partition on /dev/md1, which is already empty. It is also worth adding the panic=10 parameter to the same line; it is handy when working remotely, since on a kernel panic the machine will automatically reboot after 10 seconds. Then we need to restart the server.
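For reference, after the edit the kernel line in menu.lst looks roughly like this; the line is shortened and illustrative, and the kernel version and remaining parameters will differ on a real system:
kernel /vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/mapper/vg_vmraid10-root rd_MD_UUID=3846bee5:d9317441:f8fb6391:4c024445 rd_LVM_LV=vg_vmraid10/root panic=10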
# reboot
And here a surprise awaits us; so far I have not figured out why this happens. After the reboot the /dev/md2 array is renamed to /dev/md127, but the system boots normally by UUID. As far as I know, this is related to newer versions of mdadm (3.0 and higher), where you can create partitionable arrays and give md arrays names. If anyone knows why this happens, please write in the comments. In the meantime, I just have to fix the array number in /etc/mdadm.conf and delete the line responsible for /dev/md1.
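After that edit the ARRAY lines in /etc/mdadm.conf look like this (shown for illustration, following the example above):
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7872830a:c480f8c4:ac316f53:c6ea2b52
ARRAY /dev/md127 level=raid10 num-devices=4 UUID=3846bee5:d9317441:f8fb6391:4c024445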
Remove /dev/md1 from the volume group and from the list of physical volumes. Then stop /dev/md1, wipe the superblock on /dev/vda2 and add it to /dev/md127.
# vgreduce vg_vmraid10 /dev/md1
Removed "/dev/md1" from volume group "vg_vmraid10"
# pvremove /dev/md1
Labels on physical volume "/dev/md1" successfully wiped
# mdadm -S /dev/md1
mdadm: stopped /dev/md1
# dd if=/dev/zero of=/dev/vda2 bs=512 count=1
# mdadm /dev/md127 -a /dev/vda2
mdadm: added /dev/vda2
# cat /proc/mdstat
Personalities : [raid10] [raid1]
md0 : active raid1 vda1[0] vdb1[1]
      204788 blocks super 1.0 [2/2] [UU]
md127 : active raid10 vda2[4] vdb2[1] vdc2[2] vdd2[3]
      11870208 blocks super 1.2 2048K chunks 2 near-copies [4/3] [_UUU]
      [>....................]  recovery =  1.1% (68352/5935104) finish=2.8min speed=34176K/sec
That's it, the system has been migrated to RAID10.
This should also work on CentOS 5, except that there you will not have to change the UUID in menu.lst, since that parameter does not exist there. On Debian Squeeze the only difference will be in GRUB, because it uses GRUB 2.
In conclusion, I decided for myself that it is always better to use LVM, since it makes working with disks far simpler: I can grow partitions, move data from one physical disk to another transparently, take snapshots, and so on.
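As a small illustration of the last point, a snapshot of the root volume from this example could be taken with a single command; the size and snapshot name here are hypothetical:
# lvcreate -s -L 512M -n root_snap /dev/vg_vmraid10/root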
This is my first article; perhaps it turned out too long, but I wanted to describe everything in detail so that it would be clear.
Thank you for reading to the end.