
Remote replacement of the root filesystem in GNU/Linux

Sometimes I have to replace a root file system. With a boot disk and physical access to the server, this is not difficult. Here, however, I want to share my experience of replacing the root file system remotely, over ssh, without rebooting.

The reasons for replacing the root FS vary:


As an example, I will move the root file system from /dev/sda2 to LVM.
I will describe the procedure using Gentoo GNU/Linux as the example.
I have also tested this method on Debian GNU/Linux.

I should say up front: although I have never lost any data during this operation, that is no reason to skip the backup!

Given


There is a remote server located in the data center.
In my case, this is a host system running KVM virtual machines.

The hard disk /dev/sda, which holds the root file system, is divided into three msdos partitions:

Tasks


  1. Move the root file system from the /dev/sda2 partition to LVM, into the “root” logical volume of the “sys” volume group (/dev/mapper/sys-root)
  2. Increase the size of the root FS from 2 GB to 3 GB

Before you start


- I will need the lsof utility. Install it before starting work.
- Keep in mind that along the way every process on the server will have to be restarted.
- Important! This article is written for those who are already familiar with LVM and understand that at boot the kernel cannot mount / from LVM without the help of an initramfs!

Solution


The solution is described step by step.
The plan is:
1. Create a new logical volume in LVM
2. Create directories for mounting
3. Prepare to remount the old root FS read-only
4. Remount the root FS read-only
5. Backup (optional)
6. Copy the root file system from the sda2 device to the LVM root volume
7. Resize the FS
8. Mount the copy
9. Swap the root FS
10. Move the mount points of all other FS back
11. Restart applications on the new root file system
12. Operations after the root FS swap

1. Create a new logical volume in LVM
The volume will be 3 GB in size and named root.
lvcreate -L 3G -n root sys
  Logical volume "root" created 

2. Create directories for mounting
mkdir /mnt/oldroot /mnt/newroot

3. Prepare to remount the old root FS read-only
This is necessary so that the old root file system can be copied to its new location in a consistent state.
However, processes are most likely still writing to the FS.

3.1. Check for deleted files
lsof / | grep 'DEL\|delete'
  dmeventd 3397 root txt REG 8,2 31816 71851 /sbin/dmeventd (deleted)
  dmeventd 3397 root DEL REG 8,2 87761 /lib64/libdevmapper.so.1.02|paludis-midmerge
  dmeventd 3397 root DEL REG 8,2 87769 /lib64/libdevmapper-event.so.1.02|paludis-midmerge
  sshd 5601 root txt REG 8,2 482592 14452 /usr/sbin/sshd (deleted)

If such files are found, the processes holding them must be restarted or stopped.
I do have such files. They appeared because the net-misc/openssh and sys-fs/lvm2 packages were recently updated.
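
To see at a glance which processes have to be restarted, the lsof output can be reduced to a list of command names and PIDs. This is only a sketch: it assumes the default lsof column layout (COMMAND first, PID second), and `deleted_holders` is a hypothetical helper name, not part of lsof.

```shell
# deleted_holders: read `lsof /` output on stdin and print one
# "COMMAND PID" line per process that still holds a deleted file.
# Assumes the default lsof layout: COMMAND in column 1, PID in column 2.
deleted_holders() {
    grep -E 'DEL|delete' | awk '{ print $1, $2 }' | sort -u
}

# Intended use on the server:
#   lsof / | deleted_holders
```

Each line of the result names one process to restart or stop in the next step.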

3.2. Restart and/or stop the processes holding deleted files
In my case, I restart sshd and stop dmeventd.
/etc/init.d/sshd restart
  * Stopping sshd ... [ok]
  * Starting sshd ... [ok] 

/etc/init.d/dmeventd stop
  * WARNING: you are stopping a boot service
  * Stopping dmeventd ... [ok] 

3.3. Make sure no deleted files remain
lsof / | grep 'DEL\|delete'
Confirmed.

3.4. Check for files open for writing

lsof / | grep -v '\(mem\|txt\|rtd\|cwd\)'
  COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  cron 29035 root 3u REG 8,2 6 11427 /var/run/cron.pid
  snmpd 29530 root 3w REG 8,2 1035580 84179 /var/log/net-snmpd.log
  snmpd 29530 root 8r REG 8,2 1316 28549 /etc/mtab
  rsyslogd 29678 root 1w REG 8,2 642199 11587 /var/log/messages
  rsyslogd 29678 root 2w REG 8,2 62061 12377 /var/log/cron
  rsyslogd 29678 root 4w REG 8,2 2155 12375 /var/log/secure
  rsyslogd 29678 root 5w REG 8,2 259 12376 /var/log/maillog

Look for files whose access mode (the FD column) contains one of the letters u, U, w or W.
In my case, I see no problem with stopping all of these services for the duration of the move.
In your case, decide for yourself.
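
Picking the writers out of a long listing by eye is error-prone, so the FD column can be filtered mechanically. A minimal sketch, assuming the default lsof layout with FD as the fourth column; `open_for_write` is a hypothetical helper name. It matches only the access mode that directly follows the descriptor number (w for write, u for read and write), so any trailing lock character is ignored.

```shell
# open_for_write: read `lsof /` output on stdin and print only the rows
# whose FD column is a descriptor number followed by w (write) or
# u (read and write). Assumes FD is the 4th column, as in default output.
open_for_write() {
    awk '$4 ~ /^[0-9]+[uw]/'
}

# Intended use on the server:
#   lsof / | open_for_write
```

In the listing above this keeps the cron, rsyslogd and the first snmpd row, and drops `/etc/mtab`, which snmpd has open read-only (8r).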

3.5. Stop the processes that hold files open for writing
/etc/init.d/rsyslog stop
  * Stopping rsyslogd ... [ok] 

/etc/init.d/snmpd stop
  * Stopping snmpd ... [ok] 

/etc/init.d/vixie-cron stop
  * Stopping vixie-cron ... [ok] 

3.6. Make sure no files remain open for writing
lsof / | grep -v '\(mem\|txt\|rtd\|cwd\)'
  COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

3.7. Unmount all loop devices
In my case, I unmount the squashfs file system at /usr/portage
umount /usr/portage

3.8. Unmount file systems of type nfs, cifs, fuse and aufs
In my case there are none.

3.9. Check the file (unix) sockets
netstat --unix -a | grep '/\|Path$'
  Proto RefCnt Flags Type State I-Node Path
 unix 2 [ACC] STREAM LISTENING 3682265 /var/run/kvm/kvm204.sock
 unix 2 [ACC] STREAM LISTENING 3279538 /var/run/kvm/kvm209.sock
 unix 2 [ACC] STREAM LISTENING 3389038 /var/run/kvm/kvm207.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3603323 /var/run/kvm/kvm208.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3279539 /var/run/kvm/kvm209.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3607000 /var/run/kvm/kvm210.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3612458 /var/run/kvm/kvm211.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3682266 /var/run/kvm/kvm204.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3612457 /var/run/kvm/kvm211.sock
 unix 2 [ACC] STREAM LISTENING 3279518 /var/run/kvm/kvm205.sock
 unix 2 [ACC] STREAM LISTENING 3603322 /var/run/kvm/kvm208.sock
 unix 2 [ACC] SEQPACKET LISTENING 3605138 @/org/kernel/udev/udevd
 unix 2 [ACC] STREAM LISTENING 3608717 /var/run/kvm/kvm206.sock
 unix 2 [ACC] STREAM LISTENING 3280746 /var/run/kvm/kvm205.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3608718 /var/run/kvm/kvm206.monitor.sock
 unix 2 [ACC] STREAM LISTENING 3606999 /var/run/kvm/kvm210.sock
 unix 2 [ACC] STREAM LISTENING 3389037 /var/run/kvm/kvm207.sock 

After the root file system is swapped, the socket files at these paths will no longer be connected to their applications.
There are two ways to deal with this:
- stop these applications beforehand (recommended)
- restart these applications after the root file system has been replaced
I am not worried about losing contact with the applications through these sockets.
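
To keep a record of which socket paths were in use before the swap, the netstat output can be reduced to bare paths. A small sketch, assuming the path is the last field of each line; `fs_sockets` is a hypothetical helper name. Abstract sockets, which netstat shows with a leading @, are not files and are skipped.

```shell
# fs_sockets: read `netstat --unix -a` output on stdin and print the
# paths of sockets bound to a file system path, one per line.
# Lines whose last field does not start with "/" are skipped.
fs_sockets() {
    awk '$NF ~ /^\// { print $NF }' | sort -u
}

# Intended use on the server:
#   netstat --unix -a | fs_sockets > /tmp/sockets-before-swap
```

The saved list shows afterwards which applications still need a restart.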

4. Remount the root FS read-only
mount -n -o remount,ro /
If everything succeeds, the command finishes silently.
If the line “mount: / is busy” appears, the root file system is still in use. Go back to step 3 and check again. Perhaps you missed something.

I cannot rule out that I, too, have failed to foresee something. But if this step does not succeed, you cannot go any further. At this point you have not changed anything yet, so simply start the stopped processes again.

If everything went well, move on.
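
Since a successful remount prints nothing, it may be reassuring to confirm the result from the mount table before copying. A minimal sketch, assuming the usual /proc/mounts layout (mount point in the second field, options in the fourth); `root_is_ro` is a hypothetical helper name. When / has several entries, the last one is taken as the current state.

```shell
# root_is_ro [MOUNTS_FILE]: succeed if the last entry for "/" in the
# mount table has "ro" among its options. Defaults to /proc/mounts.
root_is_ro() {
    awk '$2 == "/" { opts = $4 }
         END { exit ((("," opts ",") ~ /,ro,/) ? 0 : 1) }' "${1:-/proc/mounts}"
}

# Intended use on the server:
#   root_is_ro && echo "root is read-only, safe to copy"
```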

5. Backup (optional)
Now is the time for it.
Personally, I see no need for one, because after the whole operation the old sda2 partition remains as a backup.
In addition, I have daily backups configured for all partitions of the host system and all virtual machines.

6. Copy the root file system from the sda2 device to the LVM root volume
dd if=/dev/sda2 of=/dev/sys/root bs=8M
  239+1 records in
  239+1 records out
  2006843392 bytes (2.0 GB) copied, 58.1021 s, 34.5 MB/s
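
Because the source is mounted read-only, the copy can be compared against it byte for byte before anything modifies the new volume. The sketch below is an optional extra check, not part of the original procedure. It assumes GNU cmp, whose -n option limits the comparison to a given number of bytes; that limit is needed here because the logical volume is larger than the partition. `copy_matches` is a hypothetical helper name.

```shell
# copy_matches SRC DST BYTES: succeed if the first BYTES bytes of SRC
# and DST are identical. BYTES is the byte count reported by dd.
# Relies on the -n (byte limit) option of GNU cmp.
copy_matches() {
    cmp -s -n "$3" "$1" "$2"
}

# Intended use on the server, before fsck or resize2fs touch the copy:
#   copy_matches /dev/sda2 /dev/sys/root 2006843392 && echo "copy is identical"
```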

7. Resize the FS

7.1. First check the FS for errors
fsck -fC /dev/sys/root
  fsck from util-linux 2.20.1
  e2fsck 1.42 (29-Nov-2011)
  Pass 1: Checking inodes, blocks, and sizes
  Pass 2: Checking directory structure
  Pass 3: Checking directory connectivity
  Pass 4: Checking reference counts
  Pass 5: Checking group summary information
  /dev/mapper/sys-root: 46848/122640 files (2.0% non-contiguous), 216129/489952 blocks

7.2. Resize the FS
In our case, we grow the FS to the size of the LVM volume.
resize2fs -p /dev/sys/root
  resize2fs 1.42 (29-Nov-2011)
  Resizing the filesystem on /dev/sys/root to 786432 (4k) blocks.
  Begin pass 1 (max = 9)
  Extending the inode table XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  The filesystem on /dev/sys/root is now 786432 blocks long.
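
The block counts in the fsck and resize2fs output can be converted to bytes to confirm that the FS really grew from 2 GB to 3 GB. A small worked check, assuming the 4 KiB block size that resize2fs reports as "(4k)"; `blocks_to_bytes` is a hypothetical helper name.

```shell
# blocks_to_bytes BLOCKS [BLOCK_SIZE]: convert a file system block count
# to bytes. BLOCK_SIZE defaults to 4096, the "(4k)" reported by resize2fs.
blocks_to_bytes() {
    echo $(( $1 * ${2:-4096} ))
}

blocks_to_bytes 489952   # before the resize: prints 2006843392
blocks_to_bytes 786432   # after the resize:  prints 3221225472
```

The first value equals the byte count that dd reported for the copy, and the second is exactly 3 × 1024³ bytes, the size of the logical volume.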

8. Mount the copy
mount -n /dev/sys/root /mnt/newroot

9. Swap the root FS
This is what all the preparation was for.
From this moment on, the dangerous part begins.
Important: after the root FS is swapped, if the SSH session is terminated, the system will not be able to accept a new connection!
Swap the FS:
cd /mnt/newroot
pivot_root . mnt/oldroot

It is important to run these two commands exactly as written, to avoid being blocked by the current working directory (cwd).
After this command, the /dev/sys/root volume takes the place of /, and the sda2 partition changes its mount point to /mnt/oldroot. All other mounted file systems change their mount points as well. For example, the /dev file system moves to /mnt/oldroot/dev.

10. Move the mount points of all other FS back (except for the old root FS)
Move the standard file systems that most systems have:
mount -n --move /mnt/oldroot/proc /proc
mount -n --move /mnt/oldroot/dev /dev
mount -n --move /mnt/oldroot/sys /sys

Now you can look in /proc/mounts to see what else needs to be moved back:
cat /proc/mounts | grep oldroot
  rc-svcdir /mnt/oldroot/lib64/rc/init.d tmpfs rw,nosuid,nodev,noexec,relatime,size=1024k,mode=755 0 0
  /dev/root /mnt/oldroot ext3 ro,noatime,errors=continue,barrier=1,data=writeback 0 0
  /dev/mapper/sys-distfiles /mnt/oldroot/var/distfiles ext2 rw,noatime,errors=continue 0 0
  /dev/mapper/sys-vardb /mnt/oldroot/var/db reiserfs rw,noatime 0 0

In my example, I move:
mount -n --move /mnt/oldroot/lib64/rc/init.d /lib64/rc/init.d
mount -n --move /mnt/oldroot/var/distfiles /var/distfiles
mount -n --move /mnt/oldroot/var/db /var/db
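
The list of remaining moves can also be derived from the mount table instead of being typed by hand. The sketch below only prints the commands so they can be reviewed first; it runs nothing. It assumes mount points without spaces (these appear escaped in /proc/mounts), and `print_move_commands` is a hypothetical helper name. Mounts nested under one that is already listed are skipped, because moving a mount takes its submounts along.

```shell
# print_move_commands [MOUNTS_FILE] [OLDROOT]: print one
# "mount -n --move" command for every mount point below OLDROOT.
# OLDROOT itself and submounts of an already listed mount are skipped.
print_move_commands() {
    mounts_file=${1:-/proc/mounts}
    oldroot=${2:-/mnt/oldroot}
    awk -v old="$oldroot" '
        index($2, old "/") == 1 {
            # A parent listed earlier carries this submount with it.
            for (i = 1; i <= n; i++)
                if (index($2, seen[i] "/") == 1) next
            seen[++n] = $2
            print "mount -n --move " $2 " " substr($2, length(old) + 1)
        }
    ' "$mounts_file"
}

# Intended use on the server: review the output, then run the lines by hand.
#   print_move_commands
```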

From now on, you are out of danger. New SSH sessions should open successfully.

11. Restart applications on the new root file system

11.1. Look at the files held by processes
lsof /mnt/oldroot
  ...
  agetty 28783 root mem REG 8,2 52884 99214 /mnt/oldroot/lib64/libnss_nis-2.13.so
  agetty 28783 root mem REG 8,2 109446 98354 /mnt/oldroot/lib64/libnsl-2.13.so
  agetty 28783 root mem REG 8,2 38724 99216 /mnt/oldroot/lib64/libnss_compat-2.13.so
  agetty 28783 root mem REG 8,2 1898114 99218 /mnt/oldroot/lib64/libc-2.13.so
  agetty 28783 root mem REG 8,2 156052 99219 /mnt/oldroot/lib64/ld-2.13.so
  udevd 29118 root txt REG 8,2 130216 49072 /mnt/oldroot/sbin/udevd
  udevd 29118 root mem REG 8,2 62227 99210 /mnt/oldroot/lib64/libnss_files-2.13.so
  udevd 29118 root mem REG 8,2 52884 99214 /mnt/oldroot/lib64/libnss_nis-2.13.so
  udevd 29118 root mem REG 8,2 109446 98354 /mnt/oldroot/lib64/libnsl-2.13.so
  udevd 29118 root mem REG 8,2 38724 99216 /mnt/oldroot/lib64/libnss_compat-2.13.so
  udevd 29118 root mem REG 8,2 135986 99220 /mnt/oldroot/lib64/libpthread-2.13.so
  udevd 29118 root mem REG 8,2 1898114 99218 /mnt/oldroot/lib64/libc-2.13.so
  udevd 29118 root mem REG 8,2 48545 99211 /mnt/oldroot/lib64/librt-2.13.so
  udevd 29118 root mem REG 8,2 156052 99219 /mnt/oldroot/lib64/ld-2.13.so
  sshd 29455 root txt REG 8,2 482592 74222 /mnt/oldroot/usr/sbin/sshd
  sshd 29455 root mem REG 8,2 62227 99210 /mnt/oldroot/lib64/libnss_files-2.13.so
  sshd 29455 root mem REG 8,2 52884 99214 /mnt/oldroot/lib64/libnss_nis-2.13.so
  sshd 29455 root mem REG 8,2 38724 99216 /mnt/oldroot/lib64/libnss_compat-2.13.so
  sshd 29455 root mem REG 8,2 440512 75579 /mnt/oldroot/usr/lib64/libgmp.so.10.0.2
  sshd 29455 root mem REG 8,2 1898114 99218 /mnt/oldroot/lib64/libc-2.13.so
  sshd 29455 root mem REG 8,2 135986 99220 /mnt/oldroot/lib64/libpthread-2.13.so
  sshd 29455 root mem REG 8,2 98598 99206 /mnt/oldroot/lib64/libresolv-2.13.so
  sshd 29455 root mem REG 8,2 40981 99205 /mnt/oldroot/lib64/libcrypt-2.13.so
  sshd 29455 root mem REG 8,2 109446 98354 /mnt/oldroot/lib64/libnsl-2.13.so
  sshd 29455 root mem REG 8,2 92624 98190 /mnt/oldroot/lib64/libz.so.1.2.5
  sshd 29455 root mem REG 8,2 14367 99217 /mnt/oldroot/lib64/libutil-2.13.so
  sshd 29455 root mem REG 8,2 19321 99203 /mnt/oldroot/lib64/libdl-2.13.so
  sshd 29455 root mem REG 8,2 1699456 74274 /mnt/oldroot/usr/lib64/libcrypto.so.1.0.0
  sshd 29455 root mem REG 8,2 373376 76226 /mnt/oldroot/usr/lib64/libssl.so.1.0.0
  sshd 29455 root mem REG 8,2 51760 98532 /mnt/oldroot/lib64/libpam.so.0.83.1
  sshd 29455 root mem REG 8,2 156052 99219 /mnt/oldroot/lib64/ld-2.13.so
  smartd 29484 root txt REG 8,2 374520 75866 /mnt/oldroot/usr/sbin/smartd
  smartd 29484 root mem REG 8,2 1898114 99218 /mnt/oldroot/lib64/libc-2.13.so
  smartd 29484 root mem REG 8,2 5.233499 82722 /mnt/oldroot/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libgcc_s.so.1
  smartd 29484 root mem REG 8,2 614022 99202 /mnt/oldroot/lib64/libm-2.13.so
  ...

We can see that all processes are still running from the old root FS.
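
A listing this long is easier to work through as a summary per command. A minimal sketch, assuming the default lsof layout with the command name in the first column; `old_root_users` is a hypothetical helper name.

```shell
# old_root_users: read `lsof /mnt/oldroot` output on stdin and print each
# command name once, followed by the number of files it still holds on
# the old root. The header line is ignored.
old_root_users() {
    awk '$1 != "COMMAND" { n[$1]++ }
         END { for (c in n) print c, n[c] }' | sort
}

# Intended use on the server:
#   lsof /mnt/oldroot | old_root_users
```

Running it again after each restart shows the list shrinking. Once it prints nothing, the old root FS can be unmounted.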

11.2. Start restarting system processes
I recommend beginning with:
/etc/init.d/udev restart
/etc/init.d/sshd restart

11.3. Open a second ssh session to the server
If the login succeeds, close the first session so that the old bash shell and the fork of the old sshd exit.
Check with lsof /mnt/oldroot
All sshd processes running from the old root file system should be gone.

11.4. Special processes
Restart agetty and init.

With agetty (or other *tty), everything is simple:
killall agetty
Don't be afraid, init will start them again.

init itself is restarted with the command:
telinit u

11.5. Mount the file systems that were unmounted earlier
I mount the squashfs at /usr/portage
mount /usr/portage

11.6. Start the services stopped earlier
In my case, I start:
/etc/init.d/rsyslog start
  * Starting rsyslog ... [ok] 

/etc/init.d/snmpd start
  * Starting snmpd ... [ok] 

/etc/init.d/vixie-cron start
  * Starting vixie-cron ... [ok] 

11.7. Continue restarting services

Look at lsof /mnt/oldroot and restart whatever is left:

/etc/init.d/ntpd restart
/etc/init.d/radvd restart
/etc/init.d/smartd restart
/etc/init.d/dnsmasq restart

This includes restarting the virtual machines, which have been quietly running all this time.
At this point there is no particular hurry.
The only reason to restart these services is to be able to unmount the old root file system.
/etc/init.d/kvm.204 restart
/etc/init.d/kvm.205 restart
/etc/init.d/kvm.206 restart

12. Operations after the root FS swap

12.1. Do not forget to update fstab
I use LABEL= tags, so I do not change anything
  LABEL=root / ext3 noatime 0 1
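
Whether fstab needs editing depends on how the root entry names its device, and that can be read straight from the file. A small sketch, assuming the standard fstab layout (device in the first field, mount point in the second); `fstab_root_spec` is a hypothetical helper name.

```shell
# fstab_root_spec [FSTAB]: print the device field (first column) of the
# entry mounted on "/", skipping comment lines. Defaults to /etc/fstab.
fstab_root_spec() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "${1:-/etc/fstab}"
}

# Intended use on the server:
#   fstab_root_spec
```

If this prints a device path such as /dev/sda2, the entry has to be changed to point at the new volume. If it prints a LABEL= or UUID= tag, keep in mind that the dd copy carries the same label and UUID as the old partition until the old one is relabeled or wiped.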

12.2. Unmount the old root FS
umount /mnt/oldroot
rmdir /mnt/oldroot /mnt/newroot
Nothing uses the old root FS any more.
For those who want to keep the old root file system, I recommend changing its LABEL and UUID so that it does not confuse the bootloader.
tune2fs -L oldroot -U $(uuidgen) /dev/sda2
As for me, I no longer need the old FS, so I wipe its signature.
wipefs /dev/sda2 -o 0x438
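
The offset 0x438 is where ext2/ext3/ext4 keep their magic number: the superblock starts at byte 1024 and the 16-bit magic 0xEF53 sits at offset 0x38 inside it, so 1024 + 56 = 1080 = 0x438. The sketch below reads those two bytes so the signature can be inspected before and after wiping. `ext_magic` is a hypothetical helper name.

```shell
# ext_magic FILE: print, as hex, the two bytes stored at offset 0x438
# (decimal 1080) of FILE, which can be a partition or an image file.
ext_magic() {
    dd if="$1" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}

# Intended use on the server:
#   ext_magic /dev/sda2
```

On an ext file system this prints 53ef, the magic number in little-endian byte order. After the signature has been erased the same two bytes read 0000.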

12.3. Do not forget to add or update the initramfs when switching to LVM

12.4. Do not forget to reconfigure the bootloader
In my case, this is grub2.
Install the bootloader on sda:
grub2-install --no-floppy /dev/sda
  Installation finished.  No error reported. 

Update the configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg
  Generating grub.cfg ...
 Found linux image: /boot/vmlinuz-3.2.12-gentoo-64-beaver-b
 Found initrd image: /boot/initrd-3.2.12-gentoo-64-beaver-b
 done 

Remarks


After the move to the new root FS, the server can safely keep running without a reboot.

Source: https://habr.com/ru/post/141320/

