
Backing up virtual machines in a QEMU / KVM hypervisor environment


As everyone knows, backups must be made, and, more importantly, they must be made in such a way that you can actually restore from them later. This is especially true for virtual machines (VMs). Let's look at how to back up a VM's virtual disks in a QEMU/KVM environment. There are two main problems here. First, the backup must be consistent (complete): if we have a DBMS or other software that actively uses its own write cache, it must be asked to flush that cache and suspend writes to disk before the backup, otherwise the data will end up in the snapshot in an inconsistent state, and on recovery the DBMS may not survive such a trick. Second, there is the question of VM performance in snapshot mode: it would be nice if the VM did not slow down too much while we make the copy, and did not hang when we remove the snapshot.

Let me answer the first question right away: to get a consistent backup, shut the VM down via the guest OS before taking the backup; the backup is then guaranteed to be complete. If that suits you, you can stop reading here. If not, read on.

So, to get a consistent backup without shutting down the VM, you need the guest agent for QEMU (not to be confused with the SPICE guest agent for QEMU, or with the paravirtual drivers for QEMU in general). In the Debian Jessie repositories this is the qemu-guest-agent package; in Wheezy the package is available only through wheezy-backports. The QEMU guest agent is a small utility that accepts commands from the host over a virtio channel named org.qemu.guest_agent.0 and executes them in the guest context. On the hypervisor side the channel ends in a unix socket, into which you can write text commands with the socat utility. Libvirt, however, likes to occupy this channel itself, so if you manage the hypervisor through Libvirt you will have to talk to the guest via the "virsh qemu-agent-command" command. The QEMU guest agent supports quite a few different commands.
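
For illustration, a hedged sketch of both ways of talking to the agent; the socket path and VM name are placeholders, not values from the original setup. The guest-info command, among other things, returns the list of commands the agent supports:

```bash
# Directly through the unix socket on the hypervisor (path is an example;
# this only works while Libvirt is not holding the channel itself):
echo '{"execute": "guest-ping"}' | \
    socat - UNIX-CONNECT:/var/lib/libvirt/qemu/myvm.agent.sock

# Through Libvirt: guest-info lists the commands this agent supports.
virsh qemu-agent-command myvm '{"execute": "guest-info"}'
```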

A brief description of the commands can be found in the qga/qapi-schema.json file in the QEMU source code, and a full picture can be obtained by analyzing the files qga/commands-posix.c and qga/commands-win32.c. From that analysis you can, for example, find out that the guest-set-vcpus, guest-get-vcpus, guest-network-get-interfaces, guest-suspend-hybrid, guest-suspend-ram and guest-suspend-disk commands are not supported on Windows, and that the guest-fsfreeze-freeze / guest-fsfreeze-thaw commands try to use the Windows Volume Shadow Copy Service (VSS). However, since this article focuses on Linux guests, these subtleties will not concern us.

Of the whole list of commands we are interested in guest-fsfreeze-freeze and guest-fsfreeze-thaw. As the names suggest, the first "freezes" the guest's file systems and the second "thaws" them. fsfreeze (more precisely, the underlying ioctl) is not a QEMU feature but a capability of the Linux kernel's virtual file system layer, which has been around for quite some time. That is, you can freeze file systems not only in a virtual environment but also on real hardware; it is enough to use the fsfreeze utility from the util-linux package. The fsfreeze man page lists Ext3/4, ReiserFS, JFS and XFS as supported, but fsfreeze also "froze" Btrfs for me. Before the actual "freeze", but after all write streams have completed, the kernel calls sync() (file fs/super.c, line 1329), so you should not worry about data integrity. In general, file system "freezing" exists so that the kernel can take consistent snapshots of LVM volumes, not for dubious fun with virtualization systems.
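
As a quick illustration on real hardware, a minimal sketch with an example mount point:

```bash
# Freeze: blocks new writes and flushes everything already in flight.
fsfreeze --freeze /mnt/data

# ... take the LVM snapshot / copy the data here ...

# Thaw: writes resume.
fsfreeze --unfreeze /mnt/data
```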

So, we know that to get a consistent snapshot we need to invoke guest-fsfreeze-freeze in the guest through the QEMU guest agent. But perhaps we are worrying for nothing and this function is called automatically when a snapshot is created? Alas, neither Libvirt (2.9), nor Proxmox (pvetest branch), nor OpenStack does this, and to automate the call to guest-fsfreeze-freeze you would have to patch the source code of the respective products, which is beyond the scope of this article.
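
Such a self-written script could trigger the freeze and thaw by hand through Libvirt; a sketch, with an example VM name:

```bash
virsh qemu-agent-command myvm '{"execute": "guest-fsfreeze-freeze"}'
# ... take the snapshot here ...
virsh qemu-agent-command myvm '{"execute": "guest-fsfreeze-thaw"}'
```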

Libvirt can freeze the guest FS after all
As the respected Infod points out, Libvirt's virsh shell accepts the --quiesce parameter when creating a snapshot, which will call guest-fsfreeze-freeze as part of the snapshot operation:
virsh snapshot-create-as myvm snapshot1 "snapshot1 description" --disk-only --atomic --quiesce 

Suppose we have found a way (for example, a self-written script) to "freeze" the guest file system before taking a snapshot. Now we have the next task: to notify the guest software just before the freeze. The QEMU guest agent supports the -F parameter, which tells it to call the /etc/qemu/fsfreeze-hook script before the "freeze" and after the "thaw", with the arguments freeze and thaw respectively. So in Debian the agent startup script (/etc/init.d/qemu-guest-agent) has to be adjusted: DAEMON_ARGS="-F". Keep in mind that if the hook script exits with an error, the file system freeze will not happen.

For a MySQL server, the first hook script that comes to mind, simple but (as we will see) non-working, might look something like this:

 #!/bin/bash
 USER="<>"
 PASSWORD="<>"
 case "$1" in
     freeze )
         /usr/bin/mysql -u $USER -p$PASSWORD -e "FLUSH TABLES WITH READ LOCK;"
         exit $?
         ;;
     thaw )
         /usr/bin/mysql -u $USER -p$PASSWORD -e "UNLOCK TABLES;"
         exit $?
         ;;
     * )
         logger "fsfreeze hook called with unknown parameter: $1"
         exit 1
         ;;
 esac
 exit 1


In fact, the lock will be released as soon as the command
 mysql -u $USER -p$PASSWORD -e "FLUSH TABLES WITH READ LOCK" 
completes, because all locks in MySQL live only as long as the session of the user who took them. For a correct backup you will have to write an additional small service (for example, in Python) that opens a connection to MySQL and takes the lock on the freeze command, and then keeps the connection open and waits for the thaw command.
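
The article suggests Python; here instead is a minimal, untested bash sketch of the same idea for /etc/qemu/fsfreeze-hook. A single long-lived mysql session reads its statements from a FIFO, so the lock survives between the freeze and thaw invocations. The credentials and FIFO path are placeholders:

```bash
#!/bin/bash
# Hypothetical lock-holder sketch: keep one mysql session alive between
# the "freeze" and "thaw" hook invocations so the READ LOCK persists.
USER="backup"                 # placeholder credentials
PASSWORD="secret"
FIFO=/run/mysql-freeze.fifo

case "$1" in
    freeze )
        mkfifo "$FIFO" || exit 1
        # Open the FIFO read-write on fd 3 so mysql never sees EOF while
        # we are frozen; the session (and the lock) lives until "thaw".
        ( exec 3<>"$FIFO"; exec /usr/bin/mysql -u "$USER" -p"$PASSWORD" <&3 ) &
        echo "FLUSH TABLES WITH READ LOCK;" > "$FIFO"
        # NB: a real implementation should also wait for confirmation that
        # the lock was actually taken before letting the freeze proceed.
        ;;
    thaw )
        # Release the lock and let the background mysql session exit.
        printf 'UNLOCK TABLES;\nquit\n' > "$FIFO"
        rm -f "$FIFO"
        ;;
esac
```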

And what about Windows as a guest?
It must be said that for Windows and MS SQL this procedure requires no extra effort at all: the QEMU guest agent automatically calls the corresponding function of the Volume Shadow Copy Service (VSS), and VSS notifies all subscribers that a backup is about to start and that they should flush their data to disk, and so on.


So, we have locked the MySQL tables and "frozen" the guest file system; it is time to make the backup. Suppose we store VM disk images as qcow2 files, rather than, say, as LVM volumes. Even in this case we have plenty of options, and it is worth understanding them.

| | Internal QEMU snapshot | External QEMU snapshot | QEMU backup | Snapshot of LVM volumes with qcow2 files | Snapshot of a Btrfs FS with qcow2 files |
|---|---|---|---|---|---|
| Implemented by | QEMU | QEMU | QEMU | OS | OS |
| QEMU command | savevm / snapshot_blkdev_internal | snapshot_blkdev | drive_backup | | |
| Libvirt (virsh) command | snapshot-create / snapshot-create-as | snapshot-create / snapshot-create-as | | | |
| OS command | | | | lvcreate | btrfs subvolume snapshot |
| Form | Records inside the disk image | Separate file (disk image) | Separate file (disk image) | Block device | FS (directory) with disk images |
| Scope | Single VM | Single VM | Single VM | Entire storage | Entire storage |
| Technique | Writes redirected to another area of the same file | Writes redirected to another file | Full copy of the machine's disks to another file | Original data copied to the snapshot device as it changes | Writes redirected to another area of the file system |
| Copying the snapshot to backup storage | qemu-nbd / nbd-client | Copy the file | Copy the file | Mount the snapshot, copy the file | Copy the file |
| VM disk write performance while a snapshot exists | Average (each write needs two sync() calls; the qcow2 lazy refcounts option improves the situation) | High | High | About 2x lower than usual | High |
| Storage load when deleting (committing) a snapshot | Below average (metadata must be rewritten) | High (data is copied into the original image and metadata rewritten) | Low (just delete the file) | Low (just remove the block device) | Below average (metadata must be rewritten) |
| Storage load when rolling back to a snapshot | Below average (metadata must be rewritten) | Low (just delete the file) | Low for Libvirt (replace the file), high for Proxmox (unpack the file from the archive) | High (data must be copied back to the original block device) | Below average (metadata must be rewritten) |

Each method has its pros and cons. The "internal" method is, in effect, the standard in the Virt-Manager utility and the Proxmox environment, and taking snapshots of this format is automated there. However, to "pull" the snapshot out of the file you need to bring up an NBD server based on qemu-nbd and attach the image file to it. With the "external" method, a file ready for copying is produced by the very process of creating the snapshot, but deleting the snapshot is not simple: it involves committing the data written to the snapshot file back into the base image, which is accompanied by a multi-fold increase in write load during removal. For example, VMware ESXi in the same situation loses up to 5x in write performance. It must be said that there is another way to remove an "external" snapshot: copying all blocks from the original image into the snapshot. This method is called block-stream; I will not judge whether it makes sense in production, but it would obviously make a good benchmark for storage.
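
By way of illustration, a hedged sketch of pulling an internal snapshot out of a qcow2 file over NBD; the device, snapshot and path names are examples, and the qemu-nbd options should be checked against your QEMU version:

```bash
modprobe nbd
# Export the internal snapshot "snapshot1" read-only as a block device.
qemu-nbd --connect=/dev/nbd0 --read-only --load-snapshot=snapshot1 /images/disk.qcow2
# Copy the snapshot contents to backup storage.
dd if=/dev/nbd0 of=/backup/disk-snapshot1.raw bs=1M
qemu-nbd --disconnect /dev/nbd0
```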

An LVM volume snapshot causes a drop in write performance on the origin volume, so it is best used when we are sure the disks will not be written to intensively while the snapshot exists.
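
A minimal sketch, assuming the images live on an LVM volume /dev/vg0/vmstore and the volume group has free space for the snapshot:

```bash
# Take the snapshot (its size limits how many changes it can absorb).
lvcreate --snapshot --size 10G --name vmstore-snap /dev/vg0/vmstore
# Mount it read-only and copy the images off.
mount -o ro /dev/vg0/vmstore-snap /mnt/snap
rsync -a /mnt/snap/ /backup/vmstore/
umount /mnt/snap
lvremove -f /dev/vg0/vmstore-snap
```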

Using Btrfs as the file system for storing disk image files looks promising, since in this case snapshots, compression and deduplication are provided by the architecture of the FS itself. The drawbacks: Btrfs cannot be used as a shared file system in a cluster environment, and besides, Btrfs is a relatively young file system, perhaps less reliable than the combination of LVM and ext4.
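
The Btrfs equivalent, again only a sketch with example paths:

```bash
# Read-only snapshot of the subvolume holding the images.
btrfs subvolume snapshot -r /var/lib/images /var/lib/images-snap
rsync -a /var/lib/images-snap/ /backup/images/
btrfs subvolume delete /var/lib/images-snap
```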

The drive_backup method is good in that the backup can be created directly on mounted remote storage, but in that case it puts a heavy load on the network. With the remaining methods it is possible to transfer only the changed blocks using rsync. Unfortunately, QEMU backup does not support transferring only "dirty" (changed since the last backup) blocks, the way it is implemented, for example, in VMware's CBT mechanism. Both attempts to implement such a mechanism (livebackup and the in-memory dirty bitmap) were not accepted into the QEMU mainline: the first, apparently, because of its architecture (it adds an extra daemon and a separate network protocol used only for this one operation), the second because of an obvious usage restriction: the map of "dirty" blocks can be kept only in RAM.
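
For example, a sketch of shipping only the changed blocks of an image with rsync (host name and paths are examples):

```bash
# --inplace updates the destination file block by block instead of
# rewriting it whole; --partial keeps progress across interruptions.
rsync --inplace --partial /snapshots/disk.qcow2 backup-host:/backup/disk.qcow2
```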

In conclusion, consider a VM with several attached disk images. Obviously, for such a VM we need to snapshot all the disks at the same moment. If you use Libvirt, there is nothing to worry about: the library itself takes care of synchronizing the snapshots. But if you want to perform this operation on "bare" QEMU, there are two ways to do it: stop the VM with the stop command, take the snapshots and resume it with the cont command, or use QEMU's mechanism for transactional execution of commands. The QEMU guest agent alone, with guest-fsfreeze-freeze / guest-fsfreeze-thaw, is not enough for this purpose: although the agent "freezes" all mounted file systems with one command, it does so not simultaneously but sequentially, so the volumes may end up out of sync with each other.
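
A hedged sketch of the transactional variant over a QMP monitor socket; the socket path and drive names are examples, and QMP requires the qmp_capabilities handshake before any other command (responses arrive on the same socket):

```bash
socat - UNIX-CONNECT:/var/run/qemu/myvm.qmp <<'EOF'
{"execute": "qmp_capabilities"}
{"execute": "transaction", "arguments": {"actions": [
  {"type": "blockdev-snapshot-sync", "data":
    {"device": "drive-virtio-disk0", "snapshot-file": "/images/disk0-snap.qcow2", "format": "qcow2"}},
  {"type": "blockdev-snapshot-sync", "data":
    {"device": "drive-virtio-disk1", "snapshot-file": "/images/disk1-snap.qcow2", "format": "qcow2"}}
]}}
EOF
```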

If you find an error in the article, or you have something to add, please say so in the comments.

Make backups, gentlemen!

Source: https://habr.com/ru/post/242213/

