
Modern virtualization capabilities

After recent discussions about which hypervisor is better, the idea arose to write down the capabilities of modern virtualization systems without reference to specific products. This is not a “who is better” comparison; it is an answer to the question “what can be done with virtualization?”, a general overview of the capabilities of production virtualization.

Code execution

Since the hypervisor fully controls its virtual machines, it can manage their operation with precision.

Different virtualization systems offer several code execution methods: hardware-assisted virtualization (HVM), binary rewriting, and paravirtualization (full emulation is left out, as it is not used in production virtualization).

In practice, paravirtualized drivers (often called guest tools) are used with both HVM and binary rewriting, because it is in I/O operations that paravirtualization significantly outperforms all other methods.

Without exception, all hypervisors can suspend (pause) a virtual machine. In this mode the machine's operation is halted, possibly with its memory saved to disk, and execution continues after a resume operation.

A common feature is migration: moving a virtual machine from one computer to another. It comes in offline form (the machine is shut down on one host and started on another) and online form (usually called live migration), with no shutdown. In practice live migration is implemented as a suspend on one host and a resume on another, with the data transfer optimized: first all memory is copied while the machine keeps running, then the machine is paused, the pages changed since the start of the migration are copied, and the machine is started on the new host.
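The transfer scheme above is the classic pre-copy algorithm. A minimal sketch (all names and the dirty-page model are illustrative, not any real hypervisor's API):

```python
import random

def live_migrate(memory, max_rounds=30, dirty_rate=0.1):
    """Simulate pre-copy live migration: copy all pages, then
    repeatedly re-copy pages dirtied during the previous round,
    pausing the guest only for the final (small) transfer."""
    dest = {}
    dirty = set(memory)            # round 1: every page is "dirty"
    for _ in range(max_rounds):
        for page in dirty:         # transfer while the guest keeps running
            dest[page] = memory[page]
        # the running guest dirties some pages during the transfer
        dirty = {p for p in memory if random.random() < dirty_rate}
        if len(dirty) < 3:         # small enough: pause and finish
            break
    # guest is paused here (suspend): copy the last few dirty pages
    for page in dirty:
        dest[page] = memory[page]
    return dest                    # resume on the destination host

mem = {i: f"page-{i}" for i in range(100)}
assert live_migrate(mem) == mem    # destination ends up with identical memory
```

The key point is that the pause only lasts as long as the final small copy, which is why the guest sees no shutdown.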

Xen has also announced (and, it seems, nearly brought to production) Remus, a technology for running one machine on two or more hosts in parallel, which lets a virtual machine keep working through a server failure without interruptions or reboots.

Memory management

The classical virtualization model implies allocating a fixed amount of memory to the guest, which can be changed only while the guest is powered off.

Modern systems can change the amount of RAM available to the guest manually or automatically.
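One widespread way to change a running guest's memory is a balloon driver: the hypervisor asks a driver inside the guest to "inflate" and give pages back, or to "deflate" and reclaim them. A toy sketch (class and method names are illustrative, not a real hypervisor interface):

```python
class Balloon:
    """Toy balloon driver: the hypervisor reclaims guest memory by
    inflating a balloon inside the guest; deflating gives it back."""
    def __init__(self, total_mb):
        self.total_mb = total_mb      # memory configured at boot
        self.balloon_mb = 0           # memory currently given back

    def set_target(self, target_mb):
        """Hypervisor sets a new memory target for the running guest."""
        if not 0 < target_mb <= self.total_mb:
            raise ValueError("target outside the configured maximum")
        self.balloon_mb = self.total_mb - target_mb

    @property
    def usable_mb(self):
        return self.total_mb - self.balloon_mb

vm = Balloon(total_mb=4096)
vm.set_target(2048)          # ask the guest to give back 2 GiB
assert vm.usable_mb == 2048
vm.set_target(3072)          # deflate: return 1 GiB to the guest
assert vm.usable_mb == 3072
```

Note that the guest can never be ballooned above the amount it was booted with, which matches the classical fixed-maximum model described above.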

Several memory management methods exist for this.


Some hypervisors allow a virtual machine to access real hardware (and can hand different hardware to different virtual machines).

They can also emulate hardware, including devices absent from the host. The most important devices, the network adapter and the disk, are covered separately; among the others are video adapters (even with 3D), USB, serial/parallel ports, timers, and watchdogs.

Several technologies are used for this.

Network devices

Network devices are usually implemented at either layer 3 or layer 2. The virtual network interface created has two ends: one in the virtual machine and one in the hypervisor/control domain/virtualization program. Traffic from the guest reaches the host unchanged (no rewriting, rate matching, or other tricks). After that, quite significant difficulties begin.
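The "two ends" idea can be sketched as a pair of queues, much like a tap/veth device pair: whatever the guest writes appears byte-for-byte at the host end (all names here are illustrative):

```python
from collections import deque

class VirtualNic:
    """Toy model of a virtual interface with two ends: frames written
    by the guest appear unchanged at the host end, and vice versa."""
    def __init__(self):
        self.to_host, self.to_guest = deque(), deque()

    def guest_send(self, frame: bytes):
        self.to_host.append(frame)      # no rewriting, no rate matching

    def host_recv(self) -> bytes:
        return self.to_host.popleft()

    def host_send(self, frame: bytes):
        self.to_guest.append(frame)

    def guest_recv(self) -> bytes:
        return self.to_guest.popleft()

nic = VirtualNic()
nic.guest_send(b"\x00\x01\x02\x03")
assert nic.host_recv() == b"\x00\x01\x02\x03"   # byte-for-byte identical
```

Everything interesting (bridging, switching, filtering) happens on the host side of this pipe, which is exactly where the difficulties mentioned above live.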

Currently, apart from systems that emulate a network interface at layer 3 (the IP-address level, e.g. OpenVZ), all systems provide a broadly similar set of features.

In some virtualization systems, bridging a virtual machine's interface to a physical network interface and providing a full virtual switch are treated as separate cases.

In general, networking causes particular headaches during migration. All existing production systems with interface bridging allow transparent live migration only within a single network segment, and even then special tricks (gratuitous “fake” ARP) are needed to tell upstream switches to redirect the machine's traffic to the new port.
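The "fake ARP" trick is a gratuitous ARP announcement: after migration, the destination host broadcasts an ARP reply for the VM's own address so that switches re-learn which port leads to its MAC. A sketch of building such a frame by hand (illustrative only; real hosts send it through a raw socket):

```python
import struct

def gratuitous_arp(mac: bytes, ip: bytes) -> bytes:
    """Build a gratuitous ARP reply frame: sender and target are both
    the migrated VM, and the Ethernet destination is broadcast."""
    # Ethernet header: dst = broadcast, src = VM MAC, EtherType 0x0806 (ARP)
    eth = struct.pack("!6s6sH", b"\xff" * 6, mac, 0x0806)
    arp = struct.pack("!HHBBH6s4s6s4s",
                      1, 0x0800, 6, 4,   # Ethernet / IPv4, address lengths
                      2,                 # opcode 2 = reply
                      mac, ip,           # sender = the migrated VM
                      mac, ip)           # target = itself (gratuitous)
    return eth + arp

frame = gratuitous_arp(b"\x52\x54\x00\x12\x34\x56", bytes([10, 0, 0, 5]))
assert len(frame) == 14 + 28            # Ethernet header + ARP payload
```

Because switches learn MAC-to-port mappings from source addresses, a single broadcast frame like this is enough to re-steer traffic; across segment boundaries (routers) it does not help, which is why live migration stays within one segment.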

An interesting system has recently appeared: Open vSwitch, which can hand the task of deciding a packet's path to an OpenFlow controller; it may significantly expand the functionality of virtual networks. However, OpenFlow and Open vSwitch are somewhat outside this topic (I will try to cover them a bit later).

Disk (block) devices

This is the second extremely important part of a virtual machine's operation. The hard disk (more precisely, the block device storing data) is the second, perhaps even the first, most important component of virtualization. Disk subsystem performance is critical when evaluating a virtualization system: high CPU and memory overhead is tolerated far more easily than overhead on disk operations.

Modern virtualization systems offer several approaches. The first is to present the virtual machine with a ready-made file system; overhead here is close to zero (this is the OpenVZ approach). The second is block device emulation (without frills such as SMART or SCSI commands). The guest's block device is backed either by a physical device (a disk, a partition, an LVM logical volume) or by a file (via a loopback device or by directly emulating block operations “inside” the file).
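The file-backed case is the simplest to illustrate: block reads and writes become seeks inside an ordinary file. A minimal loopback-style sketch (class and sizes are illustrative):

```python
import os, tempfile

class FileBackedDisk:
    """Toy file-backed block device: block reads/writes are emulated
    'inside' an ordinary file, the way a loopback-style backend works."""
    BLOCK = 512

    def __init__(self, path, blocks):
        self.path = path
        with open(path, "wb") as f:
            f.truncate(blocks * self.BLOCK)   # sparse backing file

    def write_block(self, n, data: bytes):
        assert len(data) == self.BLOCK
        with open(self.path, "r+b") as f:
            f.seek(n * self.BLOCK)
            f.write(data)

    def read_block(self, n) -> bytes:
        with open(self.path, "rb") as f:
            f.seek(n * self.BLOCK)
            return f.read(self.BLOCK)

path = os.path.join(tempfile.mkdtemp(), "disk.img")
disk = FileBackedDisk(path, blocks=1024)       # 512 KiB virtual disk
disk.write_block(7, b"\xab" * 512)
assert disk.read_block(7) == b"\xab" * 512
assert disk.read_block(0) == b"\x00" * 512     # untouched blocks read as zeros
```

A physical backend (partition, LVM volume) works the same way, only the seeks land on a real device instead of a file, which is why its overhead is lower.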

An additional possibility is for the hypervisor to use network storage. In this case migration is very simple: the machine is paused on one host and resumed on another, with no data transferred between hosts.

Moreover, most systems can resize a virtual block device on the fly, provided that the underlying device (LVM, a file) supports it. On the one hand this is very convenient; on the other, guest OSes are not at all prepared for it. Naturally, all systems support adding and removing block devices on the fly.
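For a file-backed device, "growing on the fly" amounts to extending the backing file; the guest then still has to be told to rescan the device and grow its file system, which is exactly the part guests are poorly prepared for. A tiny sketch (paths are illustrative):

```python
import os, tempfile

# Growing a file-backed virtual disk "on the go": the backing file is
# simply extended while in use; the guest must then rescan the device.
path = os.path.join(tempfile.mkdtemp(), "disk.img")
with open(path, "wb") as f:
    f.truncate(10 * 2**20)            # 10 MiB virtual disk
os.truncate(path, 20 * 2**20)         # grow to 20 MiB "on the fly"
assert os.path.getsize(path) == 20 * 2**20
```

Shrinking is the dangerous direction: the backing store has no idea which blocks the guest file system still needs, so production systems generally only grow devices online.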

Deduplication is usually left to the underlying block device provider, although, for example, OpenVZ can use copy-on-write mode via a “container template”, and XCP can build a chain of block devices with copy-on-write dependencies on each other. On the one hand this reduces performance; on the other it saves space. Naturally, many systems can allocate disk space on demand (for example, VMware and XCP): the file backing a block device is created sparse (or uses a format that can “skip” empty regions).
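The copy-on-write chain idea is simple: each device stores only the blocks written to it and reads through to its parent. A toy model of a template with two dependent children (all names illustrative):

```python
class CowDisk:
    """Toy copy-on-write chain: each disk stores only blocks written
    to it and reads through to its parent, like a chain built from a
    container template or a snapshot chain."""
    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}            # only locally modified blocks

    def write(self, n, data):
        self.blocks[n] = data       # never touches the parent

    def read(self, n):
        if n in self.blocks:
            return self.blocks[n]
        return self.parent.read(n) if self.parent else b"\x00"

template = CowDisk()
template.write(0, b"base image")
child_a, child_b = CowDisk(template), CowDisk(template)
child_a.write(0, b"a's change")          # copy-on-write: only A diverges
assert child_a.read(0) == b"a's change"
assert child_b.read(0) == b"base image"  # B still sees the shared template
```

The space saving is clear (the template is stored once), and so is the performance cost: a read of an unmodified block must walk the whole chain of parents.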

Disk access can be throttled by speed, or prioritized for one device (or virtual machine) relative to another. VMware has announced the ability to control the number of I/O operations, imposing a small service delay on all guests while slowing down the hungriest ones.
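A common way to implement per-guest IOPS limits is a token bucket: each guest earns tokens at its allotted rate, and an I/O request is delayed when no token is available, so the hungriest guests are throttled first. A minimal sketch (not any specific vendor's mechanism):

```python
class IopsLimiter:
    """Toy token-bucket limiter for I/O operations: each guest gets
    `rate` operations per second up to a burst; guests that exceed
    their share run out of tokens and get delayed."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = float(burst), 0.0

    def allow(self, now: float) -> bool:
        # refill tokens for the time elapsed since the last request
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                # caller delays this I/O request

lim = IopsLimiter(rate=100, burst=10)
granted = sum(lim.allow(now=0.0) for _ in range(20))
assert granted == 10                # burst exhausted, the rest are delayed
assert lim.allow(now=1.0)           # a second later, tokens have refilled
```

Priorities fall out of the same scheme: a higher-priority guest simply gets a larger rate or burst.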

Dedicated disk devices can be shared among several guests (using file systems designed for this, such as GFS), which makes it easy to build shared-storage clusters.

Since the hypervisor completely controls the guest's access to storage, it can create snapshots of disks (and of the virtual machines themselves) and build a snapshot tree (recording which snapshot derives from which) with the ability to switch between them (snapshots usually also include the state of the virtual machine's memory).
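The snapshot tree can be sketched as a parent-linked structure: taking a snapshot records the current state and its ancestor, and reverting moves the machine to any node, from which new branches can grow (all names here are illustrative):

```python
class Snapshot:
    """Node in a toy snapshot tree: records a state and its parent,
    so the tree shows which snapshot derives from which."""
    def __init__(self, state, parent=None):
        self.state, self.parent = dict(state), parent
        self.children = []
        if parent:
            parent.children.append(self)

class Vm:
    def __init__(self):
        self.state, self.current = {}, None

    def snapshot(self):
        self.current = Snapshot(self.state, self.current)
        return self.current

    def revert(self, snap):
        self.state, self.current = dict(snap.state), snap

vm = Vm()
vm.state["disk"] = "v1"
s1 = vm.snapshot()
vm.state["disk"] = "v2"
s2 = vm.snapshot()
vm.revert(s1)                       # switch back to the older snapshot
assert vm.state["disk"] == "v1"
assert s2.parent is s1              # the tree records the lineage
```

In real systems the per-snapshot state is not a full copy but a copy-on-write delta against the parent, i.e. exactly the block device chain described earlier.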

Backups are implemented similarly. The easiest way is to copy the disk of the system being backed up: it is an ordinary volume, file, or logical volume that is easy to copy, including on the fly. For Windows guests it is usually possible to notify the shadow copy service (VSS) to prepare for the backup.

The interaction between the hypervisor and the guest

Some systems provide a messaging mechanism between the guest system and the hypervisor (more precisely, the managing OS), which allows information to be exchanged regardless of whether the network is working.

There are also experimental (not production-ready) developments on “self-migration” of the guest system.

Cross-compatibility

Work is under way to standardize interaction between hypervisors. For example, XVA has been proposed as a platform-independent format for exporting and importing virtual machines. The VHD format could claim to be universal, were it not for several incompatible formats under the same extension.

Most virtualization systems can “convert” competitors' virtual machines. (However, I have not seen a single live migration system that would let a running machine move between different hypervisors, nor even sketches on the topic.)


Most hypervisors provide some mechanism for estimating host load (showing current values and their history). Some can account precisely for consumed resources as absolute numbers of ticks, IOPS, megabytes, network packets, and so on (as far as I know, only Xen does this, and only as an undocumented feature).

Pooling and management

Most latest-generation systems can combine several virtualization hosts into a single structure (a cloud, a pool, etc.), either providing infrastructure for managing load or offering a ready-made service that manages the load on every server. This means, first, automatically choosing where to start the next machine and, second, automatically migrating guests to load the hosts evenly. Simple fault tolerance (high availability) is also supported when shared network storage is used: if one host with a stack of virtual machines dies, its virtual machines are started on the other hosts in the infrastructure.
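The "where to start the next machine" decision can be as simple as a greedy least-loaded choice among hosts with enough free resources. A minimal placement sketch (host names and the memory-only model are illustrative; real schedulers weigh CPU, I/O, and affinity too):

```python
def place(vm_mb, hosts):
    """Pick a host for the next VM: greedy 'most free memory' choice
    among hosts that can actually fit it."""
    candidates = [h for h in hosts if h["free_mb"] >= vm_mb]
    if not candidates:
        raise RuntimeError("no host can fit this VM")
    best = max(candidates, key=lambda h: h["free_mb"])
    best["free_mb"] -= vm_mb        # reserve the memory on that host
    return best["name"]

hosts = [{"name": "h1", "free_mb": 8192},
         {"name": "h2", "free_mb": 4096}]
assert place(2048, hosts) == "h1"   # most free memory wins
assert place(6144, hosts) == "h1"   # h1 still fits it (6144 left)
assert place(4096, hosts) == "h2"   # h1 is now full, h2 takes over
```

High availability reuses the same logic: when a host dies, its machines are simply re-placed across the surviving hosts, which is why it only works cleanly with shared network storage.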

If I have missed significant features of some system, let me know and I will add them.

Source: https://habr.com/ru/post/101447/
