This article continues a series begun with the publications Containers are the future of clouds and Containerization on Linux in detail - LXC and OpenVZ. Part 1.
While you can skip the publication about the future of clouds, reading Part 1 is required to understand what we are talking about here.

Technical implementation of the container hardware resource limiting subsystem
For completeness of the description, we must also touch on the delimitation not only of system resources and permissions, but of hardware resources as well.
Which resources need to be divided between users:
- CPU
- Hard disk (I/O load on it)
- Memory (amount of RAM)
All such restrictions rely on the cgroups subsystem. I/O load can be capped with the blkio cgroup; importantly, it supports both hard limits in bytes per second or operations per second (IOPS) and relative weights (for example, 10% of the whole server). Memory is limited by the memory cgroup, and everything is quite simple: we specify the amount of RAM, and if the container exceeds it, the offending process meets the OOM killer. For the CPU, only a relative share of processor time can be specified, which is explained by peculiarities of the Linux scheduler implementation.
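To make this concrete, here is a minimal sketch of setting such limits by hand through the cgroup v1 filesystem; the group name ct101 and the device numbers are illustrative, and container tools perform the equivalent steps for you:

```bash
# assumes cgroup v1 controllers mounted under /sys/fs/cgroup
mkdir /sys/fs/cgroup/blkio/ct101
# hard-limit reads on /dev/sda (major:minor 8:0) to 10 MB/s
echo "8:0 10485760" > /sys/fs/cgroup/blkio/ct101/blkio.throttle.read_bps_device

mkdir /sys/fs/cgroup/memory/ct101
# cap RAM at 512 MB; exceeding it wakes the OOM killer inside the group
echo 536870912 > /sys/fs/cgroup/memory/ct101/memory.limit_in_bytes

mkdir /sys/fs/cgroup/cpu/ct101
# CPU is a relative weight (default 1024), not an absolute limit
echo 512 > /sys/fs/cgroup/cpu/ct101/cpu.shares

# move a process (here: the current shell) into all three groups
for c in blkio memory cpu; do echo $$ > /sys/fs/cgroup/$c/ct101/tasks; done
```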
So, to delimit resource usage we used the following cgroups: blkio, memory, and cpu.
Common problems when using containers
When I described all the advantages of containers, I deliberately decided not to touch on their shortcomings, since that is a very voluminous topic requiring a lot of explanation.
So, let's go:
- In Linux upstream containers there are problems with isolation of the /proc file system between containers. OpenVZ has solved this problem.
- In Linux upstream containers it is impossible to limit the amount of disk space available to a container without using completely separate file systems (which in turn is inconvenient and hard to maintain). A promising future solution to this problem is the subvolume functionality of the Btrfs file system, which allows creating a completely isolated section (with a fixed size) within a single file system; a sketch follows this list.
- As a guest OS with containerization, you can only use the same OS that runs on the physical server. This is not a shortcoming of the technology, it is a feature of the implementation: on a physical Linux machine you can only run Linux. If you need to run another OS, KVM does the job very well, and it is perfectly supported both on OpenVZ kernels and in upstream Linux (to be fair, even better in the latter).
- Less security than with full virtualization. It is potentially possible to craft an exploit that, from inside a container, crashes the hardware node itself. With containers in the upstream Linux kernel the number of ways to take down the hardware server is decidedly higher, while in OpenVZ isolation is much better (thanks to granular UBC limits as well as additional checks that are not available in the upstream kernel).
- For stable use (but again, we are not talking about production!) Linux upstream containers are ready starting from roughly kernel 3.8 (ideally 3.10), while many distributions still ship older stable kernels, so not all of the functionality is available. A rather good option for working with Linux upstream containers is the kernel from Oracle: it is exactly version 3.8 and declares containers ready for industrial use.
- There is no support from distribution vendors: for example, even Fedora 20 cannot be installed inside a container by standard means, while in a virtual machine it can. Installation is simpler for Debian, where the debootstrap package easily installs the required distribution (see the one-liner after this list). In the case of OpenVZ, the issue is solved with pre-built OS images.
- The problem with management utilities. Since this is, in my opinion, an extremely important point, I decided to dwell on it in more detail and covered it separately at the end of the article.
- The lack of a built-in solution for limiting the speed of a network connection: this problem affects both Linux upstream containers and OpenVZ. Of course, you can build your own solution on top of tc (a rough sketch follows this list), but such a useful function simply ought to be built in.
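For the disk-limit problem above, the Btrfs approach looks roughly like this; the /containers path and the 10G cap are hypothetical, and keep in mind that Btrfs quota support was still young at the time of writing:

```bash
# create an isolated subvolume and cap its size with a quota group
btrfs subvolume create /containers/ct101
btrfs quota enable /containers
btrfs qgroup limit 10G /containers/ct101
```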
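Preparing a Debian tree for a container with debootstrap really is a one-liner; the target path is illustrative:

```bash
# install a minimal Debian wheezy into a directory usable as a container root
debootstrap wheezy /var/lib/ct/101 http://ftp.debian.org/debian
```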
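And as for network speed, a hand-rolled tc workaround might look roughly like this, assuming the container's traffic passes through a veth device named veth101 (the name is hypothetical):

```bash
# attach an HTB qdisc and cap the container's egress at 10 Mbit/s
tc qdisc add dev veth101 root handle 1: htb default 10
tc class add dev veth101 parent 1: classid 1:10 htb rate 10mbit ceil 10mbit
```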

The advantages of OpenVZ over standard Linux containerization
We have already discussed that OpenVZ and Linux upstream containers are very similar technologies with similar architecture and implementation, but they have quite a few differences. Part of the difference is due to the fact that the current version of OpenVZ is maintained against the 2.6.32 RHEL kernel. Stop! Do you see the same thing I do? A 2.6.32 kernel, really? But please do not be put off by that kernel being old: it is not, because Red Hat does a tremendous amount of work backporting code from newer branches, and this kernel is functionally very close to 3.x while remaining extremely far from the standard "vanilla" 2.6.32.
So let us compare the RHEL6 OpenVZ kernel and the current version of the upstream kernel (3.12). The isolation of processor and disk resources is at the same level, but there are a number of subtleties worth paying attention to; they are listed below.
What exactly OpenVZ has that the upstream Linux kernel does not:
- The vSwap system is used, which allows configuring overcommit for a container and also provides virtual SWAP (virtual because the slowdown is performed artificially, at a speed of about 100 megabytes/second); see the example after this list
- More accurate accounting of RAM consumed by containers thanks to more granular UBC counters: you can account for kernel memory, socket memory, allocated versus actually used memory, the number of shm memory pages, and much more
- The ability to account for page cache consumption by each container
- A storage system with the ability to limit the disk space available to a container, built on top of ploop with LVM-like functionality: snapshots, plus growing and shrinking online. It in turn provides isolation at the level of full virtualization systems.
- Support for live migration from server to server with near-zero downtime (the loss of a single ping) using a very efficient algorithm
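As an illustration of the vSwap point, configuring RAM and virtual swap for an OpenVZ container is a single vzctl call; the container ID 101 and the sizes are arbitrary:

```bash
# give container 101 512 MB of RAM and 1 GB of vSwap, persisting the setting
vzctl set 101 --ram 512M --swap 1G --save
```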
As you can see, almost all the advantages of OpenVZ are focused on actually operating the solution, rather than merely providing mechanisms for implementing this or that capability, as is done in Linux upstream containers.

Problems in user space
Unfortunately, there is no unified way to manage containers. In OpenVZ, vzctl is used for this; by the way, it can also manage containers on a regular upstream kernel since version 3.x, and it is included in the Fedora 20 distribution, where it can be installed and used without external dependencies. There are also LXC and Docker (which in turn is also based on LXC), but they do not fully cover all the functionality that container users may need either. In addition, Linux upstream containers are used in a rather unexpected place: the systemd init system shipped by many distributions, via the systemd-nspawn functionality.
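For comparison, here is roughly what the basic lifecycle looks like in both worlds; the container ID, template name, and directory are illustrative:

```bash
# OpenVZ / vzctl (vzctl 4.x can also drive containers on a 3.x upstream kernel)
vzctl create 101 --ostemplate debian-7.0-x86_64
vzctl start 101
vzctl exec 101 uname -a
vzctl stop 101

# systemd-nspawn: boot a container from a prepared directory tree
systemd-nspawn -bD /var/lib/containers/debian
```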
Therefore, the question of a convenient, competent, flexible and well-designed framework for managing containers in Linux remains open, and you can change the world by writing one (smile). You can also learn a lot about container management utilities from the Linux Plumbers 2013 conference.

Conclusions
As we have seen, Linux containerization is developing very actively, and in the coming years we will certainly get full containerization ready for industrial use in the upstream kernel. For industrial use right now (by which I mean isolating clients from one another, not isolating your own services), containers are ready only with OpenVZ. But if you need containerization for your own purposes, Linux upstream containers are an excellent choice; just take care to run the latest kernel version.
I would also like to thank Andrey Wagin (avagin) for his help in editing particularly complex technical issues.
Regards, Pavel Odintsov
CTO
FastVPS LLC