
FastVPS: How We Changed Virtualization Platforms

Pavel Odintsov, CTO of FastVPS Eesti OU

We have been renting out virtual (VPS) and dedicated servers for almost 7 years and currently host more than 170,000 of our clients' sites. Over that time we have managed to change virtualization platforms a couple of times, trying Xen, OpenVZ, and Parallels Cloud Server, and eventually settled on PCS. Why we changed platforms, which parameters we compared, what we liked about each, and what, frankly, we were unhappy with: all of that is under the cut.

Almost 6 years ago we were among the first on the market to offer virtual server rental, and we chose the Xen hypervisor on Debian Etch 4.0 as our primary VPS platform, since we already had experience with that distribution and a Xen-enabled kernel was available in its official repository.

Years passed, new customers arrived, and questions about the reliability and isolation level of the Xen platform grew in proportion. At that time, failures occurred on our hardware with unpleasant regularity (there were serious problems with network card drivers), which greatly hindered us from providing a quality service to our customers.

After suffering for about a year, we tried moving several clients who complained most about Xen crashes to a server running OpenVZ (why not KVM? Very simple: at the time it was still quite raw and not recommended for production use). Our joy knew no bounds. OpenVZ, based on the well-tested RHEL 5 kernel, showed itself from its best side: the hardware problems disappeared, and virtual servers ran faster on the same equipment with the same density of clients as under Xen. What more could a hosting provider dream of?

But we kept growing, and for new high-end plans we needed a faster and more capable platform, a role OpenVZ no longer suited because it did not perform well enough on hardware servers with very large resources (especially on NUMA architectures). In particular, when 4-8 physical cores are allocated to a single virtual server on a NUMA system, OpenVZ does not distribute the virtual cores optimally across the physical NUMA nodes, which leads to a serious drop in performance. The level of I/O isolation was also not high enough, which caused delays in file system operations inside the VPS. Tired of these problems, we once again began looking for a virtualization/containerization solution.

The search did not take long: we talked with Parallels, and they quickly assured us that all of these problems were solved in their Parallels Cloud Server, which also offered a large number of features useful both to us and to our users.

First, a few words about the kernel: both PCS and OpenVZ use a kernel based on the RHEL 6 2.6.32 series, but there are many differences between them. For us, the key difference between PCS and OpenVZ is the higher density of virtual servers and their better isolation from one another.

What PCS does better than OpenVZ
It used to happen that one of the containers flooded the file system journal, and because of the poor isolation between container file systems this had a disastrous effect on the performance of all the other containers. In PCS (as well as in recent versions of OpenVZ), this problem is solved by the ploop layout: instead of a single shared file system for all containers, each container gets its own separate file system inside its own image. This makes it possible to completely avoid situations where the file system holding everyone's container data depends on the actions of a single user.
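
For illustration, a minimal sketch of creating a ploop-backed container from the hardware node; the container name, OS template, and IP address are made-up examples, and the available templates depend on what is installed on the node:

# OpenVZ (vzctl 4.x+): request the ploop layout explicitly
vzctl create 101 --ostemplate centos-6-x86_64 --layout ploop
vzctl set 101 --ipadd 192.0.2.10 --hostname ct101.example.com --save
vzctl start 101

# Parallels Cloud Server: containers created with prlctl use the ploop-based layout
prlctl create ct101 --vmtype ct --ostemplate centos-6-x86_64
prlctl set ct101 --ipadd 192.0.2.10
prlctl start ct101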

The second improvement came from more efficient use of memory. Previously, the platform cached identical files from different containers in RAM separately; for example, dozens of identical copies of the curl or libc libraries were cached once per container, which wasted a great deal of memory. With PCS we managed to free up about 15% of RAM thanks to the pfcache mechanism, which caches file I/O cleverly: instead of loading a new copy of a library each time, it points to the memory area already loaded by a "neighbor". This, of course, only works when the library versions are identical.
Another problem we managed to overcome was poor I/O isolation between containers. With OpenVZ we often ran into situations where, because of weak I/O isolation, one container flooding the I/O queue with millions of files made all the other containers on the same node slow down mercilessly. In PCS this is solved by the cgroups kernel subsystem (Parallels says its engineers were directly involved in its development), which isolates file I/O between containers. Cgroups plus a separate file system for each container give practically perfect isolation of the virtual environments. As you have probably guessed, the FS journal is now no longer shared by all containers but is separate for each one.
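
Per-container disk I/O can also be capped from the hardware node. A small sketch; the container name and the specific limits are illustrative, and the exact option names may differ between PCS and vzctl versions:

# PCS: limit a noisy container to ~10 MB/s, 300 IOPS and a low I/O priority
prlctl set ct101 --iolimit 10485760
prlctl set ct101 --iopslimit 300
prlctl set ct101 --ioprio 3

# OpenVZ equivalent
vzctl set 101 --iolimit 10M --iopslimit 300 --ioprio 3 --save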

The fourth problem was that users frequently contacted support asking how to start a container that had broken for one reason or another, or what to do when a container ran out of memory or disk space. In PCS we exposed the recovery functions, VNC access, and utilities for monitoring the state of the virtual machine directly to users, so they can now do for themselves what they previously had to open a support ticket for. As a result, the number of tickets on these issues has dropped by about a third.

And a few smaller things:
The Power Panel pleased us, though I would not say it amazed us: nothing super-innovative, but it is a complete set of fairly convenient graphical and console utilities for solving most problems without involving our support. Newcomers dealing with virtualization for the first time hardly ever contacted support while installing and managing their virtual machines.

The backup system works quickly, creates minimal load on the disk subsystem (thanks to support for ploop snapshots), and also supports incremental backups (which, in turn, saves us a lot of disk space).
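
A sketch of what the backup cycle looks like from the hardware node; the container name is illustrative, and the exact flags should be checked against the installed prlctl version:

# First run creates a full backup; subsequent runs can be incremental
prlctl backup ct101
# List existing backups and their IDs
prlctl backup-list ct101
# Restore the container from its most recent backup
prlctl restore ct101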

What PCS lacks, and how we dealt with it
Now about the rakes we stepped on while deploying Parallels Cloud Server and the workarounds we came up with.

1. The centralized control panel (PVA MN) has no API of its own, and all operations have to be performed either through its web interface or through the API of each individual server.
How we cope: we had to manage creation and suspension of VPSs directly through each node's PVA Agent, and also register all the nodes in the PVA Management Node to get installation-wide functions (migration, first of all). Parallels has already promised to implement this in one of the upcoming releases; we will wait and see.

2. There is no way to change the OS without deleting the VPS and creating it anew. This is an extremely problematic limitation that fundamentally conflicts with the very idea of a VPS service. Clients switch operating systems very often, and the lack of such a function in the Power Panel noticeably increases the load on our support.
How we cope: right now, to change the OS, a client has to write to our support, which in turn removes the container on the server, changes the OS, and installs the container again (a rough sketch of the procedure follows this item). All in all, this is far from a convenient option, and it takes about 15-20 minutes.
The folks at Parallels have promised to make it possible to keep the container ID across re-creation, which would solve most of the compatibility issues (but not all of them).
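
A rough sketch of what that manual OS change looks like today; the container name, template, and IP address are illustrative, and client data must of course be backed up first:

# Keep a safety copy of the old container
prlctl backup ct101
# Remove the container together with its ploop image
prlctl stop ct101
prlctl delete ct101
# Re-create it with the new OS template and the same network settings
prlctl create ct101 --vmtype ct --ostemplate debian-7.0-x86_64
prlctl set ct101 --ipadd 192.0.2.10
prlctl start ct101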

3. No support for shaping incoming traffic, and the current cbq-based shaper for outgoing traffic is extremely unstable. This ruins the possibility of packing low-powered containers onto a node as densely as possible: a single small client can saturate the gigabit uplink. For us the shaper is absolutely critical, since it limits our ability to create new tariff plans.
How we cope: as a temporary solution we use our own tc/htb-based shaper script, which in principle works acceptably, but managing it outside the PCS API is rather awkward. So a proper shaper is something we would like to see from Parallels among the first improvements, and as soon as possible.

(Figure in the original post: our script, for those who need it.)
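
Since the script itself was published only as an image, here is a generic sketch of the same tc/htb approach, not the original script: an htb root qdisc on the node's public interface with one class per container, each limited to its plan's rate. The interface name, container IPs, and rates are assumptions for the example:

#!/bin/bash
# Per-container egress shaping with tc/htb on the hardware node (illustrative values).
DEV=eth0

# Start from a clean slate; unclassified traffic falls into class 1:30
tc qdisc del dev $DEV root 2>/dev/null
tc qdisc add dev $DEV root handle 1: htb default 30
tc class add dev $DEV parent 1: classid 1:1 htb rate 1000mbit

# One class per container, capped at the plan's rate, plus a default class
tc class add dev $DEV parent 1:1 classid 1:10 htb rate 100mbit ceil 100mbit
tc class add dev $DEV parent 1:1 classid 1:11 htb rate 50mbit ceil 50mbit
tc class add dev $DEV parent 1:1 classid 1:30 htb rate 1000mbit

# Map container IPs to their classes
tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip src 192.0.2.10/32 flowid 1:10
tc filter add dev $DEV parent 1: protocol ip prio 1 u32 match ip src 192.0.2.11/32 flowid 1:11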

4. No training courses or Russian-language documentation. Here the problem is first of all documentation for clients and for our support staff.
How we cope: our administrators are experienced, of course, so they have no trouble with English or with installation and configuration. But customers have to be helped, which means extra load on support.

5. A very convoluted template building system based on a manually configured container. You often need to configure a container by hand and then create the rest based on it. Right now there are two ways to do this: through an EZ template, in which case you cannot put complex scripts into the container; or, if you need a more complex container, by cloning it manually, which is awkwardly implemented and works only within a single server.
How we cope: for now we make do with the existing, far from intuitive mechanism (a sketch of both approaches follows this item). We would like the ability to maintain a library of typical container images.
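
For reference, a sketch of the two current options mentioned above; the template and container names are illustrative:

# Option 1: create a container from an EZ OS template (no complex provisioning baked in)
prlctl create web-base --vmtype ct --ostemplate centos-6-x86_64

# Option 2: configure one container by hand, then clone it (works only within one node)
prlctl stop web-base
prlctl clone web-base --name web-client-042
prlctl set web-client-042 --ipadd 192.0.2.42
prlctl start web-client-042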

6. The set of application templates for Linux is currently very poor and does not cover even the basic needs of users. Because of this we get a lot of client requests to install additional software (for example, memcached, redis, openvpn).
How we cope: such requests are handled manually by technical support, roughly as in the sketch below.
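
A sketch of how such a request is typically handled from the hardware node; the container name and the package are illustrative, and the same can be done over SSH inside the container:

# Install and enable the requested service inside the client's container
prlctl exec ct101 yum install -y memcached
prlctl exec ct101 chkconfig memcached on
prlctl exec ct101 service memcached start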

7. There are no examples of working with the Agent API in different programming languages. Ideally, we would like a ready-made binding library.

8. There is no way to license an entire cluster of servers at once rather than each server individually.
How we cope with points 7 and 8: nothing to be done here for now; we work with what we have.

9. We would like more attention to speeding up local disk storage (for example, via SSD caching). We plan to solve this problem by introducing a dedicated solution, perhaps even Parallels Cloud Storage.
How we cope: for the moment we use local storage based on RAID 10, which is quite expensive.

We have already passed all of these questions and suggestions on to Parallels.

A little about how the PCS rollout went.
We use solutions from Dell; the basis of our fleet is PowerEdge 720 servers with Intel Xeon processors in a dual-processor configuration and a disk subsystem based on SAS hard drives.

We decided to treat the fault tolerance of the hardware farms separately, so we singled this stage out. We started with the item that had caused us the most problems in the past: memory. Memory errors are a very frequent occurrence when a server carries a very large amount of RAM (64 GB+), so we chose a configuration with error-correcting (ECC) memory. For the disk subsystem we used a solution proven over the years: RAID 10 with read and write caching (with a BBU, of course). The network subsystem was optimized by installing two separate network controllers and connecting them to different switches via link aggregation, to avoid downtime if one of the switches or network cards fails.
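
As an illustration of the network part, a minimal bonding setup for a RHEL/CentOS 6 style node; the interface names, bonding mode, and addresses are assumptions, and 802.3ad (LACP) also requires matching configuration on the switches:

# Aggregate two NICs into bond0 (illustrative names and addresses)
cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.0.2.2
NETMASK=255.255.255.0
BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"
EOF

for IF in eth0 eth1; do
cat > /etc/sysconfig/network-scripts/ifcfg-$IF <<EOF
DEVICE=$IF
ONBOOT=yes
MASTER=bond0
SLAVE=yes
EOF
done

service network restart
cat /proc/net/bonding/bond0   # both slaves should be listed as up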

When the hardware platform was ready, we began deploying PCS on our equipment. The folks at Parallels gave us a large number of trial licenses specifically for the tests and arranged close interaction with their technical experts, which greatly simplified the implementation work. The installation procedure itself differs little from an interactive CentOS 6 installation and went without any difficulties (installation via PXE is also possible, and we use it to deploy new servers).

The PCS system itself is a set of components that can be installed on one server or distributed across several. The main component is Parallels Agent: it is installed directly on the server hardware and provides an API for managing virtual containers, but has no graphical interface of its own. For visual management there is a separate component, shipped as a container, the Parallels Management Node (PVA MN), which only needs to be installed once and can then manage any number of servers running Parallels Agent (PA). In other words, you can place the management node on the same server as the client containers, or you can put it on a separate server. As a large hoster, we found it more convenient and safer to put PA and PVA MN on different servers; smaller companies can run both components on one server.

At the moment PCS has been in production for almost 4 months, and despite the remarks above we are happy with it, because the main things work:
- Container density turned out to be higher than on OpenVZ and the other platforms
- Container availability is higher thanks to better isolation of the FS and I/O
- We save RAM and disk space
- Daily incremental backups: customer data is safe, and less disk space is wasted
- We spend less time on server maintenance, since PCS requires almost no supervision
- The number of support tickets has decreased
- We can keep creating new plans with ever larger amounts of memory, disk space, and CPU capacity
- Customers are satisfied. :-)

If you have questions, write in the comments.

Source: https://habr.com/ru/post/190524/

