Pavel Emelyanov, the architect of our server virtualization department, gave an interview to System Administrator magazine, and we decided to publish it here. In it he talks about the CRIU project, about how the development team works with the Linux community and with Linus Torvalds, and about the changes that may come to virtualization in the next few years.
Please tell us what you do at Parallels.
I work on Parallels' main server product, Cloud Server. It is a Linux distribution built specifically for running virtual machines and containers, tightly integrated with a large number of additional Parallels applications: cluster management, distributed storage, web panels and the like. It is a multi-component product, and some of its parts are very complex systems in their own right. I design many of these systems, mainly the kernel-related ones: their internal structure, operating logic and interaction with each other.
In addition, I organize the interaction between the Parallels kernel team and the Linux kernel community. Parallels is actively involved in developing the container virtualization subsystems in Linux, and almost all of our engineers contribute something to Linus Torvalds' kernel. I follow this activity and try to steer it in the direction the company needs.
Today my favorite activity is developing the CRIU project, which grew out of our work with the Linux community. It is an application that can dump the state of processes running on Linux and restore those processes from the saved data in another place or at another time.
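The dump-and-restore idea can be illustrated with a toy sketch. This is not CRIU code and does not touch real processes: the `Counter` class, `checkpoint` and `restore` are hypothetical names invented for the illustration, and the "process state" is assumed to be a single integer saved as JSON.

```python
import json


class Counter:
    """A stand-in for a running process: its whole state is one integer."""

    def __init__(self, value=0):
        self.value = value

    def step(self):
        self.value += 1
        return self.value


def checkpoint(task):
    """Serialize the task's state into a portable image (here, a JSON string)."""
    return json.dumps({"value": task.value})


def restore(image):
    """Rebuild an equivalent task from a previously saved image."""
    return Counter(json.loads(image)["value"])


task = Counter()
task.step()
task.step()

image = checkpoint(task)     # could be written to disk or sent to another host
resumed = restore(image)     # "another place or another time"
next_value = resumed.step()  # continues from where the original stopped

print(next_value)  # prints 3
```

A real process has far more state than this (memory, open files, sockets, credentials and so on), which is what makes the actual problem hard.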
How did the CRIU project come about? What exactly prompted its creation?
The project emerged as a way to get Parallels container code into the Linux kernel. The integration itself is also an interesting story, which I have already mentioned.
A few years after the OpenVZ project was launched, Parallels realized that it would be very difficult to keep maintaining all the changes that had been made to the kernel.
We decided to start porting our container code to the upstream Linux kernel and to send these changes to the community, asking for them to be accepted. Back then I was just a developer, and the choice of “who will send them” fell on me. I started porting our code to the upstream kernel, sending the changes and talking to people in the community to get the new functionality accepted.
Over a couple of years we managed to integrate about half of all the kernel functionality that Parallels had, but the code that handled “live migration” of containers was not accepted into the Linux kernel. We were not the only ones who tried: other companies wanted similar functionality too, but the community never reached a common decision. So I decided to try implementing the required functionality not in the kernel but as an application, extending the kernel only where needed.
The first prototype was written in about three days at home while I was on sick leave. I sent it to the community “to have a look”. People looked and ... decided to accept the small kernel changes I needed (those changes landed in the Linux kernel a couple of months later).
After that the project started developing more actively, since I was confident that further kernel changes would be accepted as well.
Who uses CRIU and where?
As for adoption, the project is at an “awkward age” right now. On the one hand, various companies are very interested in it. For example, people from Samsung, Huawei and recently even Google have sent me fixes and additions. And if they send patches, they are trying to do something with it. On the other hand, there is no active use yet, and it is clear why.
The project is quite complex, and we have not yet managed to stabilize it, so it cannot be used seriously yet.
How many developers are working on the CRIU project?
That is a difficult question. So far three people, including me, work on the project actively and regularly. About a dozen more “help”: they occasionally send bug reports, sometimes with fixes, and ask whether the project can do this or that, which gives us new ideas. There is another side too: we periodically change something in the kernel and send those changes to the community, which implicitly draws kernel people into the process.
What new features can we expect in the next CRIU releases? What functionality is planned for the longer term?
In the near future, the most important event for CRIU is that all the necessary kernel support will be released with kernel 3.11. Anyone who wants to use CRIU “to the full” will no longer have to compile and install a non-standard kernel. In addition, according to “secret intelligence”, the next Red Hat release, RHEL7, will be based on 3.11, so it is likely that CRIU will work out of the box on one of the most popular Linux distributions.
What happens to the project after that is very interesting to me. The possibilities CRIU can offer go beyond the interests of Parallels. But who will implement them, and when? I plan to build a community around CRIU, similar to the one around the Linux kernel, so that the project is developed not by a few people from Parallels but by a larger and more diverse team.
You said that all the necessary support for CRIU will be in kernel 3.11. What is the kernel still missing today for full CRIU support?
If we take the latest kernel as of today, 3.10, two things are missing.
The first, and the larger of the two, is a subsystem that lets CRIU track which parts of memory a process modifies, so that on a repeated dump it copies only what has changed rather than all of the process's memory. CRIU does not strictly need this component, but it significantly speeds up operations such as live migration. Sometimes this matters a great deal, because without this optimization a migration can take so long that the process's network connections break.
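The benefit of change tracking can be shown with a small simulation. It is only a sketch of the general idea, not the kernel interface or CRIU's implementation: `ToyProcess`, `dump` and the eight-page “address space” are all hypothetical, and the dirty set is maintained by hand, whereas in reality the kernel does the tracking.

```python
PAGE_COUNT = 8  # hypothetical, tiny address space for illustration


class ToyProcess:
    """Stand-in for a process: a list of pages plus a set of changed pages."""

    def __init__(self):
        self.pages = [b"\x00"] * PAGE_COUNT
        # Before the first dump every page counts as changed.
        self.dirty = set(range(PAGE_COUNT))

    def write(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)


def dump(proc):
    """Copy only the pages changed since the previous dump, then reset tracking."""
    image = {n: proc.pages[n] for n in sorted(proc.dirty)}
    proc.dirty.clear()
    return image


proc = ToyProcess()
proc.write(2, b"a")

target = {}                 # memory being rebuilt on the destination side
first = dump(proc)          # first pass: all 8 pages are copied
target.update(first)

proc.write(5, b"b")         # the process keeps running and touches one page
second = dump(proc)         # second pass: only page 5 is copied
target.update(second)

print(len(first), sorted(second))  # prints "8 [5]"
```

The second pass moves one page instead of eight. With a real process the saving is what keeps the final pass short enough for network connections to survive.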
The second is smaller, but in a sense much more important than the first. It is a small extension to the process debugging subsystem (ptrace) that lets you control how processes handle signals. Without it, it is impossible to dump the state of processes that are in the stopped state.
It sounds scary, but fortunately such situations are relatively rare, so CRIU has managed without it until now.
What projects related to the Linux kernel, besides CRIU, are you working on? What non-Linux projects are you taking part in?
Seriously speaking, almost none of the former and none of the latter. What I have to do in OpenVZ, CRIU and our integration with the Linux kernel is more than enough.
You mentioned the OpenVZ project. What is this project? How involved are you in it?
From a software point of view, it is the Parallels kernel plus several utilities for configuring it, which together let you create and manage Linux containers.
Beyond the software, it is also (by now) the brand under which Parallels presents itself in the open source world. A medium-sized community has even grown up around the project.
My involvement is, I suppose, fairly large. Shortly after the project was born I was “promoted” from an ordinary developer to the leader of the kernel team. And since the kernel was the most valuable part of OpenVZ at the time, I de facto became something like the project's technical leader as well. A year later the campaign I mentioned, integrating OpenVZ code into the Linux kernel, began; it was also run under the OpenVZ flag and initially fell to me personally.
Now the kernel is no longer the project's main value (we have already integrated a lot into the Linux kernel, and on Fedora 19, for example, you can create containers without our kernel). So together with the project manager, Kir Kolyshkin, I work on figuring out how to move the project forward without relying on the kernel alone.
One such direction is, for example, our CRIU, which is positioned as an OpenVZ subproject.
What are the strengths and weaknesses of the OpenVZ project?
The strength is that, compared with its competitors, the project recognized in time the need to integrate with other projects, and started with the most fundamental one: the kernel.
The weakness is that it is very tightly coupled to Parallels' closed commercial product, and because of the inevitable conflicts of interest, decisions are not always made in OpenVZ's favor. But Parallels is well aware of this and is trying to fix the situation.
What are the difficulties in integrating OpenVZ code into the Linux kernel? What version of the kernel did the integration start with?
I have already forgotten when we sent the first patches. I do not even remember which subsystem we started with. I only remember that in kernel 2.6.18, on which we based a stable branch, part of our code was already integrated.
As far as I recall, these were the PID and SysV IPC namespaces. By 2.6.32, on which our next stable version was based, network virtualization (net namespaces) and a few other things had been integrated.
The difficulty was that I had to learn a completely different way of developing “on the fly”. We had only a rough idea of what the process of getting patches into the kernel looks like. So when, in response to my first patch set, people started actively discussing how to do it all differently from scratch, or suggested that before doing what I had done I should first rewrite a good chunk of the process management subsystem, I was quite taken aback.
How does the OpenVZ project relate to LXC?
That is a great question. The OpenVZ manager has long, and hopelessly, wished that the answer could be printed in huge letters and hung in some incredibly popular place where everyone would read it!
First of all, what is LXC? LXC is, first, a set of kernel components that isolate processes from each other and, second, a utility that makes these components work together in the right way to create a container. OpenVZ is almost the same: kernel components, implemented differently, and utilities that configure those components to create a container too, only a more secure and more functional one.
So the answer is this: the Parallels kernel team created most of the kernel functionality behind LXC. Every piece of functionality that appears in the mainline kernel (that is, de facto in LXC) is then used by the OpenVZ project. As soon as we build and release our kernel on a newer version of the Linux kernel, we throw out our own implementation of a subsystem and replace it with the equivalent one that we (usually we) wrote for Linux. In other words, as time goes on, more and more of our kernel code changes its brand from OpenVZ to LXC.
What functionality is planned for OpenVZ in the near future? And in the longer term? Are any other OpenVZ subprojects planned, related or unrelated to the Linux kernel?
We are working on several things right now. One is an almost finished virtual disk driver for containers, a so-called loop device for those familiar with Linux. One already exists, but we are not satisfied with how it works or with the features it provides. Another is optimizing the memory management subsystem for process groups (memcg).
The way it works now does not suit us in terms of performance or of how resources are divided between containers. One more optimization is for the FUSE file system data cache. Its current implementation does not allow maximum performance on many types of workload. All of this is being done with integration into the Linux tree in mind from the start.
There are also big plans for integrating OpenVZ with other projects. For example, almost all of it is available as part of Fedora 19. People from OpenStack look favorably on the idea of supporting OpenVZ in their services. Integration with CRIU belongs to the same line of work.
There are a few more interesting directions which, by the way, could well qualify as subprojects. Unfortunately, I cannot talk about them yet.
What new developments can we expect in virtualization in the coming years?
Before answering, I want to make a small clarification. The term “virtualization” usually refers to the technology for creating virtual machines, that is, hardware virtualization. Containers are, in a general sense, also virtualization, just at a different level, but as a rule people do not call them virtualization; they simply say “containers”.
So. It seems to me that in a few years the performance gap between virtual machines and real ones (not in the virtual machines' favor, of course) will be small enough that this drawback no longer outweighs the advantages virtualization provides. As a result, a virtual machine will become not an option but an integral component of any system, integrated so naturally that users will not even wonder whether they are working with a real or a virtual system.
The same will happen with containers, but on a different “front”. Today containers can be used as a shell for a single process or application, or for a whole operating system image (without the kernel). In the future, using containers to create a virtual operating system should fade away (virtual machines will fill that niche), while using containers to “isolate” services, both local and cloud ones, should become an invisible, ever-present standard.
How did you become a kernel developer? What were the difficulties at the beginning?
It all started, I think, in my fourth year at the institute. I was working on an academic distributed file system, and my supervisor said that Parallels (the company was still called SWsoft then) was planning to turn this system into a commercial project, and sent me to talk to his graduate student about joining them. That graduate student worked in the Linux kernel team. They interviewed me and took me onto the team with the words: “We'll deal with that file system later; start with the kernel for now.” In the end Parallels never got around to the file system, so I remained a kernel developer.
The real difficulties arose when I had to write code not for the kernel developed inside Parallels but for the kernel that Linus Torvalds maintains, because the development models inside Parallels and in the community were completely different.
Inside Parallels it is more or less standard commercial software development. Working in the community is a completely different process. Its main feature, from my point of view, is that it ... is not quite a software development process. It is mostly communication between people who are interested in creating a large and complex program, and the actual development comes second. Realizing this and adjusting to it were the main difficulties.
What kernel subsystems are you working on?
Practically all of them except drivers. In a sense this is one of the problems: it is impossible to dive deeply and thoroughly into any one subsystem, because I have to keep track of the development of all of them. There are, however, “favorites”: for me these are the networking code and the memory management subsystem.
How do your patches get into the kernel? What is special about this process? What are the difficulties?
The process is the same for everyone. First you make a patch or a series of patches and write a description for each one (which, by the way, is a skill in its own right). Then the patch is sent to the mailing list where the relevant kernel subsystem is discussed. The person who maintains the subsystem (the maintainer) is put in copy, along with people who know the code and can help assess the quality of the work. Then you wait a little. If everything is done well and everyone is happy, the maintainer takes the patch into his repository. Later Linus pulls what has accumulated at all the maintainers into his tree.
That is the simplest path, but it can get more complicated and drag on. If the change is large and complex, or its author is inexperienced, a discussion may start after the patch is sent to the list. People who know the relevant code may find bugs in the patch, may say that the problem should be solved differently, or may ask for cosmetic changes, such as renaming a variable or writing a more detailed comment. After that the process has to start over. The most important thing for the patch author here is not to take the criticism personally but to grasp its substance and, if he agrees with it, rework the patch.
For example, when I started sending pieces of the container code to the mailing list, I had to rewrite the code from scratch several times. As a result, what Linus now has “for containers” does not share a single line with what we have, although it solves the same problems.
Are there coding style standards for code submitted to the Linux kernel? If so, could you give specific examples? Are there deviations from the standards?
There definitely are standards. There is even a dedicated document: Kernel Coding Style. It contains all the formatting requirements. The requirements, I must say, are very reasonable. They are easy to get used to, and afterwards it is very hard to write (and read) code any other way.
Linux?Sometimes. . , , , -.
Linux? ?Linux . , Unix- Linux. - Windows. , , , Microsoft Word, — LaTeX. .
, , AspLinux, « », . Centos, Suse, Fedora.
? (GNOME, KDE, XFCE ..), (, -/ ), -, , (, git)?«» ( ) Fedora Linux. Parallels, . KDE Gnome , FluxBox. Thunderbird, Chrome, Skype. . OpenOffice .
VIM, — make, git, svn, gcc, objdump, gdb.
? , ?google+ hangouts, . « », « ».
What does your workplace look like? Do you have any particular hardware preferences?
A desk, a chair, a laptop. I have only one hardware preference: the IBM ThinkPad, which has a very comfortable keyboard and TrackPoint.
Do you use mobile devices for work? If so, which do you prefer?
() — , HTC Sensation, . . . « ». , Linux , , , « ».
? ?, - Linux-. , — « ». , e-mail «».
? « » ?, . , , Linux — « » — . .
kernelnewbies.org — , , , .
1982 . 2004- SWsoft, 2007 Parallels. 2005- - . 2008 « », - « , ».
, Linux Parallels, OpenVZ CRIU. Parallels Linux.