Under the hood of the quest for programmers

We are currently working on a new project: an online quest in which programmers, divided into two camps (IT people and the FSB), solve a variety of programming tasks in their favorite language while following a story told through a colorful comic with an exciting script. Let us say up front that there are no restrictions on programming languages. The only practical limitation is that a language may simply not be installed on our servers, for example because it is too rare. Today we would like to talk about the problems we ran into while developing the server side.

The main problem of the project was keeping the servers secure during the game, because we will be running what is essentially foreign code on our own machines, with no restrictions on its access to resources. That is, when launching a program we need to set up the environment so that the program (which may in fact be a virus installer) cannot break the server and is deleted once it has finished. When solving a problem, it helps to first list the points it consists of. Our list looks like this:

We have a server running software from an external source, and we do not know what that software will do while it runs;
In the long run there will be a large number of users uploading a large number of applications to the server. Some of them may well turn out to be viruses, since we allow uploads of everything from PHP scripts to EXE files. This problem is similar to the first but a bit broader: a program may stay resident on the server;
Once started, a program may eat up a huge amount of resources.

Then, we started looking for solutions.

As you understand, this is not an option under such conditions, since the server will definitely go down; it is only a matter of time until someone writes a program with a terrible bug or uploads a virus. There is another problem here, by the way: a player could tamper with rivals' results or pass them off as their own.

If Linux were our platform, this would be a solution: a good, sustainable one that would suit everyone in terms of cost and development time. However, historically our team consists of .NET developers, and that system is foreign to us. So, rather than undermine the server's security through ignorance of the details, we decided to stay on the Windows platform.
You can create a sandbox in Windows, and we researched this question thoroughly. For example, you can limit a process by the amount of RAM allocated to it, or by the amount of disk space it may occupy on the file system. Windows kernel objects called jobs (job objects) restrict the use of OS resources, and NTFS access rules restrict work with the file system. You can limit a process in almost everything. However, Bill Gates forgive me, there are too many viruses for Windows, which suggests that all these barriers can be bypassed, for example by a virus. So, in the middle of developing this variant, we decided to abandon it after all, even though in essence it would have been a neat solution: collecting a program's results is no problem, and neither is cleaning up after it. It was practically what we needed, but our instinct for self-preservation suggested that there were other options that could be implemented fairly quickly.

In fact, this option lies on the surface; it is the first thing that comes to mind. There is a virtual machine onto which the required application and files are copied. The application is then launched and its output is collected in a file. When it finishes, the file is taken from the virtual machine, which is then reset to its original state. This solution has a number of advantages:

The machine's state is reset, so whatever happens inside, be it an apocalypse or the operating system being wiped from the hard drives, does not matter at all. When the work is done (or on timeout), we simply roll back to the snapshot.
The environment is not left broken by previous runs on the machine.
You can control every little detail.
However, there is one drawback: management is complex.
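
The run, collect, and roll back cycle described above can be sketched in a few lines. This is only an illustration, not the project's actual code: the `GuestVM` interface, the `FakeVM` stand-in, and the `check_submission` helper are hypothetical names, and the fake machine merely pretends to execute the upload so that the sketch runs without a hypervisor. The point it shows is that the rollback belongs in a `finally` block, so it happens whether the program finishes, fails, or times out.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class GuestVM(Protocol):
    """Hypothetical minimal interface to one guest machine."""

    def upload(self, name: str, data: bytes) -> None: ...
    def run(self, name: str, timeout_s: float) -> None: ...
    def download(self, name: str) -> bytes: ...
    def restore_snapshot(self) -> None: ...


@dataclass
class FakeVM:
    """In-memory stand-in for a real virtual machine (assumption)."""

    files: Dict[str, bytes] = field(default_factory=dict)
    restores: int = 0

    def upload(self, name: str, data: bytes) -> None:
        self.files[name] = data

    def run(self, name: str, timeout_s: float) -> None:
        if name not in self.files:
            raise FileNotFoundError(name)
        # Pretend the program ran and wrote its output to output.log.
        self.files["output.log"] = b"ran " + name.encode()

    def download(self, name: str) -> bytes:
        return self.files[name]

    def restore_snapshot(self) -> None:
        # Rolling back discards everything the program left behind.
        self.files.clear()
        self.restores += 1


def check_submission(vm: GuestVM, name: str, data: bytes,
                     timeout_s: float = 30.0) -> bytes:
    """Upload, run, fetch the log, and always roll the machine back."""
    try:
        vm.upload(name, data)
        vm.run(name, timeout_s)
        return vm.download("output.log")
    finally:
        vm.restore_snapshot()
```

In a real deployment the four methods would be backed by the hypervisor's own API; only the ordering and the unconditional rollback are the point here.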

At first we thought we would have to run an agent inside the machine to handle deploying the program, launching it, and collecting its output for transfer to the host machine. However, it later turned out that there is a wonderful product, VirtualBox, which has a rich API and exactly the agent I needed so much. Why was I so pleased to find a ready-made agent? The more components we write from scratch, the more bugs we will get. Not because we are inexperienced, no. Bugs are simply always there, and all the more so in freshly written software. Code that has been debugged over hundreds of thousands of runs is much better than homegrown code.
For the same reason we gave up one risky step: to save resources, we could have used ReactOS as the guest operating system. We really could have, if not for one BUT: some interpreters simply would not start there. At all. Perhaps this OS has only one bug, buried deep in the kernel, but it affects half of the applications and makes everything feel unstable. If not for that, we would have chosen it, if only for the following reasons:

It takes up almost nothing on the hard disk (~40 MB)
It uses almost no RAM (~40 MB)
It uses almost no CPU resources

This makes it ideal when you need to run up to 20 virtual machines on one physical machine. This smart operating system will be perfect once it has been carefully polished to a bug-free state: a lightweight Windows whose boot time is measured in seconds (~5).
So, the final decision:

Hosting is in the cloud. It has an API, so nodes can be added dynamically and programmatically if thousands of people come to the server and start playing. They can also be shut down dynamically.
A node is brought up from a template. The template contains Windows, Git, VirtualBox, and a script. The script downloads the latest version of the executables from the repository and starts the service.
The service starts, brings up VirtualBox, finds the sample virtual machine and, depending on the configuration, clones it into several instances. The machine runs 64-bit Windows 7 and comes preconfigured with PHP, Perl, Python, Ruby, and Lua interpreters, as well as Java and .NET Framework 3.5.
The service starts listening to a queue of commands.
When a command arrives, it is parsed, and the player's script or EXE file is uploaded to a free virtual machine and launched. A solution is given 30 seconds to run. When the time expires or the process finishes, the output.log file, to which the program's output stream is written, is copied to the host machine, and the virtual machine is reset to its original state by rolling back to the last snapshot.
The results go to the browser.
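
The dispatch step in the list above (take a command from the queue, hand it to a free virtual machine, return the machine to the pool afterwards) might look roughly like the sketch below. It is an assumption-laden illustration rather than the actual service: `make_vm_pool`, `handle_command`, `serve`, and the `run_on_vm` callback are hypothetical names, and the VM itself is represented only by its name. The 30-second limit is the one stated above.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

TIME_LIMIT_S = 30  # per-solution time limit stated in the article


def make_vm_pool(vm_names: Iterable[str]) -> "queue.Queue[str]":
    """Put every cloned machine into a thread-safe pool of free VMs."""
    pool: "queue.Queue[str]" = queue.Queue()
    for name in vm_names:
        pool.put(name)
    return pool


def handle_command(command: str,
                   free_vms: "queue.Queue[str]",
                   run_on_vm: Callable[[str, str, int], str]) -> str:
    """Run one command on whichever machine becomes free first."""
    vm = free_vms.get()  # blocks until some machine is free
    try:
        return run_on_vm(vm, command, TIME_LIMIT_S)
    finally:
        # In the real service the machine would first be rolled back
        # to its snapshot; here it is simply returned to the pool.
        free_vms.put(vm)


def serve(commands: List[str],
          vm_names: List[str],
          run_on_vm: Callable[[str, str, int], str]) -> List[str]:
    """Process all commands, never using more workers than machines."""
    free_vms = make_vm_pool(vm_names)
    with ThreadPoolExecutor(max_workers=len(vm_names)) as executor:
        futures = [
            executor.submit(handle_command, c, free_vms, run_on_vm)
            for c in commands
        ]
        # Results come back in the order the commands were submitted.
        return [f.result() for f in futures]
```

Keeping the free machines in a blocking queue means a command simply waits when every machine is busy, so no separate bookkeeping of machine states is needed.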

At the moment, we can report the following results of our work:

The only thing we don't like is the size of the virtual machine image. On a workstation, cloning a 4 GB image into five nodes takes about 20-30 minutes. Having to do that for every bug fix would be a nightmare. That is why having a ready-made agent on the guest side and stable versions of the interpreters is a big plus in its own right: to some extent it guarantees uninterrupted operation of the guest machine. How it will behave in the cloud, we will find out soon, once we upload it;
The full round trip, from the browser to the virtual machine, launch, waiting for completion, and back in the form of logs (including rolling the guest machine back to its initial state), takes 4 seconds at a maximum CPU load of 20%. This is a great result. That is, a server running 5 virtual machines will be able to process up to 75 task checks per minute. So we can assume that one server will be enough. A second will run just in case, and a third in case one of the two goes down (you never know!).
We did not validate email addresses for subscriptions, for simple reasons. First, if someone enters a bogus address, the mail will still end up in real people's mailboxes. Second, why send spam saying that you have signed up when we can simply show that on the screen? And third, why make people go into their mailbox to delete that spam? :)
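
The throughput figure in the results above follows from simple arithmetic: each virtual machine completes one check every 4 seconds, and 5 machines work in parallel. A small sketch of that calculation, where `checks_per_minute` is a hypothetical helper name and the inputs are the numbers quoted above:

```python
def checks_per_minute(vm_count: int, seconds_per_check: float) -> float:
    """Upper bound on checks per minute with all machines kept busy."""
    return vm_count * 60.0 / seconds_per_check


# 5 virtual machines, 4 seconds per full round trip
print(checks_per_minute(5, 4))  # 75.0
```

This is an upper bound: it assumes every machine is busy all the time and that every check takes the measured 4 seconds rather than the full 30-second limit.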

Next time we will go into more detail on the technical side and look even deeper. In the meantime, welcome to the game site (and do not forget to tell us in the comments which language you would like to solve the tasks in):

Source: https://habr.com/ru/post/151529/
