Programming Olympiad, view from NSU. Article 2 - Testing System

I continue my series of articles about competitive programming at NSU. Last time I talked about how problems are written for tournaments; this time I want to tell you a little about the testing system.

The first article is about writing the problems.
The third article is about the work of the organizing committee.
The fourth article is about the round itself.

The testing system is the holy of holies of any competition, the nerve center of the tournament. The success of the round largely depends on it: when it runs stably the organizers have peace of mind, and when it does not they get a bigger headache. Writing a testing system is a task worthy of a thesis (in my memory, 2 theses on testing systems have already been defended). And writing a really good one is worth a whole candidate's dissertation.

So, what does a testing system usually consist of? At the olympiads held in my hometown, it was a computer with Turbo Pascal and some grandmotherly graders. At a higher level, everything is much more fun. A whole motley software suite goes up against the participants of the Olympiad. Some parts are very cross-platform, and some are very platform-specific. But all of it together lets the round run smoothly.

The testing system, logically enough, has a server side and a client side. The server side of NSUts is written in Perl. Choosing the language did not take long. At first the guys tried to optimize ujudge (aka the Great and Terrible Goplan), but after the system went belly up yet again during the 2008 All-Siberian qualifying round, the word Ruby was declared a swear word among NSU olympiad programmers. It was decided to revive the system of Zhenya Chetvertakova, which had coped with its tasks for a long time but had at some point been replaced by the prettier and more modern ujudge. So Perl was the natural choice.

The web interface runs the Olympiad itself: it accepts the participants' solutions, stores them, sends them to the testing clients, receives the test results, and processes them (builds the standings, etc.). Communication with the jury during the round also goes through the web interface, but I will tell you more about this in the article about the work of the organizing team during the round.

The client side consists of a software package:
1. WinKill - runs a program under resource constraints
2. Diff - character-by-character comparison of two files
3. Noasm - searches for inline assembly
4. Estimator - a program that awards points for tests
5. A set of bat files for running compilers and programs
Each component could be discussed at length, but what matters is that together they check solutions and deliver a verdict on whether the problem was solved correctly. Several testing clients can run in parallel, not only on different machines but also on a single machine with a multi-core processor. There can also be many web interfaces. One thing ties all this variety together: the MySQL database, which is often started alongside the web interface.

Now that we have an overview of how the testing system is arranged, let us follow the path a team's solution takes before the team gets its well-deserved plus (or minus) in the standings.

Receiving the participants' solutions

The participants' solutions are stored on the Olympiad's web server. The server also enforces the maximum size of a solution's source code, so that oversized solutions are rejected at submission time. The list of submitted solutions and their run parameters is stored in a database.
The life cycle of the isolated environment begins with a request from the testing client to the database of the Olympiad server. If there are solutions waiting to be checked, the client receives in response the id under which a submitted solution is stored in the database. If there are none, the client repeats the request after some time. The pause between requests keeps the database from being loaded unnecessarily.
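
The request-and-pause cycle described above could look roughly like the sketch below. It is only an illustration: the `submissions` table, its `status` column, and the function names are made up for the example, and the standard `sqlite3` module stands in for the MySQL database that NSUts actually uses.

```python
import sqlite3
import time

def fetch_next_submission(conn):
    """Return the id of the oldest queued submission, or None if there is none."""
    row = conn.execute(
        "SELECT id FROM submissions WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # Mark the submission as taken so that it is not handed out twice.
    # With several clients this select-and-mark step would have to be atomic
    # (for example inside a transaction); the sketch ignores that.
    conn.execute("UPDATE submissions SET status = 'testing' WHERE id = ?", (row[0],))
    conn.commit()
    return row[0]

def poll(conn, pause_seconds=1.0, max_attempts=None, sleep=time.sleep):
    """Ask the database for work, pausing after every empty answer."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        submission_id = fetch_next_submission(conn)
        if submission_id is not None:
            return submission_id
        attempts += 1
        sleep(pause_seconds)  # the pause keeps the database from being hammered
    return None
```

The `sleep` parameter is there only so that the pause can be replaced in tests; a real client would simply loop forever with `max_attempts=None`.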

So, oversized sources have already been eliminated. Why is this done? Because you could run a veeeery expensive computation for 5 minutes on your own machine and pack all the results into one huge table in the source code. The solution itself would then look like read(a), write(res[a]). This trick has already been cut off.
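
A size check of this kind takes only a few lines. In the sketch below the 64 KB limit and the function name are assumptions made for illustration; the article does not say what limit NSUts really uses.

```python
# Hypothetical limit: the real NSUts value is not given in the article.
MAX_SOURCE_BYTES = 64 * 1024

def accept_source(source: bytes, limit: int = MAX_SOURCE_BYTES) -> bool:
    """Accept a submission only if its source code fits within the size limit."""
    return len(source) <= limit

# A source with a huge precomputed answer table easily exceeds the limit.
table_source = b"res = [" + b"123456789, " * 100_000 + b"]"
small_source = b"a = int(input()); print(a * 2)"
```

Here `accept_source(small_source)` passes, while `accept_source(table_source)` is rejected before the solution ever reaches a testing client.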

Compilation

Before checking a solution, the tester has to build its source code with the compiler specified in the parameters passed from the server. To meet security requirements, the testing client ships trimmed versions of the compiler libraries: potentially dangerous libraries for low-level work with the operating system are excluded. The trimmed libraries only make it harder to reach the network and WinAPI functions, however; a submitted program could still get at them through inline assembly. The noasm program guards against this. If inline assembly is found, the source code is not compiled, so potentially malicious code never runs, and the Olympiad server is notified of a compilation error.
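
How noasm works internally is not described here, so the following is only a guess at the simplest possible approach: scan the source for the keywords that introduce inline assembly in common C/C++ and Pascal compilers. The pattern and the function name are assumptions, and a real tool would also have to deal with comments, string literals, and macros that hide the keyword.

```python
import re

# Keywords that start inline assembly in common C/C++ and Pascal compilers.
# Matching is case-insensitive because Pascal keywords are.
ASM_PATTERN = re.compile(r"\b(__asm__|__asm|_asm|asm)\b", re.IGNORECASE)

def find_inline_asm(source):
    """Return the 1-based numbers of lines that contain an assembly keyword."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if ASM_PATTERN.search(line)
    ]

def may_compile(source):
    """Refuse to compile any source in which an assembly keyword was found."""
    return not find_inline_asm(source)
```

The check is deliberately conservative: a comment that merely mentions `asm` would also be rejected, which is a cheap price for never running suspicious code.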

Fine. Now we have a ready-made binary that is just waiting to be fed a pack of delicious, fresh, juicy tests. The run has to be supervised, otherwise a simple bug in the program (or the evil scheming of participants who decided to win in an unsportsmanlike way) could knock out the testing system. In my memory, such audacious attempts were once punished with disqualification. Well, yes, this is not a black-hat competition =) Besides, most of the dirty tricks, like a killer #define, have already been cut off by this stage, so there is very little room for mischief.

Running programs

Windows provides the WinAPI library for programmatic control of resource limits and access rights, so it was decided to use WinAPI to control access rights, computer resources, and the running process. To run the compiled program safely and under restrictions, a special Windows user account with limited access rights is created. It has a single working directory to which it has exclusive access.

The test bank (input and output data) is stored on the Olympiad server, and each testing client keeps its own local copy of the tests. A program must be tested on the complete test set, so the tester runs the compiled program on each test. Input data is passed to the program as an input.txt file placed in the working directory. The compiled program is launched by WinKill, which uses WinAPI to restrict the use of system functions and runs the program under the account with limited rights.
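
A single test run could be sketched as follows. This is not WinKill: Python's `subprocess` with a timeout stands in for it and enforces only wall-clock time, with no memory limit and no restricted account. The function name and the default time limit are assumptions made for the example.

```python
import shutil
import subprocess
import sys
from pathlib import Path

def run_on_test(command, input_file, workdir, time_limit=2.0):
    """Run `command` inside `workdir` with the test copied to input.txt.

    Returns a pair (exit_code, timed_out); exit_code is None on a timeout.
    """
    workdir = Path(workdir)
    # The program always reads its data from input.txt in the working directory.
    shutil.copyfile(input_file, workdir / "input.txt")
    try:
        completed = subprocess.run(command, cwd=workdir, timeout=time_limit)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return None, True
    return completed.returncode, False
```

For a solution written in Python, for instance, the call might be `run_on_test([sys.executable, "solution.py"], "tests/01.in", "work")`, where all three paths are placeholders.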

Runtime control

WinKill starts the application and uses WinAPI to enforce the limits of the runtime environment (CPU time, memory, and total running time).
It intercepts all exceptions and runtime errors, receives the return code when the application terminates, and notifies the testing module of every event that occurred. If the application tries to exceed the limits, it is terminated immediately.
After the program finishes, the participant's output has to be compared with the jury's answer. If WinKill itself finished with a zero return code, the communication module starts a specialized checker program. If the problem does not provide one, the output is checked with the standard diff program.
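
The decision logic of this step can be summarized in a small function. The verdict strings and the function signature below are invented for the illustration; the exact comparison plays the role of diff, and `checker` is any callable that a problem might supply instead.

```python
def judge(exit_code, timed_out, output, expected, checker=None):
    """Turn the result of one run into a verdict (verdict names are made up)."""
    if timed_out:
        return "time limit exceeded"
    if exit_code != 0:
        return "runtime error"
    if checker is None:
        # No special checker for the problem: compare character by character.
        ok = output == expected
    else:
        ok = checker(output, expected)
    return "accepted" if ok else "wrong answer"

def float_checker(output, expected, eps=1e-4):
    """Example of a problem-specific checker: accept answers within eps."""
    return abs(float(output) - float(expected)) < eps
```

A checker is needed whenever several outputs are correct, for example real numbers with a tolerance or any valid path in a graph; with plain character-by-character comparison even a missing trailing newline means a wrong answer.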

Returning to the original state

The testing client must return all parameters to their original state so that every application is launched under equal conditions, with the same environment. The application's working directory is cleared: the output data, the source code, the compiled code, and any other files are deleted. The testing system takes no extra steps to clean up RAM, because the operating system automatically frees the application's memory and other resources when the process ends. After that, the tester is ready to start the next test from the set.
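
Clearing the working directory might look like this minimal sketch; the function name is an assumption, and the real client does this with its own tools.

```python
import shutil
from pathlib import Path

def reset_workdir(workdir):
    """Delete everything inside workdir but keep the directory itself."""
    for entry in Path(workdir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # a subdirectory the program may have created
        else:
            entry.unlink()        # a regular file or a symbolic link
```

The directory itself is kept rather than recreated, so that any access rights configured on it for the restricted account stay in place.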

Well, that's all about the testing system. You can see a working version here: olimpic.nsu.ru/tester/nsuts_olymp.cgi . Many thanks to my friend Sasha Kirov, one of the authors of NSUts, for helping me write this article. About 70 percent of the text this time is quoted from his thesis.

In the next article, Natasha Popova and I will try to describe how the organizational side of the Olympiad works, because the All-Siberian Olympiad is not only ~~valuable fur~~ many cool coders, but also a fairly serious international project in which every little detail has to be thought through and organized.

Source: https://habr.com/ru/post/63005/
