Testing programs in difficult "weather conditions"
Hello, dear Habr! I take part in developing high-reliability automated control systems, which are used in power plants, space centers, complex industrial facilities, and so on. I once faced the task of devising a method for testing how programs behave when all kinds of hardware are overloaded, namely:
CPU load
Network load (sending / receiving)
Shortage of RAM
Hard disk load with read / write requests
I also had to devise a way to estimate how much a given program can delay or hinder the work of other programs. In my opinion, the most interesting of these four items is the first one, so this post is about it. Under the cut I describe the two utilities I ended up with and how they work, along with a couple of screenshots and videos.
So we have a set of programs under test. We need to develop a program that checks their stability when the processor is heavily loaded.
It would seem that nothing could be simpler:
while true do x := x + 1 - 1;
and there you go, the machine is loaded. But need I explain that one 100% CPU load (as shown in Task Manager) is not the same as another? First, the computer may have several cores, and when one core is loaded, the others keep working undisturbed. Second, even if all the processor cores are 100% busy, and the cooling fan, spewing hot air, is about to send the computer into free flight around the office, there is no guarantee that this will affect the programs under test at all, let alone cause any visible lags or delays. And what I want is for MouseClick to arrive three seconds after MouseDown, for a timer with a one-second period to fire after 5 seconds, for the program's threads to find it genuinely "difficult" to synchronize, and for controlled chaos to set in.
It becomes clear that we need to create a separate thread for each core / processor (class TThread). We find out the number of cores:
var info: TSystemInfo;
...
GetSystemInfo(info);
n := info.dwNumberOfProcessors;
Next, we create n threads, each spinning on some meaningless calculation. The operating system distributes them across the cores by itself. The priority of these threads must switch randomly between at least two values. Otherwise, if their priority is low, the programs under test may not "feel" anything, and if it is too high, the programs under test may receive almost no CPU time. Since the threads we create may be running at time-critical priority, the only way to guarantee a switch from tpTimeCritical to something else is for a time-critical thread to do it itself. In my program, when creating the threads, I appoint one of them "foreman": it switches its own priority and, in sync with it, the priorities of all the others (a good foreman: he both leads and works). To help the foreman thread decide when and to which priority to switch, I pass it the necessary parameters when creating it:
pr1 - priority value 1
pr2 - priority value 2
P1 - the probability of switching to priority 1
P2 - the probability of switching to priority 2
Tmax - the maximum time interval after which the priority switch is rolled again
Clearly, P1 + P2 = 100%. If we take pr1 to be the time-critical priority and pr2 to be the normal one, we get a kind of "controlled chaos": a series of almost complete freezes of the computer with a given distribution.
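The foreman's decision logic can be sketched roughly as follows. This is a Python simulation, not the Delphi code from the utility: it only produces a schedule of priority segments and does not touch real thread priorities. The function name priority_schedule, the uniform choice of the interval between 1 ms and Tmax, and the sample values are assumptions made for illustration.

```python
import random

def priority_schedule(pr1, pr2, p1_percent, t_max_ms, total_ms, rng):
    """Return a list of (duration_ms, priority) segments covering total_ms."""
    segments = []
    elapsed = 0
    while elapsed < total_ms:
        # Assumption: the time until the next roll is uniform in [1, Tmax].
        duration = min(rng.randint(1, t_max_ms), total_ms - elapsed)
        # P1 percent of the rolls pick pr1; the remaining P2 = 100 - P1 pick pr2.
        priority = pr1 if rng.random() * 100 < p1_percent else pr2
        segments.append((duration, priority))
        elapsed += duration
    return segments

if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    schedule = priority_schedule("tpTimeCritical", "tpNormal", 30, 2000, 10000, rng)
    for duration, priority in schedule:
        print(f"{duration:5d} ms at {priority}")
```

In a real load generator, each segment would correspond to the foreman setting the chosen priority on itself and on the other worker threads, then working for the given duration before rolling again.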
All that remains is to somehow verify and record all this. So…
We wreak havoc and observe
I used something like a timeline:
On an ordinary timer with a 100 ms interval, the contents of this strip are shifted to the left by M pixels, where M is the number of hundreds of milliseconds elapsed since the previous tick. The space freed on the right is filled with red, and the rightmost column of pixels is filled with a shade of green. Thus, if there are no delays, i.e. M = 1, the strip content slides 1 pixel to the left every 100 ms and a green bar is drawn on the right. When significant delays appear, M is greater than 1 and we begin to see red stripes whose width is proportional to the delay. The numbers to the right of the strip show the ratio of delay time to normal operation time over the last few seconds. The video illustrates the whole process:
In the "red zones" in the video you can see that the mouse barely works. I will add what the video does not show: music and video stop playing, programs do not launch, and so on.
Now all that remains is to make such a strip for each processor core separately, and then we can observe, take screenshots, and evaluate how a given program causes delays on a particular core or even on the entire processor. To do this, we create as many threads as there are cores in the processor. We force the system to run them on different cores using the API:
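The update rule for the strip can be sketched in Python, with a list of one-character cells standing in for pixel columns. The names advance and delay_ratio, the "." cell for not-yet-drawn columns, and the use of integer division to get M are assumptions of this sketch, not details of the original utility.

```python
RED, GREEN, BLANK = "R", "G", "."

def advance(strip, elapsed_ms, tick_ms=100):
    """Shift the strip left by M cells and paint the freed cells."""
    # M = number of whole ticks elapsed since the previous update (at least 1).
    m = max(1, int(elapsed_ms // tick_ms))
    m = min(m, len(strip))
    # Freed space on the right is red, then the rightmost cell becomes green.
    shifted = strip[m:] + [RED] * m
    shifted[-1] = GREEN
    return shifted

def delay_ratio(strip):
    """Ratio of delayed (red) cells to normal (green) cells in the window."""
    green = strip.count(GREEN)
    return strip.count(RED) / green if green else 0.0

if __name__ == "__main__":
    strip = [BLANK] * 12
    for elapsed in (100, 100, 400, 100, 250):  # made-up tick intervals in ms
        strip = advance(strip, elapsed)
        print("".join(strip), f"ratio={delay_ratio(strip):.2f}")
```

A tick that arrives on time adds one green cell; a tick that arrives 400 ms after the previous one adds three red cells followed by one green cell.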
SetThreadAffinityMask(handle,mask);
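The mask argument is a bit mask in which bit i allows the thread to run on logical processor i, so pinning one thread per core means giving each thread a mask with a single bit set. A minimal Python sketch of that calculation follows; the helper name affinity_masks is hypothetical, and the sketch only computes the masks without calling the Windows API.

```python
import os

def affinity_masks(core_count):
    """One single-bit mask per core: core 0 -> 1, core 1 -> 2, core 2 -> 4, ..."""
    return [1 << core for core in range(core_count)]

if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    for core, mask in enumerate(affinity_masks(cores)):
        print(f"thread {core}: mask = {mask:#x}")
```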
Each such thread keeps its own strip in memory and draws it there according to the algorithm above. Sensitivity to delays can be increased here by reducing the time between ticks. In the main program, on a separate timer, all these strips are displayed stacked one above the other. The result can be seen in the video:
To assess how much a program under test delays the work of other programs, in my particular case a few subjective judgments about the pictures of these strips were enough, in the style of: "The strips are mostly green for all cores, so program A does not interfere with other programs, whereas for program B the strips are half red, which cannot but be alarming."
However, it is worth noting that a numerical evaluation of the delays a program introduces into the system requires indicators such as the distribution function of the delay time, the expected value (mean) of the delay, and the variance of the delay. These indicators are what will let us avoid subjective judgments and formulate clear requirements for programs and algorithms. If the respected Habr community finds such information useful, I will be happy to make it the subject of my next post.
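As a rough illustration of those indicators, here is a Python sketch that computes the sample mean, the variance, and an empirical distribution function from a list of measured delays. The function names and the sample values are invented for the example; the post does not say how the author would compute them.

```python
import statistics

def delay_stats(delays_ms):
    """Return (mean, variance) of the observed delays."""
    mean = statistics.fmean(delays_ms)
    variance = statistics.pvariance(delays_ms)  # population variance
    return mean, variance

def empirical_cdf(delays_ms, t_ms):
    """Fraction of observed delays that are less than or equal to t_ms."""
    return sum(1 for d in delays_ms if d <= t_ms) / len(delays_ms)

if __name__ == "__main__":
    delays = [0, 0, 100, 300]  # made-up per-tick delays in milliseconds
    mean, variance = delay_stats(delays)
    print(f"mean = {mean} ms, variance = {variance} ms^2")
    print(f"F(100 ms) = {empirical_cdf(delays, 100)}")
```

With indicators like these, a requirement can be stated numerically, for example as an upper bound on the share of ticks delayed by more than a given time.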
I hope this text will be interesting to someone, and if it turns out to be useful to someone, that will make me completely happy! ;-)
P.S. I'm new here. I honestly looked through all the sections of the site and did not find a more suitable place to post. I would be grateful for advice.
UPD: Thanks to Habr readers for the links to similar solutions and tools. Here is a list of them: