Load testing of web systems. Continuing the preparation

The article describes a number of points (number of connections and execution sequence, third-party resources in the script, grouping of requests) that deserve attention when preparing a high-load test of a system with a web interface.

Here I propose to consider several more script configuration elements that may affect the results of your test.

I want to describe why using the real HTTP methods is more important than using their “fast” counterparts. We will also touch on the need to check the received data and on building regular expressions that extract values from it.

Getting responses with real data size

In my opinion, at least in the final testing of a web system, it is necessary to use the same requests that the browser sends. If the browser sent a GET request, then we should simulate exactly that and not replace it with, for example, HEAD. The main difference between these methods is that GET receives the body of the response, while HEAD receives only the headers. It may seem pointless to download "unnecessary" data such as pictures, CSS, or fonts, but practice shows that they matter no less.
Compare the two types of requests for the same test resource.
HEAD

GET

In the pictures you can see that the GET request takes many times longer to execute than HEAD. Consequently, the server spent longer sending the data for this request and could not serve the next one in the meantime. With one user this difference does not seem significant, but with 1,000 virtual users the server will spend noticeably more time serving each of them.

For example, suppose our web server is configured so that it can handle only 100 simultaneous connections. The first 100 users connect and each works for at least 3 seconds. Meanwhile, the other 900 wait for a connection. As soon as someone from the first hundred finishes, the freed connection goes to the next user in line. That is, the thousandth user in our example will be able to start working with this request only after approximately 27 seconds (nine groups of 100 users ahead of them, 3 seconds each). If we used the HEAD method, the thousandth user would get access to the system in about 2 seconds. (These calculations are extremely rough.)
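The rough estimate above can be reproduced with a few lines of Python. This is only a sketch of the same arithmetic: the function name `start_delay` is hypothetical, and the HEAD duration of 0.2 seconds is an assumption picked to match the "about 2 seconds" figure, since the text does not state it.

```python
def start_delay(user_index, max_connections, request_seconds):
    """Rough wait before user number `user_index` (1-based) gets a connection.

    Assumes users are served in groups of `max_connections` and that every
    request in a group takes the same `request_seconds`.
    """
    groups_ahead = (user_index - 1) // max_connections
    return groups_ahead * request_seconds


# GET: about 3 seconds per request, as in the example above.
get_wait = start_delay(1000, 100, 3.0)

# HEAD: 0.2 seconds per request is an assumed value, not a measurement.
head_wait = start_delay(1000, 100, 0.2)

print(f"GET:  user 1000 starts after ~{get_wait:.0f} s")
print(f"HEAD: user 1000 starts after ~{head_wait:.1f} s")
```

Real servers do not release connections in neat groups, so the numbers only illustrate the order of magnitude.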

As a result, we see how using the "correct" method to access the server shows the real load on it.

The example used in the pictures is entirely synthetic, chosen to make the difference in request execution time more visible. You may not have such large responses. But even with responses of 100 to 200 kilobytes and 5,000 users, we can observe significant slowdowns in the operation of the web server.

Verification of received data

Many will say that verifying the received data is functional testing rather than load testing, and they will be partly right. In practice, however, there are situations when the functional part of a web system stops working correctly under heavy load. For example, a web application may fail to process incoming data properly.

I have encountered situations where the web system reported success by sending a 200 OK response, but the response body was empty. After a deep study of the system, we found out that this was intended behavior. That is, the success of the request could only be determined by the presence of content in the response.

It turns out that the only option to check the correct operation of the system under high load is to control the received data, and not just the status of receiving a response.
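A minimal sketch of such a check in Python is shown below. The function name `check_response`, the sample bodies, and the `"orderId"` pattern are all hypothetical; a real load testing tool would apply an equivalent assertion to each response it receives.

```python
import re


def check_response(status, body, expected_pattern):
    """Return True only if the status is 200 AND the body has the expected content."""
    if status != 200:
        return False
    # A 200 OK with an empty body is treated as a failure.
    if not body or not body.strip():
        return False
    return re.search(expected_pattern, body) is not None


# Hypothetical responses for the same request.
ok = check_response(200, '{"orderId": 1542}', r'"orderId":\s*\d+')
empty = check_response(200, "", r'"orderId":\s*\d+')
wrong = check_response(200, '{"error": "busy"}', r'"orderId":\s*\d+')

print(ok, empty, wrong)
```

Only the first response passes, although all three arrived with status 200.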

Today, with the intensive development of dynamic interaction with web systems (AJAX, WebSocket, Flash, Java, etc.), we can receive different content for the same request, so you need to be sure that the text of the response is correct.

Building “correct” regular expressions

In many load testing utilities, regular expressions are the way to verify the correctness of the received data. We all know what they are, but what are “correct” regular expressions? They are expressions that take as little time as possible to find the required value.

There are many examples and articles on the Internet about the poor performance of certain regular expressions or of the engines that run them. They describe what lazy, greedy, and possessive quantifiers are, why and when to use groupings and alternations, and how to perform instantiation to increase the speed of the regular expression engine. Those who find this important and interesting will be able to find the information.

I want to demonstrate why you need to build them correctly. Take a certain sequence of requests that had some delays between them at the time of recording.

In the picture, the delays are indicated by green arrows. When executing the script, the utility must simulate the delays between requests, thereby reproducing the load of a “real” user.

From the previous section we found out that it is necessary to check the received data. We do this with regular expressions. So, for some of the requests, we compose the expressions and start checking.

The checks take time to search through the data. As a result, we get roughly the following:

The red arrows show the time spent on evaluating the regular expressions.

As a result, the overall runtime of the script already deviates slightly from what it would be if a single real user had performed it. If we consider that each virtual user must perform such checks, and that we have 1,000 of them running on the same computer, the processing time will increase several times over. Each virtual user takes up operating system resources for its calculations, and while those resources are occupied by some users, others cannot access them.

The more precise the regular expression, the less time is spent on processing it, and the more realistic the load on the system with a large number of virtual users.
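The difference can be illustrated with a small Python comparison. It is a sketch under stated assumptions: the response body, the `token` field, and both patterns are made up for the example, and the timings depend on the machine and the regular expression engine, so no specific numbers are claimed.

```python
import re
import timeit

# Hypothetical response body: the value we need sits near the start,
# followed by a lot of unrelated markup.
body = '<input name="token" value="abc123">' + "<div>filler</div>" * 20000

# Loose pattern: each greedy ".*" runs to the end of the text and then
# backtracks, so the engine walks the whole body several times.
loose = re.compile(r'.*name="token".*value="(.*?)"', re.DOTALL)

# Tighter pattern: built from literal text, and the captured value
# cannot run past the closing quote.
tight = re.compile(r'name="token" value="([^"]*)"')

loose_value = loose.search(body).group(1)
tight_value = tight.search(body).group(1)

# Time 20 searches with each pattern on the same body.
loose_time = timeit.timeit(lambda: loose.search(body), number=20)
tight_time = timeit.timeit(lambda: tight.search(body), number=20)

print(loose_value, tight_value)
print(f"loose: {loose_time:.4f}s, tight: {tight_time:.4f}s for 20 searches")
```

Both patterns extract the same value, but the loose one has to scan the entire body to do so, and that cost is paid by every virtual user on every checked response.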

Of course, many tests do not need regular expressions at all, let alone their “correct” version. But if you want to bring your test closer to realistic conditions of execution by virtual users, you must not forget about the speed of such auxiliary functions.

Conclusion

We considered three factors that can distort load testing results. Of course, not all of the situations described will affect parameters such as server connection time, server response time, and the other characteristics you measure.

But in most cases we are interested in how many real users can execute a particular scenario, how quickly the server will respond, and how much time each user will spend executing the scenario. It is exactly these parameters that the points reviewed here can affect.

The server's response time to a single request can be a few milliseconds. With a larger volume of received data and the time needed to process it, the execution of the entire script can stretch to several minutes, which is sometimes unacceptable for the customers of such a test.

Update: we finish the preparation in the article

Source: https://habr.com/ru/post/333504/
