
Performance testing: pitfalls

I work on building highly loaded applications for stock trading: loaded both in data volume and in the number of users and requests. Naturally, for such applications performance is of paramount importance, and so is performance testing.

Watching this testing from the sidelines, I have accumulated some observations that I think will be of interest.

Pitfall 1. The conversion factor


Testing such applications requires deploying a whole network of test machines, which effectively forms a test cluster. Trying to build such a cluster out of machines physically located in the development center amounts to creating your own data center, with all the costs that implies. A good alternative is to use services like Amazon Web Services.

The natural desire to save on host rental or hardware purchases leads to choosing machines with characteristics far below those of the production installation. Lower by several times, mind you. This is where the conversion factor between synthetic performance indices comes into play. Say, the processor in production is 2 times faster, it has 4 times as many cores, 6 times as much RAM, a disk subsystem 3.5 times faster, and a network 100 times faster. We add these up, divide by the number of indicators, multiply by some correction factor, and obtain the conversion factor by which we then multiply the results of performance testing. One can come up with a more elaborate formula, for example by assigning a weight to each indicator.
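To make the arithmetic concrete, here is a minimal Java sketch of the naive calculation described above. The hardware ratios match the example, while the weights and the 0.8 correction factor are arbitrary illustrative assumptions, not values from any real project.

```java
// A minimal sketch of the naive conversion-factor calculation described above.
// The weights and the correction factor are illustrative assumptions.
public class ConversionFactor {

    static double conversionFactor(double[] ratios, double[] weights, double correction) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int i = 0; i < ratios.length; i++) {
            weightedSum += ratios[i] * weights[i];
            weightTotal += weights[i];
        }
        return (weightedSum / weightTotal) * correction;
    }

    public static void main(String[] args) {
        // production-to-test ratios: CPU speed, cores, RAM, disk, network
        double[] ratios  = {2.0, 4.0, 6.0, 3.5, 100.0};
        // equal weights reproduce the simple average; any other weighting is just as arbitrary
        double[] weights = {1.0, 1.0, 1.0, 1.0, 1.0};
        double factor = conversionFactor(ratios, weights, 0.8); // 0.8 is a made-up correction
        System.out.printf("Estimated conversion factor: %.1f%n", factor);
    }
}
```

Notice how the single 100x network ratio dominates the simple average; that is exactly the arbitrariness the rest of this section is about.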
On closer examination, this approach is only good for preparing test suites for later runs on installations close to production, and for catching the most obvious performance problems. (Which is already quite a lot, and quite important.) Why? If nothing else, because this approach completely ignores the effect of bottlenecks.

An example from real life. Tests ran on one host, the application under test on another. The test suite included requests returning different amounts of data in the response. Requests returning relatively little data gave satisfactory results, while requests returning large responses did not. Meanwhile, the host of the application under test was nowhere near overloaded. The conclusion suggests itself: the application under test copes poorly with requests that return a lot of data, fails to use all the machine's resources, and needs a redesign. And in reality? It turned out that the slow network made the transfer of responses take a long time, which hit the large responses especially hard and thus simply prevented the tests from putting any real load on the application under test.
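For a sense of scale, here is a rough, purely illustrative estimate of how much the transfer time of a large response depends on link speed. The response sizes and bandwidths below are assumed numbers, not figures from the actual setup.

```java
// A back-of-the-envelope estimate of response transfer time over links of
// different speeds. All sizes and bandwidths are assumptions for illustration.
public class TransferTime {

    static double transferSeconds(double responseMegabytes, double linkMbitPerSec) {
        return (responseMegabytes * 8.0) / linkMbitPerSec;
    }

    public static void main(String[] args) {
        double smallResponse = 0.05;  // MB
        double largeResponse = 50.0;  // MB
        double testLink = 100.0;      // Mbit/s, assumed test installation network
        double prodLink = 10_000.0;   // Mbit/s, assumed production network

        System.out.printf("small response, test net: %.3f s%n", transferSeconds(smallResponse, testLink));
        System.out.printf("large response, test net: %.3f s%n", transferSeconds(largeResponse, testLink));
        System.out.printf("large response, prod net: %.3f s%n", transferSeconds(largeResponse, prodLink));
    }
}
```

With these assumed numbers the large response spends about 4 seconds on the wire of the test network and about 0.04 seconds in production, which is enough on its own to keep the application under test idle.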

This is an example of a bottleneck that appears on the test (slow) installation and is simply impossible in production. Let's call it a “downstream” bottleneck. In any case, a “downstream” bottleneck is not dangerous; it only wastes the tester's and developer's time.

But can the opposite happen? Is an “ascending” bottleneck possible, one that is genuinely dangerous and can cause serious trouble not only for the developer and tester but also for the client? That is, imagine that the figures achieved on the test installation fully satisfy us. Say the conversion factor is 5, we need to sustain 100,000 operations per second, and the test installation delivered 25,000. Everything looks fine, so can we sleep soundly? Not so fast! In exactly the same way, such a situation can hide a bottleneck that goes undetected (and is fundamentally undetectable!) on the test installation, because of which the real, effective conversion factor turns out to be not 5 but 3. That is, not 125,000 operations per second but only 75,000, which is 25% short of the requirement.
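The arithmetic can be written out as a tiny check: with the assumed factor of 5 the test result looks safe, but with an effective factor of 3 the same measurement falls well short of the target.

```java
// A minimal check of the numbers above: the same test result is "safe" or
// "failing" depending on which conversion factor turns out to be real.
public class EffectiveFactor {
    public static void main(String[] args) {
        double required = 100_000;   // operations per second needed in production
        double measured = 25_000;    // achieved on the test installation

        double assumedFactor = 5.0;  // factor derived from synthetic indices
        double effectiveFactor = 3.0; // factor after an "ascending" bottleneck kicks in

        System.out.printf("projected with factor %.0f: %.0f ops/s%n",
                assumedFactor, measured * assumedFactor);        // 125,000 - looks fine
        System.out.printf("actual with factor %.0f: %.0f ops/s (%.0f%% below target)%n",
                effectiveFactor, measured * effectiveFactor,
                (required - measured * effectiveFactor) / required * 100); // 75,000, 25% short
    }
}
```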

And “ascending” bottlenecks have plenty of room to hide: we build the performance conversion factor from synthetic indices in which the weight of each indicator is chosen almost arbitrarily. Underestimate the weight of just one indicator, and...

Pitfall 2. The testing application


Performance testing requires generating load: load from many hosts, from many user accounts, sending a large number of different requests. There is a natural desire to automate all of this, either by writing your own utility or by picking up a ready-made one, of which, fortunately, there are plenty.

But while it creates a large load on the application under test, the testing application is itself under load. This is not entirely obvious, but it is so. Consider an analogy: we all know that any measuring device affects the process being measured and therefore inevitably introduces an error. In the same way, the implementation details of the tests affect the test results, which is not always taken into account. For testers it is natural to suspect the application under test first (any application, really), not the tests themselves. I do not mean trivial bugs in the tests; I mean the non-obvious impact of the tests on the testing process.

For example: a testing application sends thousands of requests per second, and it turns out that each subsequent request is slower than the previous one. So the system cannot withstand the load, delays grow, the request-processing queue builds up? In fact, it turns out that the testing application (written in Java) allocates some memory for every request and response, and the more requests and responses accumulate, the slower memory allocation becomes in it, and the fewer requests the testing application can send per unit of time.
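Here is an illustrative sketch (not the author's actual tool) of how a load generator can become its own bottleneck: if it keeps every response in memory, garbage-collection pressure grows with each batch, so each batch of requests takes longer than the previous one even though the system under test is not the limiting factor.

```java
import java.util.ArrayList;
import java.util.List;

// An illustrative load generator that slows itself down: retained responses
// accumulate, GC pressure grows, and later batches take longer than earlier ones.
public class LeakyLoadGenerator {

    private static final List<byte[]> responses = new ArrayList<>();

    // stand-in for sending a request and receiving a response body
    private static byte[] fakeRequest() {
        return new byte[64 * 1024]; // pretend each response is 64 KB
    }

    public static void main(String[] args) {
        for (int batch = 1; batch <= 10; batch++) {
            long start = System.nanoTime();
            for (int i = 0; i < 1_000; i++) {
                responses.add(fakeRequest()); // the bug: responses are never released
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("batch %2d: %d ms for 1000 requests%n", batch, elapsedMs);
        }
    }
}
```

The lesson is not about this particular bug but about the habit: profile and sanity-check the load generator itself before blaming the application under test.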

Pitfall 3. The black box


The most common approach to testing is to treat the application under test as a black box. Testers do not dig into implementation details and do not take into account the peculiarities and interrelationships of the application's individual components.

This approach is good in the general case. However, the preconditions of that general case include a large (practically unlimited) amount of time for testing, which almost never happens in reality. On the contrary, in most cases testers lag behind developers, if only because the developers are endlessly reworking something.

In these circumstances it is important to minimize the time spent on testing without, of course, reducing its quality. Solving this requires close interaction between testers and developers, in which the developers, who naturally know their application, can point the testers in advance to weak spots, likely bottlenecks, and so on. In doing so, of course, the purity of the black-box approach to the application under test is lost. But, first, this is justified, and second, not all the tests have to be done in collaboration with the developers.

This article describes three pitfalls. Perhaps someone would like to share other non-obvious performance testing issues. I look forward to your feedback!

UPD. This article is about non-obvious, below-the-surface methodological errors in performance testing. If you feel a burning desire to brand and condemn these mistakes, there is no need: I agree with you in advance. It would be more interesting if you shared other errors of this kind.

Source: https://habr.com/ru/post/168651/
