
Load testing: where to start and where to look

You probably know that an application or service behaves very differently depending on how many users are hitting it. What worked during development can fall apart when the first real users arrive with their environments, and what worked for a hundred users can die when there are 10,000 of them. Or you test everything on artificial data, and then your database starts to slow down because of a user named İnari.

We talked with Alexey Lavrenyuk (Yandex) and Vladimir Sitnikov (Netcracker) about how bugs survive into production, when load tests should become part of a project, where to get data for them, and whether you can skip testing altogether and just roll everything out to production right away.



- Tell us a few words about yourself and your work. How is your job related to testing?

Alexey Lavrenyuk: I am a developer at Yandex, on the load testing team. I build tools and services for performance testing. Take a look at our open-source load testing tools, Yandex.Tank and Pandora, and at our load testing service, Overload, whose beta is now publicly available.

Previously we could only test the performance of server applications; now we are moving into mobile applications, which is a new direction. For example, we measure the power consumption of phones. You can read about it here.

Vladimir Sitnikov: I have been working at Netcracker for 12 years. I am now a performance engineer, which means I don't test products myself; I observe how the system behaves in real operation and in tests. My job is to analyze how particular development and design decisions affect the end result. So I often not only look at test results but also plan the tests, mostly load tests.

I used to do testing myself, but that is in the past now.

- Let's talk a little theory. Is it fair to say that problems surface at the testing stage because developers fail to account for some parameter of the problem or pick an unsuitable solution?

Alexey Lavrenyuk: It is impossible to foresee everything in advance. A typical service has a lot of knobs: to solve a product problem you can choose different algorithms, tune their parameters, pick libraries and frameworks, adjust production settings and hardware. Together these knobs give billions of combinations, and without a deep understanding of how the service works you can spend a very long time going through them. And you cannot get that understanding without a tool for running experiments and analyzing their results.

Load testing is not a club for hitting developers over the head when they write inefficient code. It is a very powerful measuring instrument, essentially what an oscilloscope is to an electronics engineer: it lets you probe the code and find its bottlenecks, and then, after optimization, see (and demonstrate) the result in numbers and graphs.

Vladimir Sitnikov: Let me add something. “Where do bugs come from in the first place” is an age-old question. It is closely related to another interesting one: why, despite all the testing, do bugs survive into real systems and only show up there? Why don't we find them during testing? Are we framing the task wrong?

In my opinion, it is because the development and support teams are different. Some people build a solution, and completely different people support it in operation.

Half of all production problems are a combination of two or three silly mistakes or design assumptions: someone wrote the wrong code, picked a poor algorithm, and so on. Each such mistake by itself does not affect performance; modern hardware is powerful enough to digest a lot. But two or three mistakes together “fire”. And because development and support are separated, the authors of those mistakes never hear about them.

It is rare that anyone actually tracks down the authors of the code and says: “Please don't write it like that anymore.” But to be sure a developer won't make such mistakes in the future, they need to go through that experience themselves. If developers supported their own systems, they would draw conclusions from their bruises much faster.

- Does early load testing (at the development stage) make sense?

Alexey Lavrenyuk: The most expensive errors show up when load testing is bolted onto a finished project (I'm not even talking about cases when the code is tested in production). The code is written, it works functionally, but it turns out that where a mere 10 responses per second are required, the service can digest only one, and even that with a groan. It is even worse when the foundation is to blame: a framework chosen because “everyone uses it” or “well, it's so new and cool.” Then you have to rewrite everything.

The sooner problems are caught, the easier it is to solve them.

- So we start development and launch testing at the same time?

Vladimir Sitnikov: Yes and no. Sometimes at an early stage you can take a rough look at what is happening, and that may make sense. But as a rule the code does not work at first, and measuring the performance of code that, say, returns the wrong answer is a thankless job. You spend time and resources (including machine time), and the result tells you nothing.

That said, getting testing started is quite a long story. Before measuring anything, you need to understand what to measure and on which data, whether that data will run out during the test and how it will be replenished, which metrics to monitor, what expectations you have for the results, and so on. In a very small project all this can be settled in a day or two, just by discussing it and making the necessary decisions. But in a more or less large project such questions are not resolved in a day, especially since they require close interaction with the customer.

Sometimes the customer comes with specific requirements from the very beginning, but more often they have to be formulated only when load testing is being brought into the project.

So you should think about load testing from the very beginning. And I would not treat load testing as an end product; it is a process you follow to avoid certain mistakes. As the solution takes shape (moving from development to testing and then to production), the purpose of load testing changes as well: first it is used for debugging, then for finding errors, and at the end as a criterion that the solution is ready for production.

An analogy with ordinary testing seems appropriate here. When should it be brought in? Not at the end of development. The idea that testers will come at the end and fix everything is a myth: testing does not fix errors, it finds them. And load testing is a tool for finding errors of a particular kind. But unlike conventional testing, which essentially only checks whether the program can follow a given branch of the algorithm to a certain point, a load test is a dress rehearsal for production, which brings in differences in data, in load, in the mix of scenarios, and so on.

To find errors successfully, you need criteria to search by. So as a process, load testing starts closer to the beginning of development, but in practice it really kicks off after the first testing cycle (automated or manual), when someone signs off that the system as a whole behaves correctly.

- Clearly you cannot test absolutely everything. What should be load tested first?

Alexey Lavrenyuk: First of all, test the critical scenario, the one that makes money. And run at least two types of tests: a stress test, to find the limits of performance, and a timing test, to make sure the service fits within the SLA. In other words, you need to push the service until it breaks, and also measure response times at the level of load expected in production.
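To make the two test types a bit more concrete, here is a minimal Python sketch; the endpoint URL, the SLA threshold, and the load steps are invented assumptions, not from the interview. A real harness such as Yandex.Tank, Pandora, or JMeter generates load far more precisely, but the structure is the same: ramp up until the error rate explodes, then measure timings at the expected production load.

```python
# Minimal sketch of the two test types: stress (find the limit) and timing (check the SLA).
# URL, SLA value, and load steps are hypothetical.
import time
import statistics
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:8080/checkout"   # hypothetical critical scenario
SLA_P95_MS = 500                         # hypothetical SLA threshold

def one_request():
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(URL, timeout=5) as resp:
            ok = resp.status == 200
    except Exception:
        ok = False
    return ok, (time.perf_counter() - start) * 1000  # latency in ms

def run_step(rps, seconds=10):
    """Fire roughly `rps` requests per second for `seconds` and collect results."""
    with ThreadPoolExecutor(max_workers=rps * 2) as pool:
        futures = []
        for _ in range(seconds):
            futures += [pool.submit(one_request) for _ in range(rps)]
            time.sleep(1)
        results = [f.result() for f in futures]
    error_rate = sum(1 for ok, _ in results if not ok) / len(results)
    p95 = statistics.quantiles([ms for _, ms in results], n=20)[-1]
    return error_rate, p95

# Stress test: ramp the load until the error rate explodes -- that is the limit.
for rps in (10, 20, 50, 100, 200):
    error_rate, p95 = run_step(rps)
    print(f"{rps} rps: {error_rate:.1%} errors, p95={p95:.0f} ms")
    if error_rate > 0.05:
        print(f"service breaks down around {rps} rps")
        break

# Timing test: at the assumed production-level load, check the SLA.
error_rate, p95 = run_step(rps=50)
print("SLA met" if p95 <= SLA_P95_MS else f"SLA violated: p95={p95:.0f} ms")
```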

- How do you start load testing?

Alexey Lavrenyuk: Before you start load testing, make sure you have run functional tests and fixed all the bugs, and do it on your own test rig. Make sure that in the middle of your load shooting nobody is going to come to that rig and download a couple of hundred gigabytes. In short, prepare a convenient test environment where nobody will disturb you.

Vladimir Sitnikov: Testing itself should start with formulating non-functional requirements, that is, performance and stability requirements.

The most typical non-functional requirements (the key quantitative characteristics of an application) are the size of the data being processed, the time it takes to process it, and how often the processing runs. Each of these metrics shows up in almost every project.

The concept of "non-functional requirements" is extensive. In different projects, their components may have different priorities. For example, “the system should be able to work for 3 days without rebooting” or “in case of loss and restoration of communication with the database, the application should return to normal operation in no more than 2 minutes” - these are also non-functional requirements.

By the way, the term “non-functional requirements” suggests something standalone, but in fact these requirements describe how the functionality should work. Without non-functional requirements the functional ones lose their meaning, and you cannot simply attach an NFR to an arbitrarily chosen functional requirement either.
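One way to keep such requirements from staying vague is to write them down as explicit, machine-checkable thresholds next to the load test. A minimal sketch follows; all metric names and numbers are invented placeholders, not requirements from the interview.

```python
# Sketch of turning non-functional requirements into checkable thresholds.
# Metric names and numbers below are hypothetical examples.
from dataclasses import dataclass

@dataclass
class NonFunctionalRequirement:
    name: str
    threshold: float
    unit: str
    lower_is_better: bool = True  # True for latency-style metrics, False for throughput-style

    def check(self, measured: float) -> bool:
        ok = measured <= self.threshold if self.lower_is_better else measured >= self.threshold
        status = "OK" if ok else "FAIL"
        print(f"[{status}] {self.name}: measured {measured} {self.unit}, "
              f"threshold {self.threshold} {self.unit}")
        return ok

requirements = [
    NonFunctionalRequirement("p95 processing time", 500, "ms"),
    NonFunctionalRequirement("rows handled per run", 1_000_000, "rows", lower_is_better=False),
    NonFunctionalRequirement("recovery after DB reconnect", 120, "s"),
]

# Measured values would come from an actual load test run; these are placeholders.
measurements = {"p95 processing time": 430,
                "rows handled per run": 1_200_000,
                "recovery after DB reconnect": 95}

assert all(r.check(measurements[r.name]) for r in requirements), "NFRs violated"
```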

- Are there any general recommendations for conducting load testing?

Vladimir Sitnikov: In general, load testing is similar to ordinary testing. We take the most important scenarios, or the ones we fear the most. So someone who understands the project has to come and point out the most dangerous scenarios; after that it is a matter of automation and measurement.

- How important is the interpretation of load testing results?

Vladimir Sitnikov: Extremely important. What matters is not the numbers themselves but understanding why they came out that way. If we got 42, that alone does not make the result good. Say the customer asked for no more than a minute and we managed half a minute. Aren't we great? No! It is important to understand why we could not go faster and what we are running up against. And you have to be sure the report reflects reality.

Here is an example. We were measuring how a virtual machine affects application performance, that is, comparing an application running on bare metal with the same application on a VM. We took measurements and got good numbers, close to expectations: a performance difference of a few tenths of a percent. It could have ended there. But someone looked more closely at the results and realized that instead of exercising the application, every request was returning a login error page. The tests never went through the real application; under the guise of each step we were measuring the speed of a login that was rejected because of a wrong password.

What does this example show? That without analyzing what was actually happening inside during the tests, we would have drawn the wrong conclusion. So in load testing it is not the numbers you obtain that matter, but an understanding of what explains them.
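The practical takeaway is that a load script should validate what it gets back, not just time it (JMeter's response assertions serve the same purpose). A minimal sketch of such a check; the URL and the page markers are invented for illustration.

```python
# Sketch of the lesson from the VM example: validate responses before trusting timings.
# The URL and the markers of a "real" vs "error" page are hypothetical.
import time
import urllib.request

URL = "http://testhost:8080/app/report"  # hypothetical scenario step

def timed_request(url):
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        status = resp.status
    return status, body, (time.perf_counter() - start) * 1000

def looks_valid(status, body):
    # A timing is only meaningful if the response is the page we meant to measure.
    if status != 200:
        return False
    if "Login failed" in body:
        return False
    return "Report generated" in body   # hypothetical marker of the real page

status, body, ms = timed_request(URL)
if looks_valid(status, body):
    print(f"valid response in {ms:.0f} ms")
else:
    # Discard the sample: fast error pages would silently skew the statistics.
    print(f"invalid response (HTTP {status}) -- sample discarded")
```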

Alexey Lavrenyuk: Very often people underestimate the importance of graphs and look only at the final summary statistics. If you do that too, I recommend googling Anscombe's quartet and looking at this article from Autodesk.

Many popular load tools, ab for example, give only summary statistics and a false sense of confidence in how the service performs under load. They hide the details, the dips you would see on a graph. Such a dip can cost money (a customer walked away), and it can be very cheap to fix (tweak the garbage collector settings).
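As a toy illustration of what an aggregate hides (all latencies here are synthetic, generated on the spot), the sketch below simulates a short GC-style stall: the mean barely moves, while a per-second timeline, which is what a graph would show, makes the dip obvious.

```python
# Synthetic demo: a short stall is almost invisible in the mean but obvious per second.
import random
import statistics

random.seed(42)

# 120 "seconds" of latencies (ms): ~100 ms baseline, with a 2-second stall at second 40.
timeline = []
for second in range(120):
    base = 3000 if 40 <= second < 42 else 100
    timeline.append([abs(random.gauss(base, base * 0.1)) for _ in range(50)])

flat = [ms for bucket in timeline for ms in bucket]
print(f"mean: {statistics.mean(flat):.0f} ms")                   # only modestly inflated
print(f"p95 : {statistics.quantiles(flat, n=20)[-1]:.0f} ms")    # still looks fine
print(f"p99 : {statistics.quantiles(flat, n=100)[-1]:.0f} ms")   # the stall shows up here

# The per-second breakdown is what a timeline graph would show.
for second, bucket in enumerate(timeline):
    avg = statistics.mean(bucket)
    if avg > 500:
        print(f"second {second}: avg {avg:.0f} ms  <-- every user in this window waited")
```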

On top of that, many popular and expensive load tools are architecturally flawed and lie to you. This is covered in this article.

- What is specific about load testing a web service? In theory you need to test not only the code but also the environment; how is that done?

Alexey Lavrenyuk: You need to test whatever matters to you. If you suspect that application performance depends on traffic jams in Moscow, find a way to test that. I am not joking: our geo services automatically start load tests on traffic data when congestion reaches level 7.

We also had a mobile application under test that only activated when the phone was in motion, so during the test the phone had to be shaken. We ended up carrying those phones around with us. There was even an idea to build a dedicated phone-shaking rig, but the application was sent back for rework, and for now the need for that automation has gone away.

- Where does the data for load testing a typical project come from? Do you use mocks or production data?

Vladimir Sitnikov: I would not say there is one typical approach to data; projects differ. There are systems that do not store data at all and are just intermediate links in a chain: to test them we simply generate the necessary input. Other systems, on the contrary, store something. If it is, say, a history of records, there are more or less no problems. But if the system stores real state (a database and so on), there are nuances.

At the initial stage, when we do not have any production dump, we have to generate data ourselves (discuss the business structure and produce the right amount of data of the right kind). Unfortunately, I would not call this a great success. Generated data lets you get by for a while, but producing data this way that resembles the real thing and suits different scenarios is hard. The games begin when we generate part of the data for one type of request, part for another, part for a third. Generating it in parts is convenient for us, but it introduces distortions: either there is too much data (because we want a reserve for every type of scenario), or we miss some scenarios. Because of this, sooner or later we move toward something closer to dumps: imports from external systems, or production data with masking.
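Here is a sketch of what that per-scenario generation tends to look like in practice; the entities, statuses, and volumes are made up, and the point is the built-in over-provisioning Vladimir mentions.

```python
# Sketch of per-scenario data generation and why it distorts the dataset:
# each scenario gets its own slice, sized with a safety margin.
# Entity names and proportions are invented for illustration.
import csv
import random
import uuid

random.seed(1)

SCENARIOS = {
    # scenario name -> (customer status, rows we think we need, safety factor)
    "activate_sim":  ("new",       10_000, 3),
    "change_tariff": ("active",    50_000, 3),
    "close_account": ("suspended",  2_000, 3),
}

with open("customers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["customer_id", "status", "balance", "scenario_hint"])
    for scenario, (status, rows, margin) in SCENARIOS.items():
        # The margin guarantees we won't run out mid-test, but inflates the volume --
        # one of the distortions that eventually pushes teams toward masked dumps.
        for _ in range(rows * margin):
            writer.writerow([uuid.uuid4(), status,
                             round(random.uniform(0, 500), 2), scenario])

print("generated", sum(r * m for _, r, m in SCENARIOS.values()), "rows")
```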

Import has its own problems. We work in the telecom segment, where production dumps cannot simply be copied around; certain kinds of data must not be copied at all. So we have to figure out how to keep the system looking like it works while the sensitive data has been replaced with asterisks and the like.
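A minimal sketch of that kind of masking is shown below; the field names and the phone format are invented, and real masking pipelines are far more careful about referential integrity and re-identification risk.

```python
# Sketch of masking: keep the shape of the data (so the system still behaves realistically)
# while destroying the real values. Field names and formats are hypothetical.
import hashlib

def mask_name(name: str) -> str:
    # Keep the first letter and the length so sorting and indexing stay plausible.
    return name[0] + "*" * (len(name) - 1) if name else name

def mask_msisdn(msisdn: str) -> str:
    # Deterministic: the same real number always maps to the same fake one,
    # which preserves joins between tables without exposing the original.
    digest = hashlib.sha256(msisdn.encode()).hexdigest()
    return "+7999" + str(int(digest, 16) % 10_000_000).zfill(7)

row = {"name": "İnari Example", "msisdn": "+79161234567", "tariff": "unlim-2017"}
masked = {"name": mask_name(row["name"]),
          "msisdn": mask_msisdn(row["msisdn"]),
          "tariff": row["tariff"]}   # non-sensitive fields pass through unchanged
print(masked)
```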

- Isn't it easier to test directly in production by routing some share of users to the new code?

Alexey Lavrenyuk: In our world that is not testing, that is a careful release. Ideally you should always roll out to a share of users, even after testing, because in production things can go differently than they did in testing.

How does switching on a share of users differ from load testing? In that setup you cannot fully control the load, you cannot push the service to its limit (users would simply get 500 errors), and you cannot twist its knobs with impunity or monitor it as closely as you could on test servers.

On the other hand, you can get creative: mirror part of the production traffic to your test servers and apply load tools there. That is already load testing, just with source data taken in real time from the production environment.
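A rough sketch of that idea: replay a sampled share of requests from an access log against a test host. The log format, sample rate, and host name below are assumptions; real setups usually mirror traffic at the proxy or replay a captured stream with a proper load tool.

```python
# Sketch: replay a sample of production GET requests against a test rig.
# Log path, log format, host, and sample rate are hypothetical.
import random
import urllib.request

TEST_HOST = "http://test-rig:8080"   # hypothetical test environment
SAMPLE_RATE = 0.1                    # replay ~10% of production requests

def replay(access_log_path: str):
    with open(access_log_path) as log:
        for line in log:
            # Expect a common-log-style line: ... "GET /path HTTP/1.1" ...
            parts = line.split('"')
            if len(parts) < 2:
                continue
            fields = parts[1].split()
            if len(fields) < 2 or fields[0] != "GET":
                continue   # only replay idempotent requests
            if random.random() > SAMPLE_RATE:
                continue
            path = fields[1]
            try:
                with urllib.request.urlopen(TEST_HOST + path, timeout=5) as resp:
                    print(resp.status, path)
            except Exception as exc:
                print("ERR", path, exc)

if __name__ == "__main__":
    replay("access.log")   # hypothetical path to a fresh slice of production logs
```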

In other words, if the answer “we can / cannot handle it” is enough for you (and you accept some probability of disappointing a share of your users), you can skip load testing. If you want to know the performance limits, the bottlenecks, which monitoring to watch, which knobs to grab first and whom to run to, then do the load testing anyway.

Vladimir Sitnikov: That is not our case. There are companies that happily test in production; they have a pool of clients they can redirect to a new version of the code. We are talking about systems that are critical for our customers' business: if such a system is down, the customer loses money. So load tests are run before the code goes into operation.

Moreover, in many cases these systems are installed and operated by completely different people, the customer's own staff. So our customers have no desire to experiment on a live system, and load tests are carried out before commissioning.

- Do you have any favorite tools for load testing?

Vladimir Sitnikov: Of course. We have three main tools: Apache JMeter for testing the server side, Selenium for testing in the browser, and JMH for benchmarking Java code. These cover the vast majority of our needs.


Source: https://habr.com/ru/post/329174/

