Load testing the "1C-Bitrix" CMS

Do you know the joke about the plane that has a bar, a swimming pool, and a restaurant on board, and just before departure the flight attendant says: “And now, with all this, we will try to take off”?

Web development is a bit like that airplane. The customer wants a cool design, lots of interactivity, and every delivery and payment service for the online store, and the studio is happy to build it all. But it is not clear whether the server has enough capacity to keep the site running stably.
To make the load predictable and to establish some reference values, we conducted load testing of “1C-Bitrix: Site Management” and “1C-Bitrix: Enterprise”.

We tried to design the test so that a client can see what is achievable on current hardware and a developer can see the project's prospects: will it be able to scale as the load grows?
In this article we describe how we organized and ran the testing and what conclusions we drew.

Why is this useful? The reference numbers make it possible to compare a current site with a potential new hosting platform, to measure the effect that new functionality has on the system, and simply to understand the system's technical limits.

We focus on how the testing was organized, the specific problems we ran into along the way, and the conclusions that can be drawn from the results. For those who want more detail, a detailed technical report is listed under the useful links at the end.

Problem statement

People understand the meaning, purpose, and objectives of load testing in different ways.

So first you need to answer for yourself: what do we want?

Which product to test, and how?
At first we wanted to take a real online store with its code and database. But that would be a test of that particular solution, and other developers could not use it as a reference point. So we needed to test the standard “boxed” product, filled with as many items as a large online store has (in our experience, about 100,000 SKUs). “1C-Bitrix” ran with the “Composite Cache” technology enabled: the saved page is first served quickly from the cache by nginx, and the dynamic data is then loaded with an AJAX request, so the user receives the page as quickly as possible. When estimating the number of requests per second, we counted only the dynamic requests.

What should be the server architecture?
A single-server configuration has to be tested: it is the simplest and the most popular. But it is also interesting to see how performance grows when a second server is added. Will it double? What happens with a cluster of four machines? Can the system scale horizontally by adding more and more servers to the cluster?

So we decided to test configurations of one, two, and four servers.

Which load scenarios to create?
Based on an analysis of the load on a number of large stores, online store traffic can be divided into three categories:

60% of users know which specific product they need and browse a few catalog pages.
37% of users choose a product from several candidates and apply filtering (the smart filter).
3% of users (a standard figure for good conversion) put a product in the basket and buy it.
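
As an illustration only (this is not the JMeter test plan used in the test), the traffic split above can be expressed as weighted random scenario selection. The scenario names and the `pick_scenarios` helper are hypothetical; only the 60/37/3 weights come from the text.

```python
import random
from collections import Counter

# Percent of users per scenario, taken from the traffic split above.
# The scenario names themselves are hypothetical labels.
SCENARIO_WEIGHTS = {
    "browse_catalog": 60,
    "filter_and_compare": 37,
    "checkout": 3,
}


def pick_scenarios(count, rng):
    """Return `count` scenario names drawn according to SCENARIO_WEIGHTS."""
    names = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=count)


# Example: check that a large sample roughly follows the target mix.
sample = pick_scenarios(100_000, random.Random(1))
shares = {name: n / len(sample) for name, n in Counter(sample).items()}
```

With a sample this large, each share in `shares` should land close to its target weight, which is a quick sanity check that the heavy checkout scenario is actually part of the mix.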

How to load the servers?
There are two main models of load testing: closed and open. A closed model means an artificial, constant load in which the requests of the “users” arrive in a steady stream (not in waves, as in real life). Its purpose is to find out how the project behaves at maximum load and to determine the system's maximum throughput. This is the upper bound of the metric: the system cannot go above it.

An open model means testing that is close to real life: users arrive in waves, sometimes more of them than the system can handle. It helps to find the limits of the system and to see how it behaves in critical conditions.

In this test our goal was to find the capacity of the system, so we chose the closed model with the maximum load at which the SLA is still met.
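
The difference between the two models can be sketched in a few lines. This assumes the common reading of the terms: in a closed model a fixed number of threads each send the next request only after the previous one completes (with no think time), and in an open model requests arrive on their own schedule, modeled here as a Poisson process. Both function names and the numbers in the example are illustrative, not measurements from the test.

```python
import random


def closed_model_throughput(threads, mean_response_s):
    """Steady-state requests per second for a closed loop with no think time.

    Each thread completes 1 / mean_response_s requests per second,
    so the total is threads / mean_response_s (Little's law).
    """
    return threads / mean_response_s


def open_model_arrivals(rate_per_s, duration_s, rng):
    """Request start times for an open model with Poisson arrivals.

    Start times do not depend on how fast the server answers,
    so the offered load can exceed what the system can serve.
    """
    now = 0.0
    arrivals = []
    while True:
        now += rng.expovariate(rate_per_s)  # exponential gap between arrivals
        if now >= duration_s:
            return arrivals
        arrivals.append(now)


# Example with made-up numbers: 34 threads, 0.5 s mean response time.
closed_rps = closed_model_throughput(34, 0.5)
open_starts = open_model_arrivals(100, 60, random.Random(7))
```

In the closed sketch the load adapts to the server: if responses slow down, throughput drops. In the open sketch it does not, which is why an open model is the one that shows how a system behaves beyond its limit.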

Which SLA to choose for testing?
We needed to define clearly the threshold within which we consider the system to be serving requests reliably. To do this we used statistics from the TOP-100 largest online stores (according to Kommersant, 2014) and chose two main metrics: 99% of requests must be answered in less than 1 second, and responses with a code other than HTTP 200 must make up no more than 0.5%. Each test must run continuously for 24 hours.
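
A minimal sketch of how these two SLA conditions could be checked against a list of samples. The sample format (response time in seconds, HTTP status) and the function names are assumptions made for the example; Yandex.Tank produces its own reports, and this is not its API.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; `pct` is an integer 1..100."""
    rank = (pct * len(sorted_values) + 99) // 100  # integer ceil(pct * n / 100)
    return sorted_values[max(rank, 1) - 1]


def meets_sla(samples, p99_limit_s=1.0, max_error_share=0.005):
    """Check the two SLA conditions from the text.

    `samples` is a list of (response_time_s, http_status) tuples.
    The run passes if the 99th percentile is below `p99_limit_s` and
    the share of non-200 responses does not exceed `max_error_share`.
    """
    if not samples:
        raise ValueError("no samples to evaluate")
    times = sorted(t for t, _ in samples)
    errors = sum(1 for _, status in samples if status != 200)
    p99 = percentile(times, 99)
    return p99 < p99_limit_s and errors / len(samples) <= max_error_share


# Example: 1000 samples, five slow but successful responses.
healthy_run = [(0.2, 200)] * 995 + [(2.0, 200)] * 5
```

Here `meets_sla(healthy_run)` passes, because five slow responses out of 1000 stay outside the 99th percentile; twenty slow responses, or ten HTTP 500 answers, would fail the check.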

Which hosting to choose?
We chose Selectel, with its Tsvetochnaya-1 data center in St. Petersburg, as the test platform. Selectel provided servers in its most popular configuration: Intel Xeon E3-1270v3 3.5 GHz, 32 GB of RAM, 2 × 240 GB SSDs in RAID 1. One server was used as the load generator; the rest ran “1C-Bitrix” in the different configurations under test.

System configuration
For the tests, the servers ran CentOS 6.6 Linux with the “1C-Bitrix: Web Environment” package installed.

PHP was updated to version 5.6.9, and the cache directories were mounted on tmpfs. Three configurations were prepared for testing:

1) One server. “1C-Bitrix: Site Management”; the web application and MySQL run on the same server.

2) Two servers. “1C-Bitrix: Enterprise”; the nginx balancer on the first server, the web application on both servers, the MySQL master (reads and writes) on the first server, and a MySQL slave (reads only) on the second server.

3) Four servers. The second configuration, scaled horizontally by adding slave machines.

What to test with?
Yandex.Tank was chosen as the testing tool. It can use two load generators, Phantom and JMeter. Phantom outperforms JMeter, but it cannot generate POST logic or save and reuse cookies. For this reason we generated the load with JMeter.

Who does the testing?
We wanted the testing to be done by an independent company whose opinion we, third-party companies, and industry experts could all trust. We entrusted the task to ITSumma, which has provided round-the-clock administration and technical support for well-known RuNet projects since 2008 and has a solid reputation in the market. Its employees regularly speak at industry conferences (Highload, RIT++, and others).

Testing

In retrospect, testing can be divided into three stages.

The first stage is calibration, in which we found the maximum number of threads at which the SLA is still met:

single-server configuration: 34 threads
two-server configuration: 74 threads
four-server configuration: 136 threads
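
The article does not say how these thread counts were found. One possible approach, sketched below, is a binary search over the thread count, assuming that once the SLA is violated it stays violated for every larger count. The `passes_sla` callable is hypothetical: in practice it would run a short calibration test with the given number of threads and report whether the SLA held.

```python
def max_threads_within_sla(passes_sla, low=1, high=256):
    """Largest thread count in [low, high] for which passes_sla(count) is True.

    Assumes monotonic behaviour: if the SLA fails at some thread count,
    it also fails at every larger one. Returns None if even `low` fails.
    """
    if not passes_sla(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if passes_sla(mid):
            low = mid       # SLA holds, try more threads
        else:
            high = mid - 1  # SLA broken, try fewer threads
    return low


# Example with a stand-in for a real calibration run:
# pretend the SLA holds up to 34 threads.
found = max_threads_within_sla(lambda threads: threads <= 34)
```

Each call to `passes_sla` stands for a real test run, so a binary search keeps the number of calibration runs small (about eight for a range of 1 to 256).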

The second stage is the first attempts to run the full test. When planning load testing, keep in mind that it takes time to tune the system under test and to work out the test procedure itself.

At this stage we ran into a number of difficulties. Overcoming them taught us several lessons:

When organizing load testing, remember that the first long test run will produce far from ideal results. You may have forgotten something in the configuration of the OS or the server software. There may be problems with the testing system itself (JMeter is a Java application, with the garbage collection problems that implies). And most importantly, you will see bottlenecks in your system that you had not noticed before and that can be fixed. In our testing, for example, nine significant bugs were found; the fixes will be released in upcoming versions of the product. So treat your first test as an optimization guide.
An obvious point that is still worth stating: during optimization, record every configuration change made on the servers. Ideally, do not apply more than a few changes at once. Otherwise it will be hard to tell which action actually improved performance.
A 24-hour test with an error that is not caught right away is a lost working day. After one iteration, in which we had made a number of configuration changes, we got good results and were happy about it. Then we discovered that the third scenario, order placement, the heaviest one in our case, had been switched off. We had to start the test again. A day later we learned that the changes did not speed up the system at all.
A system under load test is a production system, and all the typical problems of production sites can happen to it. In one of our tests, a server ran out of disk space 12 hours after launch. So it is vital to hook the standard site alerts into your usual monitoring system so that they reach your phone. And when an alert arrives, fix the problem immediately so as not to lose the precious data already collected.
Results that look too good usually mean a testing error. In one iteration we got a twofold increase over the previous test. It turned out that MySQL had hit its max connections limit, and in that state the site was answering the testing system with a valid HTTP 200 code.
In any testing, “bureaucracy” is very important: document each test in as much detail as possible, including scenarios, system configuration, test logs, and so on. All of it will be useful for later analysis.
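
The MySQL incident above suggests one simple safeguard: do not trust the status code alone, and also check that the body contains something only a correctly rendered page would have. The sketch below shows the idea; the function name and the marker string are hypothetical, and in a JMeter test plan the same check would normally be done with a response assertion rather than in Python.

```python
def is_real_success(status_code, body, required_marker):
    """Count a response as successful only if it is HTTP 200 AND its body
    contains a marker that appears only on a correctly rendered page."""
    return status_code == 200 and required_marker in body


# Hypothetical marker: an element that exists only when the catalog rendered.
MARKER = 'id="catalog-list"'

ok_page = '<html><div id="catalog-list">...</div></html>'
error_page = "<html>Database connection error</html>"
```

With this check, `error_page` is counted as a failure even when it comes back with status 200, so a database outage shows up as an error rate instead of as a suspiciously fast run.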

The third stage is the test itself. Once all the procedures are worked out and the system is optimized, it is best to run the test several times. In our experience this gives a truly accurate result: even a 24-hour run comes out almost identical to a repeat of the same test. In our repeated runs the results matched to within individual requests per second.

Test results

Configuration 1 (1 server, “1C-Bitrix: Site Management”, “Business” edition)

Test run time: 86,892 seconds

PHP requests per second: 167, with 34 concurrent threads
99th percentile: 0.366 s
Pages viewed: 14,421,563
Percentage of failed requests: 0.31%

Yandex.Tank percentile

RPS plateau (upper line: 350 requests per second including “composite” requests; 167 dynamic requests per second).

This is a good result for a complex system. It is important to note that most of the load comes from the complex procedures of adding goods to the basket and checking out.

Configuration 2 (2 servers, "1C-Bitrix: Enterprise")

Test run time: 86,850 seconds
PHP requests per second: 265, with 74 concurrent threads
99th percentile: 0.95 s
Pages viewed: 23,082,301
Percentage of failed requests: 0.47%
Scale factor versus configuration 1: 1.60

Yandex.Tank percentile

RPS plateau (upper line: 550 requests per second including “composite” requests).

The number of requests per second did not grow as much as we would have liked: only by a factor of 1.6. Keep in mind that a multi-server configuration adds server-to-server communication and the extra overhead of data exchange.

Configuration 3 (4 servers, “1C-Bitrix: Enterprise”)

Test run time: 86,402 seconds
PHP requests per second: 535, with 136 concurrent threads
99th percentile: 0.9 s
Pages viewed: 46,256,141
Percentage of failed requests: 0.47%
Scale factor versus configuration 2: 2.00

Yandex.Tank percentile

RPS plateau (upper line: 1100 requests per second including “composite” requests).

And here is an excellent scaling result. By adding two more servers to the two-server configuration, we doubled the performance. This result suggests that you will not run into the vertical limits of the hardware: additional servers let you distribute the load adequately between machines.
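
The reported scale factors can be reproduced from the viewed-page totals listed for each configuration. The article does not state which metric the factors were calculated from, so treating them as ratios of viewed pages is an assumption; the three runs also differ slightly in length, which this sketch ignores.

```python
# Viewed-page totals from the three test runs reported above.
PAGES_VIEWED = {
    1: 14_421_563,  # configuration 1: one server
    2: 23_082_301,  # configuration 2: two servers
    3: 46_256_141,  # configuration 3: four servers
}


def scale_factor(new_config, old_config):
    """Ratio of pages served by `new_config` to pages served by `old_config`."""
    return PAGES_VIEWED[new_config] / PAGES_VIEWED[old_config]


two_vs_one = round(scale_factor(2, 1), 2)   # one server -> two servers
four_vs_two = round(scale_factor(3, 2), 2)  # two servers -> four servers
```

Rounded to two decimals, these ratios match the 1.60 and 2.00 given in the results. Dividing each factor by the growth in server count (2 in both steps) gives a rough per-step efficiency: about 0.8 for the first step and about 1.0 for the second.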

Results and future plans

With this testing we tried to set a benchmark that developers can use as a reference when evaluating the performance of their own systems. For us, a very important result was confirmation that “1C-Bitrix” scales horizontally with high efficiency: using four servers instead of two doubled the throughput. And, of course, a pleasing metric: 167 dynamic requests per second in a complex system on a single server.

Next we plan to run a test under the open model to answer the following questions:

What happens to the system once it reaches its limits on a single server, and what can be optimized in it?
How quickly does the system recover from an extreme load?
How can we build a toolkit that lets developers estimate the real number of users their project can serve on current hardware?

We will report on the results of that test and on the experience gained.

Useful links:

Detailed report on load testing 1C-Bitrix
Alexey Lavrenyuk's talk on Yandex.Tank at FailOver Conference 2015
JMeter tuning

Source: https://habr.com/ru/post/262029/
