
Hello!
I am a backend developer on the Badoo server team. At last year's HighLoad conference I gave a presentation, and I want to share its text version with you. This post will be most useful to those who write backend tests themselves and have trouble testing legacy code, as well as those who want to test complex business logic.
What are we talking about? First, I will briefly describe our development process and how it affects our need for tests and our desire to write them. Then we will climb the test automation pyramid, discuss the kinds of tests we use, the tools behind each of them and the problems we solve with their help. At the end we will look at how we maintain and run all of this.
Our development process
Here is an illustration of our development process:
The golfer here is the backend developer. At some point a development task arrives, usually in the form of two documents: requirements from the business side and a technical document describing the changes to the protocol used by the backend to talk to the clients (the mobile applications and the website).
The developer writes the code and releases it to production before any of the client applications. All functionality is protected by feature flags or A/B tests; this is specified in the technical document. After that, in accordance with current priorities and the product roadmap, the client applications are released. For us backend developers it is completely unpredictable when a given feature will be implemented on the clients. The release cycle of client applications is somewhat more complicated and longer than ours, so our product managers literally juggle priorities.
The development culture adopted by the company matters a lot here: the backend developer is responsible for a feature from the moment it is implemented on the backend until the last integration on the last platform where the feature was originally planned.
The following situation is entirely possible: six months ago you rolled out a feature; the client teams have not implemented it for a long time because the company's priorities changed; you are already busy with other tasks, with new deadlines and priorities, and then colleagues come to you and say: "Remember that thing you wrote six months ago? It doesn't work." And instead of working on new tasks, you are putting out fires.

That is why our developers have a motivation, unusual for PHP programmers, to make sure that as few problems as possible surface at the integration stage.
What do you want to do first of all to make sure that the feature works?
Of course, the first thing that comes to mind is manual testing. You pick up the application, but it knows nothing about the feature: the feature is new, and the clients will only get to it in six months. And manual testing gives no guarantee that nothing will break on the clients during the time between the backend release and the start of the integration.
And here automatic tests come to our aid.
Unit tests
The simplest tests we write are unit tests. We use PHP as the main backend language and PHPUnit as the unit testing framework. Looking ahead, I will say that all our backend tests are based on this framework.
With unit tests we most often cover small, isolated pieces of code and check that methods or functions work correctly; that is, we are talking about tiny units of business logic. Our unit tests should not interact with anything else, access databases or call services.
SoftMocks
The main difficulty faced by developers when writing unit tests is untestable code, and usually this is legacy code.
A simple example. Badoo is 12 years old; once it was a very small startup developed by a handful of people. The startup existed quite successfully without any tests at all. Then we became big enough to realize that we could not live without them. But by that time a lot of working code had already been written, and rewriting it just for the sake of test coverage would not be very reasonable from a business point of view.
Therefore, we developed a small open source library, SoftMocks, which makes writing tests cheaper and faster. It intercepts every include/require of PHP files and replaces the original file on the fly with modified content, that is, with rewritten code. This allows us to create stubs for any code.
Here is a detailed description of how the library functions.
For the developer it looks something like this:
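A minimal sketch of how this usually looks (the class, method and constant here are invented purely for illustration, and the exact SoftMocks call signatures may differ between versions of the library):

// Globally redefine a (possibly static or private) method for the duration of the test
\Badoo\SoftMocks::redefineMethod(
    \WorkProvider::class,      // hypothetical class
    'getUserWork',             // hypothetical method
    '$user_id',                // parameter list of the redefined method
    'return "programmer";'     // new body
);

// Constants can be overridden as well
\Badoo\SoftMocks::redefineConstant('SOME_FEATURE_ENABLED', true);

// Roll all redefinitions back at the end of the test
\Badoo\SoftMocks::restoreAll();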
With such simple constructions we can globally redefine whatever we want. Among other things, they let us bypass the limitations of the standard PHPUnit mocker: we can mock static and private methods, override constants and do a lot of other things that are impossible in plain PHPUnit.
However, we ran into a problem: it began to seem to developers that with SoftMocks there is no need to write testable code, because you can always plaster the code with our global mocks and everything will work. But this approach leads to more complex code and an accumulation of hacks. So we adopted several rules that keep the situation under control:
- All new code must be easily testable with standard PHPUnit mocks. If this condition is met, the code is testable, and you can easily take a small piece of it and check just that piece.
- SoftMocks may be used with old code that was not written in a unit-testing-friendly way, as well as in cases where doing otherwise is too expensive, too long or too difficult (pick your reason).
Compliance with these rules is carefully monitored during the code review phase.
Mutation Testing
Separately, I want to say a few words about the quality of unit tests. I think many of you use metrics such as code coverage. Unfortunately, it does not answer one question: "Have I written a good unit test?" It is quite possible to write a test that actually checks nothing and does not contain a single assert, yet produces excellent code coverage. Of course, the example is exaggerated, but the situation is not that far from reality.
Recently we began to introduce mutation testing. This is a rather old but not very widely known concept. The algorithm is quite simple:
- we take the code and its code coverage;
- we parse the code and start mutating it: true to false, > to >=, + to -, generally doing harm in every possible way;
- for each such change (mutation) we run the test suites that cover the changed line;
- if the tests fail, they are good: they really do not let us break the code;
- if the tests pass, they are most likely not effective enough despite the coverage, and it may be worth looking at them more closely and adding some assertions (or there is an area not covered by tests at all).
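To make the idea more concrete, here is a tiny invented illustration (not the real output of our utility): a mutant of a small function and the assertion that kills it.

// Original business logic: "registered more than two years ago"
function isLongRegistered(int $daysSinceRegistration): bool
{
    return $daysSinceRegistration > 730;
}

// A possible mutant: the tool replaces ">" with ">="
function isLongRegistered_mutant(int $daysSinceRegistration): bool
{
    return $daysSinceRegistration >= 730;
}

// A test that only checks an obvious value does not notice the mutation:
//     $this->assertTrue(isLongRegistered(1000));
// A test that also checks the boundary kills the mutant:
//     $this->assertFalse(isLongRegistered(730));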
For PHP there are several ready-made frameworks, such as Humbug and Infection. Unfortunately, they did not suit us because they are incompatible with SoftMocks. So we wrote a small console utility of our own that does the same thing but uses our internal code coverage format and gets along with SoftMocks. For now the developer runs it manually to analyze the tests they have written, but we are working on integrating the tool into our development process.
Integration testing
With the help of integration tests, we check the interaction with various services and databases.
To make the rest of the story easier to follow, let's invent a fictional promo and cover it with tests. Imagine that our product managers decided to hand out conference tickets to our most loyal users:
A promo should be shown if:
- the user specified "programmer" in the "Work" field,
- the user participates in the HL18_promo A/B test,
- the user registered more than two years ago.
When the user clicks the "Get a ticket" button, we must save this user's data to a list that we then hand over to the managers who distribute the tickets.
Even in this rather simple example there is one thing that cannot be verified with unit tests: the interaction with the database. To check it, we need integration tests.
Consider the standard way to test database interaction offered by PHPUnit:
- raise a test database;
- prepare DataTables and DataSets;
- run the test;
- clean up the test database.
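With the phpunit/dbunit extension this looks roughly as follows (a sketch only; the connection parameters, table and fixture names are made up):

use PHPUnit\Framework\TestCase;
use PHPUnit\DbUnit\TestCaseTrait;

class PromoUsersDbTest extends TestCase
{
    use TestCaseTrait;

    // Connection to the dedicated test database
    protected function getConnection()
    {
        $pdo = new \PDO('mysql:host=test-db;dbname=promo_test', 'test', 'test');
        return $this->createDefaultDBConnection($pdo, 'promo_test');
    }

    // DataSet loaded into the tables before every test
    protected function getDataSet()
    {
        return $this->createFlatXmlDataSet(__DIR__ . '/fixtures/promo_users.xml');
    }

    public function testTicketRequestIsSaved(): void
    {
        // ... call the code that saves the user to the list ...
        $this->assertSame(1, $this->getConnection()->getRowCount('promo_users'));
    }
}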
What difficulties await us with this approach?
- You need to maintain the DataTables and DataSets structures. If the table schema changes, the change has to be reflected in the test, which is not always convenient and takes extra time.
- Preparing the database takes time. For every test we have to fill something in and create some tables, which is long and tedious when there are many tests.
- And the most important drawback: running such tests in parallel makes them unstable. Test A starts and begins writing to a test table it created for itself. At the same time we start test B, which wants to work with the same test table. The result is deadlocks and other unforeseen situations.
To avoid these problems, we developed our own small DBMocks library.
DBMocks
The principle of operation is as follows:
- Using SoftMocks, we intercept all the wrappers through which we work with databases.
- When a query passes through the mock, we parse the SQL, extract the database and table name from it, and get the host from the connection.
- On the same host, in tmpfs, we create a temporary table with the same structure as the original (we copy the structure with SHOW CREATE TABLE).
- After that, all queries to this table that come through the mocks are redirected to the newly created temporary table.
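A very rough sketch of the idea (this is not the real DBMocks code; the DbWrapper class and its methods are invented):

\Badoo\SoftMocks::redefineMethod(
    \DbWrapper::class, 'query', '$sql',
    'return \DBMocksSketch::redirect($this, $sql);'
);

class DBMocksSketch
{
    private static $created = [];

    public static function redirect($conn, string $sql)
    {
        // Pull "db.table" out of the SQL (the real parser is much smarter)
        if (preg_match('/(?:FROM|INTO|UPDATE)\s+`?(\w+)`?\.`?(\w+)`?/i', $sql, $m)) {
            [, $db, $table] = $m;
            $tmp = "{$table}_" . getenv('TEST_RUN_KEY'); // unique per-test suffix

            if (!isset(self::$created["$db.$table"])) {
                // Copy the structure of the original table into a temporary one in tmpfs
                $create = $conn->fetchOne("SHOW CREATE TABLE `$db`.`$table`", 'Create Table');
                $conn->rawQuery(str_replace(
                    "CREATE TABLE `$table`",
                    "CREATE TEMPORARY TABLE `$tmp`",
                    $create
                ));
                self::$created["$db.$table"] = true;
            }

            // Redirect the original query to the temporary copy
            $sql = str_replace("`$db`.`$table`", "`$tmp`", $sql);
        }
        return $conn->rawQuery($sql);
    }
}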
What this gives us:
- we no longer need to constantly maintain the structures;
- tests can no longer damage data in the source tables, because we redirect them to temporary tables on the fly;
- we still test compatibility with the MySQL version we run, and if a query suddenly stops being compatible with a new version, our test will notice it and fail;
- and most importantly, the tests are now isolated: even if we run them in parallel, the threads end up in different temporary tables, because we add a unique per-test key to the table names.
API testing
The difference between unit and API tests is well illustrated by this gif:
The lock works fine, but it is attached to the wrong door. Our API tests emulate a client session, can send requests to the backend following our protocol, and the backend answers them just as it would answer a real client.
Test user pool
What do we need to successfully write such tests? Let's return to the conditions of our promo:
- the user specified "programmer" in the "Work" field,
- the user participates in the HL18_promo A/B test,
- the user registered more than two years ago.
As you can see, everything revolves around the user. And in reality 99% of API tests need an authorized, registered user that exists in all services and databases.
Where do we get one? We could try to register a user while the test is running, but:
- it is long and resource-intensive;
- after the test finishes, this user has to be removed somehow, which is a rather non-trivial task in a large project;
- finally, as in many other high-load projects, we perform many operations in the background (adding the user to various services, replicating to other data centers, and so on); tests know nothing about such processes, but if they implicitly rely on their results, they risk becoming unstable.
We have developed a tool called the Test Users Pool. It is based on two ideas:
- We do not register users every time; we reuse them.
- After the test we reset the user's data to its original state (as at the moment of registration). If we did not do this, the tests would eventually become unstable, because users would be "contaminated" with data from other tests.
It works like this:
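Schematically, the life cycle of a test user looks roughly like this (a sketch; the class and method names are made up):

class TestUsersPoolSketch
{
    /** @var int[] user_ids that are free right now */
    private $free = [];

    // Take a ready-made registered user from the pool and lock it for this test
    public function acquire(): int
    {
        $userId = array_pop($this->free);
        if ($userId === null) {
            $userId = $this->registerTestUser(); // registered once, with is_test_user = yes forever
        }
        return $userId;
    }

    // Return the user after the test, resetting it to its post-registration state
    public function release(int $userId): void
    {
        $this->resetToRegistrationState($userId); // wipe everything the test has added
        $this->free[] = $userId;
    }

    private function registerTestUser(): int
    {
        return random_int(1, PHP_INT_MAX); // simplified: register through the normal flow
    }

    private function resetToRegistrationState(int $userId): void
    {
        // clean up profile fields, messages, photos, etc. added by the test
    }
}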

At some point we wanted to run our API tests in the production environment. Why would we want that? Because the devel infrastructure is not the same as production.
Although we try to reproduce the production infrastructure on a smaller scale, devel will never be its exact copy. To be absolutely sure that a new build meets expectations and causes no problems, we deploy the new code to a preproduction cluster that works with production data and services, and run our API tests there.
In this case, it is very important to think about how to isolate test users from real ones.
What would happen if test users started to show up for real users in our application? How do we isolate them? Each of our users has an is_test_user flag. It is set to yes or no at registration and never changes afterwards. By this flag we isolate test users in all services. It is also important that we exclude test users from business analytics and from A/B test results, so as not to distort the statistics.
You can also take a simpler route: in the beginning we simply "resettled" all test users to Antarctica. If you have a geoservice, this is quite a workable method.
QA API
But we do not just need a user, we need one with specific parameters: they must work as a programmer, participate in a particular A/B test and have registered more than two years ago. Setting a profession for a test user through our regular backend API is easy, but getting into a specific A/B test is probabilistic, and the "registered more than two years ago" condition is generally hard to satisfy, because we do not know when a given user appeared in the pool.
To solve these problems, we have the QA API. It is, in essence, a backdoor for testing: a set of well-documented API methods that allow you to quickly and easily manage user data and change user state, bypassing the main protocol of our communication with clients. The methods are written by backend developers for QA engineers and for use in UI and API tests.
The QA API can only be used with test users: if the corresponding flag is missing, the test fails immediately. Here is one of our QA API methods; it allows you to change the user's registration date to an arbitrary one:

And here are three calls that quickly change the test user's data so that it satisfies the conditions for showing the promo:
- In the field "Work" is indicated "programmer":
addUserWorkEducation?user_id=ID&works[]=Badoo,
- The user participates in the A / B test HL18_promo:
forceSplitTest?user_id=ID&test=HL18_promo
- Registered more than two years ago:
userCreatedChange?user_id=ID&created=2016-09-01
Since this is a backdoor, it is extremely important to think about security. We protect the service in several ways:
- it is isolated at the network level: it can only be reached from the office network;
- a secret is passed with every request; without it the QA API cannot be accessed even from the office network;
- the methods work only with test users.
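A sketch of what such a guard might look like on the backend side (the function names, the constant and the exception messages are invented for illustration):

// Every QA API handler starts with a check like this one
function assertQaApiCallAllowed(string $remoteIp, string $passedSecret, bool $isTestUser): void
{
    // 1. The service is reachable only from the office network (also enforced at the network level)
    if (!isOfficeNetworkIp($remoteIp)) {
        throw new RuntimeException('QA API is available only from the office network');
    }

    // 2. A shared secret must be passed with every request
    if (!hash_equals(QA_API_SECRET, $passedSecret)) {
        throw new RuntimeException('Invalid QA API secret');
    }

    // 3. QA API methods work only with test users
    if (!$isTestUser) {
        throw new RuntimeException('QA API works only with test users');
    }
}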
RemoteMocks
When working with a remote backend in API tests, we may need mocks. What for? For example, if an API test running in the production environment starts writing to the database, we need to make sure that no test data is left in it. Besides, mocks help make the backend response more convenient for testing.
For example, here are three texts:
Badoo is a multilingual application, and we have a complex localization component that lets us translate quickly and serve translations appropriate to the user's current location. Our localizers constantly work on improving translations and run A/B tests on text tokens, looking for better wording. So while running a test we cannot know which text the server will return: it can change at any moment. But with RemoteMocks we can check that the localization component is called correctly.
How do RemoteMocks work? A test asks the backend to initialize them for its session, and then, on every subsequent request, the backend checks whether there are any mocks for the current session. If there are, it simply applies them using SoftMocks.
If we want to create a remote mock, then we indicate which class or method should be replaced and by what. All subsequent requests to the backend will be executed with this mock in mind:
$this->remoteInterceptMethod( \Promo\HighLoadConference::class, 'saveUserEmailToDb', true );
Well, now let's build our API test:
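Roughly, such a test could look like this (a sketch: the helpers getUserFromPool(), qaApi() and sendRequest() are invented; only the three QA API calls and remoteInterceptMethod() come from the text above):

public function testHighLoadPromo(): void
{
    // 1. Take a ready-made test user from the pool
    $user = $this->getUserFromPool();

    // 2. Bring the user into the required state through the QA API
    $this->qaApi('addUserWorkEducation', ['user_id' => $user->id, 'works' => ['programmer']]);
    $this->qaApi('forceSplitTest',       ['user_id' => $user->id, 'test' => 'HL18_promo']);
    $this->qaApi('userCreatedChange',    ['user_id' => $user->id, 'created' => '2016-09-01']);

    // 3. Mock the write to the database so the test leaves no traces behind
    $this->remoteInterceptMethod(\Promo\HighLoadConference::class, 'saveUserEmailToDb', true);

    // 4. Emulate the client: the promo must appear in the server response
    $response = $this->sendRequest($user, 'GET_PROMO');
    $this->assertSame('HL18_promo', $response['promo']['id']);

    // 5. "Press" the Get a ticket button and check that the backend reports success
    $response = $this->sendRequest($user, 'PROMO_ACTION', ['promo_id' => 'HL18_promo']);
    $this->assertTrue($response['success']);
}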
In this simple way we can test any functionality that arrives for backend development and requires changes to the mobile protocol.
API test usage rules
Everything seemed fine, but we ran into another problem: API tests turned out to be too convenient for development, and there was a temptation to use them everywhere. As a result, at some point we realized that we were starting to use API tests for tasks they were not designed for.
Why is that bad? Because API tests are very slow. They go over the network and hit the backend, which raises a session and queries the database and a bunch of services. So we developed a set of rules for using API tests:
- the purpose of API tests is to check the protocol of interaction between client and server, as well as the correctness of the integration of new code;
- it is acceptable to cover complex processes with them, for example chains of actions;
- they should not be used to check small variations of the server response: that is the job of unit tests;
- during code review we check the tests as well.
UI tests
Since we are considering the automation pyramid, I’ll tell you a little about UI tests.
Backend developers at Badoo do not write UI tests: for that we have a dedicated team in the QA department. We cover a feature with UI tests when it has already been polished and stabilized, because we believe it is unwise to spend resources on rather expensive UI automation for a feature that may not make it past the A/B test.
For mobile autotests, we use Calabash, and for the web, Selenium.
Our platform for automation and testing is described in a separate talk.
Test run
We now have 100,000 unit tests, 6,000 integration tests and 14,000 API tests. If we tried to run them in a single thread, a full run, even on our most powerful machine, would take 40 minutes for the unit tests, 90 minutes for the integration tests and ten hours for the API tests. That is too long.
Parallelization
We described our experience of parallelizing unit tests in this article. The first solution that comes to mind is to run tests in several threads. But we went further and built a cloud for parallel runs so that we can scale hardware resources. Simplified, it works like this:

The most interesting task here is the distribution of tests between threads, that is, their breakdown into chunks.
You could split them equally, but the tests are all different, so the execution time of a thread can be heavily skewed: all the threads have long finished, while one hangs for another half an hour because it was "lucky" enough to get the very slow tests.
You could also start several threads and "feed" tests to them one at a time. In this case the drawback is less obvious: initializing the environment carries overhead, and with a large number of tests and this approach that overhead starts to matter.
What did we do? We started collecting statistics on the run time of each test and began assembling chunks so that, according to the statistics, one chunk runs for no longer than 30 seconds. At the same time we pack the tests into chunks quite tightly to keep their number down.
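The packing itself is a simple greedy algorithm over the collected statistics (a sketch; the data structures are made up):

/**
 * @param array<string, float> $testDurations test name => average run time in seconds
 * @return string[][] chunks of test names, each about $chunkLimit seconds long
 */
function packTestsIntoChunks(array $testDurations, float $chunkLimit = 30.0): array
{
    arsort($testDurations); // slowest tests first
    $chunks = [];
    $chunkTimes = [];

    foreach ($testDurations as $test => $duration) {
        // Put the test into the first chunk that still has room for it
        foreach ($chunkTimes as $i => $time) {
            if ($time + $duration <= $chunkLimit) {
                $chunks[$i][] = $test;
                $chunkTimes[$i] += $duration;
                continue 2;
            }
        }
        // No suitable chunk: start a new one
        $chunks[] = [$test];
        $chunkTimes[] = $duration;
    }
    return $chunks;
}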
However, our approach also has a flaw. It is associated with API tests: they are very slow and take up a lot of resources, preventing fast tests from running.
Therefore, we have divided the cloud into two parts: in the first, only fast tests are run, and in the second, both fast and slow tests can be launched. With this approach, we always have a piece of cloud that is able to handle quick tests.
As a result, unit tests now run in a minute, integration tests in five minutes and API tests in 15 minutes. That is, instead of 12 hours a full run takes no more than 22 minutes.
Test run based on code coverage
We have a large and complex monolith and, ideally, we should be running all the tests all the time, because a change in one place can break something somewhere else. This is one of the main drawbacks of a monolithic architecture.
At some point we came to the conclusion that you do not need to run all the tests every time - you can do runs based on code coverage:
- take the diff of the branch;
- build the list of changed files;
- for each file, get the list of tests that cover it;
- from these tests, build a suite and run it in the test cloud.
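The selection itself is almost trivial (a sketch; the coverage map format is made up):

/**
 * @param string[] $changedFiles files from the branch diff
 * @param array<string, string[]> $coverage file => list of tests that cover it
 * @return string[] unique set of tests to send to the test cloud
 */
function selectTestsForDiff(array $changedFiles, array $coverage): array
{
    $tests = [];
    foreach ($changedFiles as $file) {
        foreach ($coverage[$file] ?? [] as $test) {
            $tests[$test] = true;
        }
    }
    return array_keys($tests);
}

// Usage (illustrative):
// $changed = array_filter(explode("\n", shell_exec('git diff --name-only master...HEAD')));
// $toRun = selectTestsForDiff($changed, $coverageMap);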
Where does the coverage come from? We collect the data once a day, at a time when the development infrastructure is idle. The number of tests to run has dropped noticeably, while the speed of getting feedback from them has grown significantly. Profit!
An added bonus was the ability to run tests for patches. Although Badoo has not been a startup for a long time, we can still quickly push changes to production: quickly ship a hotfix, roll out a feature, change the configuration. As a rule, the speed of rolling out patches is very important to us. The new approach greatly increased the speed of feedback from the tests, because now we do not have to wait for a full run.
But there are drawbacks everywhere, and this approach is no exception: they are related to the code coverage data (it is collected only once a day, so it can lag behind the code) and to collecting coverage for the API tests.
Conclusion
- Make new code testable with standard tools, and leave SoftMocks for legacy code where rewriting just for the sake of tests is too expensive.
- Code coverage ≠ good tests: check the tests themselves during code review and with mutation testing.
- Choose the right level of the pyramid for every check: unit tests for small pieces of logic, integration tests for databases and services, API tests for the protocol and the integration of new code.
- Isolate test users from real ones and reuse them: a pool of test users plus the QA API makes API tests faster and more stable.
- Run tests wisely: parallelize them, and use code coverage to run only the tests a change can actually affect.
By the way, Badoo is hosting a PHP Meetup on the 16th, devoted to PHP development. It starts at 12:00, and a live stream will be available on our YouTube channel.