Preparing a test environment, or how many test instances you need
How many test stands does your project have: 5, 10, or more than 10? Offhand, every development team needs stands, QA needs stands for every project, project managers need stands too, and so does CI. It is hard to divide everything up cleanly without causing conflicts. So why not create a test stand exactly when it is needed? We need a test stand now, so we create it; we no longer need it, so we remove it.
This approach was proposed by Alexander Dubrovin (adbrvn) in his talk at Highload++ 2017; the transcript is below the cut.
About the speaker: Alexander Dubrovin works at Superjob. The company's projects are known to be high-load. But today we will not talk about how many users visit the portal or how much data is stored on the servers; we will look at other metrics.
Looking ahead: Superjob does not actually know how many test stands it has. But first things first. Let's start with a little history.
A bit of history
Imagine a small project S. It has a team of developers who need somewhere to test their code. To organize testing, we set up a test machine, make it resemble production, deploy the code there and launch it, and the developers can test on it.
At some point the team begins to grow and needs a staff of testers. QA specialists appear, and they also need somewhere to test.
You can take a simple approach: set aside a section of the server for the testers, deploy another copy there, and now they can test. Everything is great!
The project continues to grow, and a second development team appears. They also need somewhere to test. The approach is already familiar: we carve out another part of the test server.
But the teams themselves keep growing too, and one test stand is no longer enough for each of them. They also complete more tasks, so the testers need a lot of test stands as well.
At about this stage, interesting stories start to happen. Suppose a tester, Vasya, wants to test some task. He picks a test stand, deploys the code there and starts testing. He clicks around and realizes that something is wrong, something does not work, and the task does not seem to be done at all.
Tickets start pouring into JIRA, and developers gather around Vasya saying: "How can that be? It is done!" Finally someone asks: "Which branch do you have deployed on the test stand?" Vasya looks: the wrong one. The branch is quickly corrected, the JIRA tickets are closed, everything is fine. Vasya continues testing, and everything works for him.
Meanwhile, on the other side of the room, developer Vova thinks: "Strange, why isn't it working for me?" He quickly realizes that the wrong branch is deployed and rolls out the one he needs, and Vasya has problems again.
After a couple of iterations they understand that they are simply testing on the same test stand and interfering with each other. As a result, time is wasted, and both Vasya and Vova are unhappy.
Another story. A developer, Kolya, knows about Vasya's problems, comes to him in advance and asks which test stand is currently free. Vasya points to a free one, and all is well. A couple of days later they meet again, and Vasya asks Kolya: "Will you give the test stand back? You took it for an hour, and two days have passed."
And again there is a problem: either the developer looks for another stand, or everyone cheerfully waits until he finishes testing.
In fact, the diagram above does not show everything: the managers are missing. Sometimes managers want to look at raw code that has not been tested yet. The standard approach: we again set aside a corner of the test server and make more test stands.
The very last part of this system is, of course, CI: it also needs somewhere to run its tests.
As such a scheme gradually evolves, we end up with test stands that change in an uncontrolled way. The scheme is bad because we do not really control these stands. We do not know:
who is testing on a given stand at the moment;
what is deployed there;
whether it is busy or free at all.
Moreover, we do not know which test stands are free right now, and at times there may even be a queue.
Idea
At this point we asked ourselves: what should we do? Why do we need so many test stands? Why not create a test stand exactly when it is needed? We need a test stand now, so we create it; we no longer need it, so we remove it.
The next step in this idea is to make a test stand for each branch of code.
It seems like a good idea, but there are technical nuances. We need stands that are:
independent;
very similar to production, because we want to test as close to the production environment as possible;
quick to create, because we want a test stand to appear as soon as a branch appears.
Ideally, we want a single "Make a test stand" button.
Harsh reality
But there is also the harsh reality, in which we have:
A big, complex project: a huge PHP monolith with a fairly long history.
A service in four domain zones. We simultaneously support the Russian, Ukrainian, Uzbek and Belarusian zones.
A bunch of subdomains: geo subdomains and service subdomains such as the API, students.superjob.ru, and so on.
At some point testers will want to test all of this. Even if we are not testing the Ukrainian zone right now, tomorrow there may be a task to build a special page for the Ukrainian part, so this must also be taken into account.
No sooner said than done!
Docker / docker-compose
First, we said that test stands should be isolated and as similar to production as possible. These days Docker makes this possible: it lets us run containers. Obviously, one container will not be enough; moreover, we need to run many similar stacks. That is why we need docker-compose.
Great, we will use Docker: it is stylish, fashionable and youthful.
Sawing the monolith, extracting services
Docker promotes the microservice approach, and here we hit a problem, because we have a monolith.
Have you ever tried to estimate how much it costs to split a monolith into microservices? Obviously, the figure is measured in person-years.
At some point we looked at the component diagram of our system and saw: here is the load balancer, here is the PHP application, here is the Node.js application. Why not run exactly these as services? Let's find everything we can run in Docker containers.
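The idea of treating the existing components as services can be sketched roughly as follows. This is only an illustration in Python, not the actual Superjob configuration: the image names and the BRANCH environment variable are hypothetical, and the dictionary merely mirrors the general layout of a compose file with one balancer, one PHP application and one Node.js application per stand.

```python
import json


def build_stack(branch: str) -> dict:
    """Return a compose-style description of one test stand for `branch`.

    Image names and the BRANCH variable are hypothetical placeholders.
    """
    return {
        "services": {
            # The front container: the only one that needs to be reachable.
            "balancer": {
                "image": "example/balancer:latest",
                "depends_on": ["php", "node"],
            },
            # The PHP monolith, started for the requested branch.
            "php": {
                "image": "example/php-app:latest",
                "environment": {"BRANCH": branch},
            },
            # The Node.js application, started for the same branch.
            "node": {
                "image": "example/node-app:latest",
                "environment": {"BRANCH": branch},
            },
        },
    }


def render_stack(branch: str) -> str:
    """Serialize the stack description as indented JSON text."""
    return json.dumps(build_stack(branch), indent=2)
```

Calling render_stack("feature-123") produces a text description of the stack that a real setup would replace with its own compose file.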
Configuring the network
Next, we need some way to reach our test stand. Naturally, we need to expose port 80 so that a browser can open the stand, but since many such stands will run on one machine, each of them needs its own IP address.
The Docker documentation has a whole large section on network configuration.
Docker can use various types of networks. In our case a macvlan network helped a lot. This technology allows a stack of virtual network interfaces to be implemented on top of a single physical network interface. Docker manages these interfaces itself: it creates them, adds them to the machine and obtains IP addresses that are external relative to the host machine.
Thus we can launch a pack of containers, let the front container (the balancer) get an external IP address and open port 80 on it. After that we can already reach it from a browser.
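As an illustration of the macvlan setup, here is a minimal sketch that assembles a `docker network create` command line. The network name, parent interface, subnet and gateway are assumptions made up for the example; the real values depend on the network the host machine is attached to.

```python
import ipaddress


def macvlan_network_command(name: str, parent: str, subnet: str, gateway: str) -> list:
    """Build the argument list for creating a macvlan network with Docker.

    All four values are supplied by the caller; none of them come from
    the original talk.
    """
    network = ipaddress.ip_network(subnet)
    # A gateway outside the subnet is a typical configuration mistake.
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is not inside {network}")
    return [
        "docker", "network", "create",
        "--driver", "macvlan",
        "--subnet", str(network),
        "--gateway", gateway,
        "--opt", f"parent={parent}",  # physical interface to attach to
        name,
    ]
```

The resulting list can be passed to subprocess.run on the Docker host; containers attached to that network then get addresses from the given subnet.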
Raising DNS and its API
Remember that we have several domain zones and a lot of subdomains. So the only part of the address that can identify a test stand is the second-level domain. This has a huge plus and a huge minus:
Minus: we obviously have to shadow some real domains in the ru, ua, uz and by zones.
Plus: the branch name can be embedded directly into the second-level domain, since we make a test stand for each code branch.
Thus we get a clear, readable address that already indicates the code branch.
The minus is actually easy to deal with. If we have to shadow domains, we simply add a prefix and thereby limit the set of shadowed domains, and that can be tolerated.
In our case we chose the prefix sj. It turns out that we only shadow domains that start with sj, and there are obviously few of those.
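To make the naming idea concrete, here is a small sketch that turns a branch name into stand addresses. Only the sj prefix and the four zones come from the talk; the exact format (sj-&lt;branch&gt;.&lt;zone&gt;), the sanitizing rules and the subdomain list are assumptions for illustration.

```python
import re

PREFIX = "sj"                      # prefix mentioned in the talk
ZONES = ("ru", "ua", "uz", "by")   # the four supported zones


def branch_label(branch: str) -> str:
    """Turn a branch name into a DNS label such as 'sj-feature-123'."""
    # Keep only lowercase letters and digits, join the rest with hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    if not slug:
        raise ValueError("branch name has no usable characters")
    # A single DNS label may be at most 63 characters long.
    return f"{PREFIX}-{slug}"[:63].rstrip("-")


def stand_hosts(branch: str, subdomains=("", "api")) -> list:
    """List host names of one stand in every zone (subdomains are examples)."""
    label = branch_label(branch)
    hosts = []
    for zone in ZONES:
        for sub in subdomains:
            hosts.append(f"{sub}.{label}.{zone}" if sub else f"{label}.{zone}")
    return hosts
```

With this scheme the address itself tells a tester which branch is deployed on the stand.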
Another part of the DNS story is the API. As already mentioned, test stands must be raised quickly. Therefore we need a DNS server that lets us quickly add and remove records automatically through an API.
The solution is PowerDNS. This server makes it quite quick and simple to hook up an API and to add and remove test stands with scripts.
Wonderful! We have raised and configured DNS and taught our containers to register their IPs in it, but something is still missing.
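Registering a stand in DNS could look roughly like the sketch below. It prepares a PATCH request in the shape used by the PowerDNS Authoritative HTTP API (an rrsets list with changetype REPLACE or DELETE, authenticated with the X-API-Key header). The server address, the API key, the zone and the host names are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def rrset_change(fqdn: str, ip: str = None, ttl: int = 60) -> dict:
    """Describe one A-record change: set it to `ip`, or delete it if ip is None."""
    name = fqdn.rstrip(".") + "."  # the API expects fully qualified names
    if ip is None:
        return {"name": name, "type": "A", "changetype": "DELETE"}
    return {
        "name": name,
        "type": "A",
        "ttl": ttl,
        "changetype": "REPLACE",
        "records": [{"content": ip, "disabled": False}],
    }


def build_request(api_url: str, api_key: str, zone: str, changes: list):
    """Prepare (but do not send) the PATCH request for a zone."""
    url = f"{api_url.rstrip('/')}/api/v1/servers/localhost/zones/{zone}"
    body = json.dumps({"rrsets": changes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
```

A script that raises a stand would pass the prepared request to urllib.request.urlopen, and the script that removes the stand would send the matching DELETE change.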
Making an SSL CA
We live in the 21st century. Obviously, the whole Internet runs on SSL, and test stands must support SSL too. Quite a lot of bugs are SSL-specific, and mixed content is only the tip of the iceberg.
So we need a way to quickly obtain a certificate and apply it to a test stand as it comes up. Our company already had an OpenSSL-based certificate authority, so we took the simple route and wrote our own homegrown tool on top of it.
The tool was written in one day and lets us obtain certificates generated for a specific domain name via GET requests.
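The talk does not describe the interface of this internal tool, so the following is purely a hypothetical sketch of what asking for a certificate over GET might look like. The base URL, the /certificate path and the domain query parameter are all invented for illustration.

```python
from urllib.parse import urlencode


def certificate_url(ca_base_url: str, names: list) -> str:
    """Build a GET URL asking a (hypothetical) CA service for a certificate.

    `names` lists the host names the certificate should cover.
    """
    if not names:
        raise ValueError("at least one host name is required")
    # doseq=True repeats the parameter once per host name.
    query = urlencode({"domain": names}, doseq=True)
    return f"{ca_base_url.rstrip('/')}/certificate?{query}"
```

A stand start-up script could fetch this URL and hand the response to the balancer container.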
Only a little remains: all of this has to be automated, because we want to do everything with one button.
Automating
At the initial stage we wrote a console script for ourselves that simply raises a test stand or removes it.
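Such a console script might be sketched as follows. It is not the actual Superjob script: the stand- project naming and the --dry-run flag are assumptions, and the sketch only wraps docker-compose with a per-branch project name so that stacks for different branches do not collide.

```python
import argparse
import subprocess


def compose_command(action: str, branch: str) -> list:
    """Build the docker-compose call that raises or removes a stand."""
    # One compose project per branch keeps the stacks apart.
    project = "stand-" + "".join(c if c.isalnum() else "-" for c in branch.lower())
    if action == "up":
        return ["docker-compose", "-p", project, "up", "-d"]
    if action == "down":
        return ["docker-compose", "-p", project, "down", "--volumes"]
    raise ValueError(f"unknown action: {action}")


def main(argv=None) -> int:
    """Parse the command line and run (or just print) the compose command."""
    parser = argparse.ArgumentParser(description="Raise or remove a test stand")
    parser.add_argument("action", choices=["up", "down"])
    parser.add_argument("branch")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the command instead of running it")
    args = parser.parse_args(argv)
    command = compose_command(args.action, args.branch)
    if args.dry_run:
        print(" ".join(command))
        return 0
    return subprocess.call(command)
```

For example, main(["up", "feature/JOB-1", "--dry-run"]) prints the command that would raise the stand for that branch without touching Docker.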
Obviously, this is not very convenient for testers. So one option is, for example, a special build that assembles a test stand and launches it.
But the coolest step in this plan is to add such a button directly to the JIRA ticket. Imagine: your tester opens a JIRA ticket, reads the task, presses a button and gets a test stand in a couple of minutes. Great, isn't it?
Pros
Initially we planned this for manual testing, so that a tester could launch any version of the code and click through it, and that worked great.
The next bonus is that we got demo hosts. It is the same thing, except that a project manager, not a tester, opens the JIRA ticket. The manager can also look at the raw code and click through it.
We got a huge plus for CI. Once we taught CI to raise a test stand for a specific version of the code in the same way and then delete it, we were able to run absolutely any tests for any branch. Even the most complex Selenium UI tests can be run for any branch in the project with one click.
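The CI cycle described here (raise a stand, run the tests, collect logs, remove the stand) can be sketched with a context manager. The command names raise-stand, collect-logs and remove-stand are hypothetical stand-ins for whatever scripts actually exist, and `run` is any callable that executes a command, for example subprocess.check_call.

```python
from contextlib import contextmanager


@contextmanager
def temporary_stand(branch: str, run):
    """Raise a stand for `branch` and always clean it up afterwards."""
    name = f"stand-{branch}"
    run(["raise-stand", name])          # hypothetical command
    try:
        yield name
    finally:
        # Logs are collected and the stand is removed even if tests fail.
        run(["collect-logs", name])     # hypothetical command
        run(["remove-stand", name])     # hypothetical command


def run_suite_for_branch(branch: str, run, suite):
    """Run `suite` (a callable taking the stand name) against a fresh stand."""
    with temporary_stand(branch, run) as name:
        return suite(name)
```

The try/finally is the important design choice: a failing test run must not leave a forgotten stand behind on the server.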
Cons
But there are also downsides, or rather nuances, in managing the stands: you have to learn how to manage them.
We rolled out the first version of our system on an old, weak server and set it up to create a test stand for every new branch. Of course, after about a day and a half the server went down, simply because there are a lot of branches.
Then we stopped creating stands automatically; a button appeared in JIRA, and CI learned to start and stop test stands and to collect logs from them.
There is definitely a class of tasks that this system cannot solve. For example, a problem that often comes up is that all containers share the same system time. Some tasks would be convenient to test by shifting the time on the server, and this system, unfortunately, does not allow that. Such tasks can be handled instead by adding special test-only branches to the code, so that, for example, you can see how a form behaves two weeks later.
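One possible reading of such a test-only branch in the code is a clock that can be shifted through configuration instead of changing the server time. The sketch below is in Python purely for illustration (the project itself is a PHP monolith); the TEST_TIME_SHIFT_DAYS variable and the two-week form lifetime are assumptions.

```python
import os
from datetime import datetime, timedelta, timezone


def current_time(environ=os.environ) -> datetime:
    """Return the current UTC time, shifted by TEST_TIME_SHIFT_DAYS if set.

    The variable name is hypothetical; production would simply not set it.
    """
    shift_days = float(environ.get("TEST_TIME_SHIFT_DAYS", "0"))
    return datetime.now(timezone.utc) + timedelta(days=shift_days)


def form_is_expired(created_at: datetime, now: datetime = None,
                    lifetime: timedelta = timedelta(days=14)) -> bool:
    """Tell whether a form created at `created_at` has outlived `lifetime`."""
    now = now or current_time()
    return now - created_at >= lifetime
```

Setting the variable on one stand shifts time only for that stand's application code, so other stands on the same server are not affected.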
Summary
At the start, we had a test stand system that made us hunt for a free stand and gave no guarantee that people would not interfere with each other on these stands.
Before: "Vasya, which test stand is free? I need to roll out my task to test it."
At the end, we have a single button that you click to get, in a couple of minutes, a ready-made test stand for a specific version of the code. Even if several people use this stand, it is guaranteed that all of them want to look at this particular version of the code.
After: "I press the button and a minute and a half later I get a new test stand for a specific task."
As a bonus, we got all tests in one click. As I said, any tests on any branch can be launched directly from CI with one button. The machine then does everything by itself: raises the test stand, runs the tests against it, collects the logs and removes it.
Returning to my first question: how many test stands do we need? I do not know, because today we need 20, tomorrow 15, and the day after tomorrow 25.
But I know for sure that we have exactly as many test stands as we need here and now.
Time flies, and not much time remains before the RIT++ conference festival, which, as a reminder, will take place on May 28 and 29 in Skolkovo. Taking this opportunity, here is a small selection of RootConf talks for a wide audience:
Daniel Migalin (Microsoft). Automatic, reliable and manageable deploy with simple tools.