How to perform many UI tests in parallel using Selenium Grid?

Hello! I work at Avito, where I develop testing tools. When our UI test suite grew large, we ran into the problem of scaling Selenium servers. Here is how we solved it.

So how can you run many UI tests in parallel using Selenium Grid? Unfortunately, you can't.
Selenium Grid is not able to handle a large number of tasks in parallel.
Want to register a really large number of nodes? Well, try it.
Want speed? You will not get it: the more nodes are registered on the grid, the less stably each test runs. The result is reruns.
Want fault tolerance in case the Grid stops responding? No luck either: you cannot run several replicas and put a load balancer in front of them.
Want to update the Grid without downtime, so that the tests currently running do not fail? No, that is not something Selenium Grid offers.
Want to avoid keeping thousands of Selenium instances of different configurations in memory and start them on demand instead? That will not work.
Want to know how to solve all these problems? Then I invite you to read this article.
* (I gave a talk with the same title at Heisenbug 2017 Moscow, so some readers may already be familiar with it. Below the cut is a more detailed text version of the story about the tool.)

A short digression on how the Selenium server works.

To start controlling a browser, you send a create-session request to the Selenium server.
As a result, a browser is opened on the node and a sessionId token is returned to you. You control the browser by sending this token with every subsequent request.
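To make the exchange concrete, here is a minimal sketch of the two kinds of requests involved, using the W3C WebDriver request format. It only builds the requests and does not send them. The server address `http://127.0.0.1:4444/wd/hub` and the session id `abc123` are placeholders assumed for illustration, and the helper names are hypothetical.

```python
import json


def new_session_request(server_url, browser_name):
    """Build the request that asks a Selenium server to open a browser."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return "POST", f"{server_url}/session", json.dumps(body)


def navigate_request(server_url, session_id, page_url):
    """Build a follow-up command; the sessionId travels in the URL path."""
    body = {"url": page_url}
    return "POST", f"{server_url}/session/{session_id}/url", json.dumps(body)


# Placeholder values, assumed for illustration only.
SERVER = "http://127.0.0.1:4444/wd/hub"
print(new_session_request(SERVER, "firefox"))
print(navigate_request(SERVER, "abc123", "https://example.com"))
```

Every command after session creation carries the sessionId, which is why whoever sits between the client and the node has to know which node owns which session.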

Okay, so why do we need Selenium Grid? Selenium Grid provides a single entry point for working with many Selenium servers of different configurations:

it lets you create a session on a free node that matches your filtering criteria, for example the browser version;
it stores information about which session is open on which node and proxies all requests to the target node, so for the client there is no difference between working with a single node and working with a grid.
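The two responsibilities above can be sketched as a toy hub. This is not Selenium Grid's code, only an illustration under simplifying assumptions: capabilities are compared by exact equality, and the session table is a plain dictionary inside the process. The class and method names are hypothetical.

```python
import uuid


class ToyHub:
    """A toy model of a hub: capability matching plus a session routing table."""

    def __init__(self):
        self.nodes = []     # registered nodes with their capabilities
        self.sessions = {}  # session_id -> node address, kept in process memory

    def register(self, address, capabilities):
        self.nodes.append(
            {"address": address, "capabilities": capabilities, "busy": False}
        )

    def create_session(self, desired):
        """Pick the first free node whose capabilities match every desired key."""
        for node in self.nodes:
            matches = all(
                node["capabilities"].get(key) == value
                for key, value in desired.items()
            )
            if matches and not node["busy"]:
                node["busy"] = True
                session_id = uuid.uuid4().hex
                self.sessions[session_id] = node["address"]
                return session_id
        return None  # no free matching node

    def route(self, session_id):
        """Return the node that should receive requests for this session."""
        return self.sessions.get(session_id)
```

Note that `sessions` lives only in the memory of one process; several of the problems described next follow from that detail.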

Wonderful tool, right?

But when using it, we encountered a number of problems.

1. Unpredictable behavior
In short: whatever it wants will fail whenever it wants, and you cannot influence it.

We very often ran into situations where tests worked perfectly in a single thread but failed unpredictably when run in multiple threads through the grid.
Periodically, tests were simply not dispatched to some of the nodes, even though those nodes were physically available, and a queue of tests built up on the grid. As a result, half of the release suite failed by timeout.

2. Lack of support for a large number of nodes
If you try to register many nodes (and we want many nodes), registration will succeed, but testing the application in many threads will still not work, because most of the tests will fail.

3. Scalability
Once you reach the limit of N nodes per Selenium Grid beyond which stability suffers, the first thing that comes to mind is to take two, three, five, or even ten grids, register N nodes on each, put a load balancer in front of them all, and run tests in 10 * N threads. But no, Selenium Grid does not work that way, because all the information about nodes and sessions is stored in the memory of one particular hub instance and is not shared between instances. The next problem is closely related to this.

4. Fault tolerance
If you turn off the machine where the hub runs, all tests die immediately, because there are no backup hubs that could take over the subsequent requests: again, everything is stored in memory. And this does not scale at all (of course, you can always rewrite a couple of the grid's classes, but more on that later). The weak point is the Selenium Hub: when it goes down, the nodes become inaccessible.
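Both the scalability and the fault-tolerance problems come from session state living in one process. The sketch below illustrates this with two toy hubs behind a naive round-robin balancer. It is a simplified model, not Selenium Grid's code; the class name, hub names, and node address are made up for illustration.

```python
import itertools


class InMemoryHub:
    """A toy hub whose session table exists only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session_id -> node address

    def create_session(self, node_address):
        session_id = f"{self.name}-session-{len(self.sessions) + 1}"
        self.sessions[session_id] = node_address
        return session_id

    def route(self, session_id):
        return self.sessions.get(session_id)


def simulate_round_robin():
    hubs = [InMemoryHub("hub-a"), InMemoryHub("hub-b")]
    balancer = itertools.cycle(hubs)  # naive round-robin load balancer

    first = next(balancer)            # the create-session request lands here
    session_id = first.create_session("10.0.0.5:5555")

    second = next(balancer)           # the next command lands on the other hub
    return first.route(session_id), second.route(session_id)


print(simulate_round_robin())
```

The hub that created the session can route it, while the other replica has never heard of it. Losing the first hub loses the only copy of the routing table in the same way.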

5. No way to create nodes dynamically using a container orchestration system
If testing requires many different node configurations with different browser setups, another problem arises: this whole zoo takes up a lot of memory. Suppose you have 300 nodes with Google Chrome (150 GB RAM), 300 nodes with Firefox (150 GB RAM), and another 200 nodes with some Firefox Nightly build with magic plugins (100 GB RAM). That is 400 GB of RAM permanently occupied. On top of that, you want to redistribute nodes efficiently during the day: say, fill all 400 GB with seven hundred Chrome nodes while one suite is running, and flexibly replace them when tests with other needs appear in the queue.

Docker is ideal for this task, since it lets you quickly start a container with a fresh Selenium and kill it just as quickly once the test is finished. And since we need a lot of Selenium instances and they do not all fit on one physical server, the containers have to be orchestrated across a cluster. There are several popular solutions for this on the market; we use Kubernetes. You can listen to why we chose Kubernetes here. Selenium's standard tools do not solve this problem.

6. No way to update or restart the grid without downtime
This is another consequence of storing sessions in memory. It is not a critical drawback, but it is still unpleasant.

Everything above describes the situation we once found ourselves in.

Known solutions

Grid Router and its newer implementation in Go, Go Grid Router, are a good solution, but unfortunately far from perfect. The main ~~problem~~ feature is that it is not a replacement for Selenium Hub; it is another proxy on top of it.

Hence the name Grid Router: it manages not nodes, but grids. This brings some downsides.

An attempt to create a new session is made not on a grid that has free nodes, but on a random one (you can control the distribution of the random choice with weights). If one grid fails to create a session, the request goes to the next one, and so on until the grids run out. As a result, creating a new session can take a considerable amount of time.
If one of the Selenium hubs goes down, all information about its sessions is lost and its nodes are cut off from the network, because all interaction still goes through the hub and session data is stored in the hub.
It is quite difficult to add another hub to the system, because hub data is stored in XML files and file synchronization is triggered by an operating system signal. There are no transactions, and that is bad.
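The first downside, weighted random selection with fallback, can be sketched as follows. This is not Grid Router's actual code. It assumes that the "next" grid is again picked at random among the remaining ones, and the hub names and the `try_create` callback are hypothetical.

```python
import random


def create_session_via_router(hubs, try_create, rng=random):
    """Try hubs in weighted random order until one creates a session.

    hubs:       list of (hub_name, weight) pairs
    try_create: callable(hub_name) -> session id, or None on failure
    Returns (session_id or None, list of hubs tried in order).
    """
    remaining = list(hubs)
    attempts = []
    while remaining:
        names = [name for name, _ in remaining]
        weights = [weight for _, weight in remaining]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        attempts.append(chosen)
        session_id = try_create(chosen)
        if session_id is not None:
            return session_id, attempts
        # This hub failed: drop it and pick again among the rest.
        remaining = [(n, w) for n, w in remaining if n != chosen]
    return None, attempts


# Hypothetical setup: only hub-c has free nodes, yet it has the lowest weight.
hubs = [("hub-a", 5), ("hub-b", 3), ("hub-c", 1)]
only_c_is_free = lambda name: "session-1" if name == "hub-c" else None
print(create_session_via_router(hubs, only_c_is_free))
```

Because the router does not know which grid has free nodes, the only grid that can serve the request may well be tried last, and every failed attempt adds to the session creation time.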

Selenoid is a tool for running tests in Docker containers. On each create-session request, a fresh container is started, and it is deleted when the session is closed. The tool is wonderful, but there are downsides:

does not support any orchestration system;
still stores session information in memory, and, as a result, has problems with scaling and fault tolerance.

When we ran into all these problems, we decided to look at the experience of other companies. Yandex wrote in a blog post on Habrahabr that it is impossible to register many nodes and work with them, and that they use Grid Router to solve this problem. For our tasks, Grid Router is not suitable.

Alfa-Bank wrote that their grid hangs if it is not used for a while, and our experience confirms this: we ran into the same thing regularly.
Of course, we also looked at Selenium's GitHub, where we found several issues... Here is an example of the authors' attitude to what is happening:

Q: “selenium-grid version 3.0+ support hub high availability?”
A: "If you’re a hub."

We realized there was nothing to hope for and set out to solve our problems ourselves.

Study

We decided to start with the simple route: we launched a number of Selenium instances in the Kubernetes cluster, put their IPs into a database, and queried the database directly in the test's setUp(). There we took the IP that had gone unused the longest and ran the test against it, never storing the sessionId and never locking nodes. Since the number of test workers was smaller than the number of Selenium instances, overload should not occur.
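A minimal sketch of that selection step is shown below. It uses an in-memory SQLite database as a stand-in, since the text does not say which database or schema was used; the table layout, function names, and IP addresses are assumptions for illustration. A logical timestamp is passed in explicitly to keep the example deterministic.

```python
import sqlite3


def setup_db(ips):
    """Create a table of Selenium node IPs with a 'last used' timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (ip TEXT PRIMARY KEY, last_used INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO nodes (ip, last_used) VALUES (?, 0)", [(ip,) for ip in ips]
    )
    conn.commit()
    return conn


def take_least_recently_used(conn, now):
    """Return the IP that has gone unused the longest and mark it as used now."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT ip FROM nodes ORDER BY last_used ASC, ip ASC LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE nodes SET last_used = ? WHERE ip = ?", (now, row[0]))
    return row[0]


# In a real test this call would sit in setUp(); here it is called directly.
conn = setup_db(["10.0.0.1", "10.0.0.2"])
print(take_least_recently_used(conn, now=1))
print(take_least_recently_used(conn, now=2))
print(take_least_recently_used(conn, now=3))
```

Nothing here locks a node, so the approach relies on there being fewer test workers than Selenium instances, as the text notes.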

This solution immediately proved viable.

We got:

predictable behavior;
fault tolerance at the database level;
scalability;
support for a large number of nodes;
upgrade without stopping the tests, because it's just the code that you have in the repository, and it starts when the tests run.

But we also ran into a number of problems:

no support for the Capabilities selection mechanism;
no convenient /grid/register mechanism;
no portability: the system no longer works as a service, it depends on one programming language and lives in the same repository as the tests.

The last problem is the most important one, because if you build this into the code of a test framework, you automatically have to maintain it in each of your test frameworks, in all repositories and in all the languages used.

The most important result of this experiment was the experience we gained. We saw that a Selenium Grid can be implemented properly.

Final decision

First of all, we considered forking Selenium or sending pull requests to it. But after a closer look at the project's code, we realized that it would be cheaper and more reliable to write our own tool from scratch.

Let's list again what we want from a new tool:

predictability of behavior;
fault tolerance;
scalability;
portability;
support for a large number of nodes;
Capabilities support;
on-demand nodes in Kubernetes;
collecting metrics in statsd;
the /grid/register mechanism;
upgrade without stopping tests.

What we ended up with:

an application that solves all the above tasks;
a cross-platform application, tested on Linux and macOS;
written in Go;
stores data in MySQL.

As a result, we managed to solve all the problems. The application is written in Go and is itself stateless: sessions are stored in MySQL, and support for any other database would not be hard to add. On-demand container creation is implemented only for Kubernetes, but you can send pull requests that implement the container creation and deletion methods for any other system. Go compiles for many platforms, but it was enough for us to verify that it works on Linux and macOS; in theory, other systems should not be a problem.
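The idea of a stateless application over a shared session store can be sketched as follows. This is not the tool's code (which is written in Go and uses MySQL). It is a simplified Python model with SQLite standing in for the shared database, and the table layout and all names are assumptions for illustration.

```python
import sqlite3


class SessionStore:
    """Session routing table kept in a shared database, not in process memory."""

    def __init__(self, conn):
        self.conn = conn
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS sessions ("
                "session_id TEXT PRIMARY KEY, node_address TEXT NOT NULL)"
            )

    def save(self, session_id, node_address):
        with self.conn:
            self.conn.execute(
                "INSERT INTO sessions (session_id, node_address) VALUES (?, ?)",
                (session_id, node_address),
            )

    def lookup(self, session_id):
        row = self.conn.execute(
            "SELECT node_address FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        return row[0] if row else None


class StatelessReplica:
    """One instance of the application; it holds no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def create_session(self, session_id, node_address):
        self.store.save(session_id, node_address)

    def route(self, session_id):
        return self.store.lookup(session_id)


shared = SessionStore(sqlite3.connect(":memory:"))
replica_a = StatelessReplica("a", shared)
replica_b = StatelessReplica("b", shared)
replica_a.create_session("session-1", "10.0.0.5:5555")
print(replica_b.route("session-1"))
```

Because any replica can route any session, replicas can sit behind a load balancer and be restarted or upgraded one at a time while tests keep running.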

Now the main question: how many lines of test code did we have to rewrite when switching to this tool? Who thinks 10,000? 1,000? 100? Zero! Nothing had to be rewritten; the tool is fully compatible. You just need to start the application and point your tests at its address, and that's it.

The result was the following scheme:

How to use it? There are 2 modes:

Persistent - everything works as before: start the Selenium server with the -role node parameter and specify the hub address; the node registers itself and is ready for use:

java -jar selenium-server.jar -role node -hub http://127.0.0.1:4444/grid/register

On-demand - in the grid configuration, add the Docker images and information about which capabilities they provide. Then start the grid and request a session; the node is created in the cluster automatically.

...
"type": "kubernetes",
"limit": 20,
"node_list": [
  {
    "params": {
      "image": "myimage:latest",
      "port": "5555"
    },
    "capabilities_list": [
      {
        "browserName": "firefox",
        "browserVersion": 50
...
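To show how such a configuration could drive on-demand node creation, here is a sketch of the lookup step: given the capabilities requested by a test, find the container parameters to launch. This is not the tool's code. The example configuration only reuses the key names from the fragment above; its overall shape, the second image `otherimage:latest`, and the function name are hypothetical, and matching is assumed to be exact equality.

```python
# Hypothetical complete configuration, built from the key names in the fragment.
EXAMPLE_CONFIG = {
    "type": "kubernetes",
    "limit": 20,
    "node_list": [
        {
            "params": {"image": "myimage:latest", "port": "5555"},
            "capabilities_list": [
                {"browserName": "firefox", "browserVersion": 50},
            ],
        },
        {
            # Made-up second entry, for illustration only.
            "params": {"image": "otherimage:latest", "port": "5555"},
            "capabilities_list": [
                {"browserName": "chrome", "browserVersion": 60},
            ],
        },
    ],
}


def find_node_params(config, requested):
    """Return the container params of the first node type matching the request."""
    for node in config["node_list"]:
        for capabilities in node["capabilities_list"]:
            if all(capabilities.get(k) == v for k, v in requested.items()):
                return node["params"]
    return None  # no configured image provides these capabilities


print(find_node_params(EXAMPLE_CONFIG, {"browserName": "firefox", "browserVersion": 50}))
```

With the parameters found, the grid would ask the orchestration system to start a container from that image and then create the session on it.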

Total

We have been using this solution in production for quite a long time; it works and requires no maintenance. Along the way, we became convinced once again that you should not be afraid to reinvent the wheel. Popular solutions are not always good ones, so it is always worth exploring other ways to solve a problem.

Source: https://habr.com/ru/post/352208/
