Managing a farm of Android devices. A talk at Yandex
The more popular your application, the longer the list of devices it runs on. At some point this becomes a problem: some bugs mysteriously reproduce only on one specific model, and you have to test the product on a growing number of devices.
Device farms can solve the problem of supporting many devices. The talk explains what these farms are and how to integrate them into the development and testing process.
- My name is Pavel Novikov, and I work at New Cloud Technologies. We are building the MyOffice product, and my team makes the Android version of this office suite. The application itself is very large. First I will describe how it is structured architecturally, so that we can then move on to why we needed a farm. That way you can judge whether you are in the same situation, with a lot of code and external dependencies, or not, and decide more precisely whether a farm is something cool that you need. Or not.
The previous speaker said that Flutter uses Skia inside. We use it too: everything is native, and all documents are rendered entirely with Skia. It is a very fast, good library.
At the center of the MyOffice Documents application is a native core, CORE, which is developed inside our company by a separate team. It is written in pure C++ so that it can be reused, and it is reused by all the teams that work on our product line.
That means desktop, web, Android, and iOS. There is even a Tizen version, fully functional and done quite successfully.
The most important point is that the people who build CORE are closely tied to everyone who uses it. We often ask for functionality that is useful on mobile clients, on iOS and Android, but not so useful on desktop. That happens.
So it is quite predictable that bugs sometimes appear: the desktop team asked for a function, and implementing it slightly changed the behavior of another function that Android uses. This is a fairly common situation, and our job is to find such places quickly, help the CORE team, and work together to fix the bugs quickly, for example on all mobile platforms.
There are three components: the native CORE, the Android library that wraps this CORE, and our Java application. The native CORE is pure C++. It goes as source code into an Android module, where all the bindings to the native code are made and thin layers with some logic are added. All the rendering that uses Skia is also done in this module. The module is then included in our application as a standard AAR library, and we use it.
Anything can happen: there may be bugs in the native code, in CORE, or in the bindings. A typical example is UnsatisfiedLinkError, when a Java class is mapped to the native class incorrectly and the application crashes on a call to a native method. And there are bugs in our own Java code.
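To make the binding failure concrete, here is a minimal sketch of a check that a native library loads at all. It only covers the loading step; an unresolved native method would throw the same error later, at the first call. The class name and the library name in the usage are hypothetical, not taken from the application.

```java
/** Minimal sketch: report whether a native library can be loaded by name. */
final class NativeLoadCheck {
    private NativeLoadCheck() {}

    /**
     * Returns true when System.loadLibrary succeeds and false when the
     * runtime cannot find or link the library. The caller supplies the
     * library name; any name used with this sketch is hypothetical.
     */
    static boolean canLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library is missing or cannot be linked.
            return false;
        }
    }
}
```

A test that calls `NativeLoadCheck.canLoad(...)` on each supported device would surface this class of problem before it reaches manual testing.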
Most importantly, we want to understand as early as possible where the problem is, so that it is easier to fix. Bugs in the bindings are best caught immediately, before they reach the UI and manual testing. That saves time, and these are fairly simple errors that are easy to find and fix.
For that we need tests, the best tool in this situation. Our task is to write code that makes the application work, and the easiest way to see that it works is to write tests for it. Tests come in different kinds, and we write all of them: unit, UI, and integration tests.
Having tests is not enough; they also have to be run. Tests that you wrote while developing the application and then forgot about just lie there unused while the code keeps changing. Two months pass, you decide to run them, and they do not even compile. That is wrong: tests should work all the time.
Tests should run regularly on CI. Otherwise they are meaningless. If your tests just lie there, you have no tests.
Unit tests are simple. You run the Gradle test task on CI and everything is fine: the unit tests of your application run and you see the reports.
Android tests run as connectedAndroidTest. They are the same kind of tests as unit tests, except that they run on devices. And here a problem arises: UI and integration tests should run on real devices, and a CI server is not a real device. There are several ways to solve this.
For example, you can connect a device to the CI server. I have not tried it myself, but it should work, why not. The server stands on the next desk or under it, you plug devices into it, the system sees them, and everything runs.
You can run an emulator on CI. This is a workable option; Jenkins, for example, has a plugin that can start an emulator. The problem is that the emulator will most likely be an x86 emulator. And we are talking about integration tests. In our case, integration means external dependencies, in particular native code, because we have a lot of it. By integration tests I mean tests that verify the logic of the C++ code.
So it can be done, but not conveniently. Connecting devices directly is not very convenient, and the emulator will not help us. You could use the old ARM emulator, but that is a so-so idea.
This is where the Open STF project comes into play.
The idea: we have a lot of devices, so let's connect them all to one computer and learn to reach that computer, so that we can work with all the devices centrally.
It looks like this. The picture is taken from the project itself. A list of devices is available, and we can connect to any of them and work with it fully. What you see is a live device connected to the farm, which can be fully controlled through the web interface.
Open STF is open source, and it has several advantages. First of all, you work with real devices. Like most Android developers, you understand that your code should be checked on devices. An emulator is good, but many things need to be checked on real hardware: native code, for instance, or SSL. Many things can differ. A farm solves this problem.
What is nice is that you do not need root on the device to work with the farm. You simply connect the device to the farm, and it becomes available.
Debugging is convenient. Nothing prevents you from connecting to a device with adb over the network, by plain IP address, and working with it as if it were plugged into your own computer. You always see the device screen and can interact with it, just with the mouse instead of a finger.
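As an illustration of the network connection, the sketch below builds the standard `adb connect <host:port>` command for a remote device. The class name, the adb path, and the address in the usage are placeholders; the real address is whatever the farm shows for the device.

```java
import java.util.List;

/** Minimal sketch: prepare an "adb connect" invocation for a remote device. */
final class AdbRemote {
    private AdbRemote() {}

    /** Builds the argument list for: adb connect host:port */
    static List<String> connectCommand(String adbPath, String hostAndPort) {
        return List.of(adbPath, "connect", hostAndPort);
    }

    /** Wraps the command in a ProcessBuilder; the caller decides when to start it. */
    static ProcessBuilder connectProcess(String adbPath, String hostAndPort) {
        return new ProcessBuilder(connectCommand(adbPath, hostAndPort)).inheritIO();
    }
}
```

After the process completes successfully, the device should appear in `adb devices` like a locally attached one.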
Several teams can use the same devices. Ours is an interesting case: the company has a large fleet of devices and several teams that work with them. First, developers and testers: we all sit side by side and need devices for our work. The second team is the automation engineers. They sit on another floor and write automated tests with their own technology stack for every platform, including Android. They also have access to all the devices in the company. The third is the support service. They need as many devices as possible too: when users report problems in the product, support has to reproduce them, and the problems vary. Because they have access to all the company's devices, they can run the application on the right one and respond faster.
Working remotely is convenient. A pleasant consequence that our QA engineers use: if someone is unwell or cannot come to the office, they can still work. They open the farm and work as usual, as if the device were lying next to them.
We work by Scrum and hold demos periodically. For about a year now, all demos of the Android team have been done only on the farm. It is really convenient when you need to show several completed features or stories: you broadcast the device screen, show one story, then for a tablet story switch to another device and show the tablet. This saves time and is more convenient than handling physical devices.
There is a REST API, so you can think about automation.
All devices are in order, in one place, and always charged. Otherwise it happens that you need to reproduce something on a particular device, but it is lying who knows where, discharged... A nice bonus.
Like any project, it has drawbacks. Not all devices are supported, and I cannot give exact rules. Sometimes you connect a device to the farm and it is not detected. This is very rare; we had literally one or two such devices, and about 95% of our fleet works fine. The exceptions were some Chinese device that was not detected, and one device with an x86 processor, for reasons unknown.
Updating is not very convenient. I mean upgrading STF itself: since it is open source, our own team is responsible for updating it within the company. It is not a one-click update. But nothing is impossible; since it is open source, you can streamline the process, and the problem is not critical.
Exposing it outside the internal network is undesirable. Our farm runs inside the network, and it is better not to expose it to the Internet, because the farm gives essentially full access to the device with no restrictions: you can delete anything and install anything. Whatever you can do with a real device in your hand, you can do through the farm. So it is better kept for internal use.
What does it look like? From the outside, there is a server where the farm itself runs and a web panel that you access.
The web panel knows about nodes. Each node is a computer with devices connected to it. There can be several nodes, which answers the scaling question. The devices are connected not to the machine that runs the web panel but to the nodes. In our case there are two nodes in our team and one more in support, simply because the devices are closer to them. Physically the devices are not all in one place, but all are accessible through a single interface that looks like this.
For each device the list shows the product name, the OS version, the SDK level, and the architecture. It also shows the location, the provider I mentioned. There are two providers: our devices and the support team's devices. We try not to touch the latter; they are their devices, merely accessible through the same interface.
The project has full instructions on how to set it up, run it, and work with it. The pros and cons are described, and the project is documented quite well.
The problem was to make these two things work together: the tests and the farm.
The CI server can be any service you like. We use Jenkins, so my interface examples are from Jenkins, but you are not tied to anything.
And there is the STF server: the server itself, the providers, and the devices.
How do we combine them? The obvious and simplest way is a Gradle plugin that lets you configure the connection to the farm when running tests.
What does it do? Fairly basic things. It selects the devices you need for the test run, connects to them before the run, and releases them when it finishes, because it is not good to keep devices locked.
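The connect, run, release cycle can be sketched with try-with-resources, which guarantees the release step even when the tests throw. This is an illustration of the idea only, not the plugin's code: `FarmSession` and `FarmRun` are hypothetical names, and the session merely records what it would do.

```java
import java.util.List;

/** Hypothetical stand-in for a farm connection; it only records its actions. */
final class FarmSession implements AutoCloseable {
    private final List<String> serials;
    private final List<String> log;

    FarmSession(List<String> serials, List<String> log) {
        this.serials = List.copyOf(serials);
        this.log = log;
        for (String serial : this.serials) {
            log.add("connect " + serial);   // a real client would reserve the device here
        }
    }

    @Override
    public void close() {
        for (String serial : serials) {
            log.add("release " + serial);   // a real client would free the device here
        }
    }
}

final class FarmRun {
    private FarmRun() {}

    /** Connects, runs the tests, and always releases the devices afterwards. */
    static void runWithDevices(List<String> serials, Runnable tests, List<String> log) {
        try (FarmSession session = new FarmSession(serials, log)) {
            tests.run();
        }
    }
}
```

If `tests.run()` throws, `close()` still runs before the exception propagates, so devices are not left locked by a failed run.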
What is the right device? Through the plugin you can flexibly configure exactly which devices you need. You can filter by name, taking only Nexus or Samsung devices, and choose how many devices you want. It may be a small test run: you say you want to run on two devices and make sure nothing is broken. Or it may be a nightly run that takes all the devices, runs everything, and reports everything.
Architecture. Sometimes you need to run tests on a specific architecture. Such cases exist, though it is rarely necessary.
The provider filter is useful to us: we promised not to touch the support team's devices, so as not to break anything or get in each other's way. We can say: do not take devices from support.
It is also useful to filter by API level. If for some reason you want to run a test on API 21 and higher, you can.
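The filters described above (name, count, architecture, provider, API level) can be sketched as a chain of predicates over a device list. The `Device` and `DeviceQuery` classes and their field names are hypothetical and do not reflect the plugin's actual configuration syntax.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical description of one farm device. */
final class Device {
    final String name;
    final String abi;
    final String provider;
    final int sdk;

    Device(String name, String abi, String provider, int sdk) {
        this.name = name;
        this.abi = abi;
        this.provider = provider;
        this.sdk = sdk;
    }
}

/** Hypothetical query: every unset filter matches all devices. */
final class DeviceQuery {
    String nameContains = "";            // e.g. "Nexus"
    String abi = null;                   // null means any architecture
    String excludedProvider = null;      // null means no provider is excluded
    int minSdk = 1;                      // lowest acceptable API level
    int count = Integer.MAX_VALUE;       // maximum number of devices to take

    List<Device> select(List<Device> all) {
        return all.stream()
                .filter(d -> d.name.contains(nameContains))
                .filter(d -> abi == null || abi.equals(d.abi))
                .filter(d -> excludedProvider == null || !excludedProvider.equals(d.provider))
                .filter(d -> d.sdk >= minSdk)
                .limit(count)
                .collect(Collectors.toList());
    }
}
```

A quick check and a nightly run then differ only in the query: the first sets `count` to two, the second leaves every filter at its default.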
Connecting it is simple. Like any Gradle plugin, it is integrated with the usual syntax: write apply plugin, and its tasks become available.
Then comes the next step. For the plugin to work, you need to attach its task to the test task that will run on CI. That is how it is done now. It may be inconvenient, but that is how it is, and improving it is not a problem. The main thing is to attach to connectToSmartphoneTestFarm. This is the main task; it is responsible for connecting to the devices and releasing them.
The third step is setting the farm parameters. baseUrl is the address where the farm is located. apiKey is the key for connecting over REST; it is configured in the farm console. adbPath is needed so that adb connect can be run for every device found. timeout is a system setting, one minute by default; the farm itself uses it to release devices if for some reason they are not being used.
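To show how baseUrl and apiKey might be used together, here is a minimal sketch that builds a request for the farm's device list with the Java 11 HTTP client. The `/api/v1/devices` path and the Bearer token header follow the Open STF API documentation as I understand it, so check them against your farm's version. The class name and the values in the usage are placeholders.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Minimal sketch: build (not send) a request for the farm's device list. */
final class FarmRequests {
    private FarmRequests() {}

    static HttpRequest listDevices(String baseUrl, String apiKey) {
        // Assumed endpoint and auth scheme; verify against the farm's API docs.
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/devices"))
                .header("Authorization", "Bearer " + apiKey)
                .GET()
                .build();
    }
}
```

Sending it is then a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` from inside the internal network.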
This is what a test run using the farm looks like. We say that connectedDebugAndroidTest should run all your tests, and a parameter says not to use the support provider; the tilde means negation here. Then we say that we want five devices and that all of them must be SDK 21, that is, Lollipop, or higher.
This is how it looks when configuring a job in Jenkins. These parameters are set here and passed on. This is not part of the plugin; you create the Jenkins job yourself. You do not have to expose all these parameters: you can make a job where they are hard-coded and just press the build button if you do not want to bother. Maybe we will do the same in the future.
As a result, after all tests have run you see the standard JUnit HTML report, with one difference: you can see that the tests ran on different devices. You see the names of all the tests you ran and that each ran on every device. You even see how long they took, which you can use for analysis, for example to look for regressions. There is room for imagination: you can write a test that runs the same code a hundred times and measures it, and see whether the code runs faster or slower on x86 or on ARM. The farm helps here, so that you do not have to connect devices by hand; it happens automatically.
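The "run it a hundred times and measure" idea can be sketched as a small helper. It is deliberately naive: there is no warm-up and no statistics beyond the mean, so treat the numbers as a rough comparison between devices rather than a benchmark. `RepeatTimer` is a hypothetical name.

```java
/** Minimal sketch: time repeated runs of the same action. */
final class RepeatTimer {
    private RepeatTimer() {}

    /** Runs the action the given number of times and returns the mean duration in nanoseconds. */
    static long meanNanos(int runs, Runnable action) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            action.run();
        }
        return (System.nanoTime() - start) / runs;
    }
}
```

A test could call `RepeatTimer.meanNanos(100, ...)` around a native call and log the result, so the per-device report shows how the same code behaves on each architecture.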
The plugin itself is available on GitHub. The documentation is simple, but at least it exists. It is easy to connect, and all feedback is welcome. We wrote the plugin for ourselves: its absence was the only reason we could not use the farm properly. Now we finally can, and we are happy about it.
It is worth mentioning that Gradle is not used everywhere, for objective reasons. Take Appium, for example. I mentioned earlier that we have a team of automation engineers who write their tests with Appium. There is no trace of Gradle there, but they need to use the farm too.
Or it may be the terminal. A crash happened on a device in the farm, and it would be good to pull a piece of the log from it, download a file, whatever. What to do? Either take the device and plug it in, and then all the magic of the farm is lost, or use some additional client.
We developed a simple tool that does the same things but works from the terminal. You can connect to devices, disconnect from them, list them, and connect them so that they are available in adb. This command says: we need five Nexus devices; when you find them, connect to all of them. After the command finishes, five devices are available in adb, and you can do whatever you want from the terminal. The main advantage is that it is faster than doing it by hand. It is also available on GitHub.
Technically, the Gradle plugin and the client share our STF client library. All of it is written in Java. There are plans to add a plugin for Android Studio, so that devices can be selected directly from the IDE's UI while you work. By my own estimate, I have not touched a device with my hands for the past six months. The devices are in the farm; I open one through the web interface, copy its address from the farm, connect with adb, and never pick the device up, out of laziness. When I connect to another device, I work with that one.
I do not feel much discomfort during development, and I am gradually noticing that other developers in our team do about the same. Only QA object; they say it is inconvenient.
A slight digression, but also about tests. I mostly use the farm in the context of testing. UI tests are not integration tests. To some this may sound like stating the obvious, but it means that UI tests may depend on the device. I mean UI tests that should run only on smartphones because they test a smartphone screen. On a tablet they make no sense, and vice versa: there are tests specific to tablets. On a smartphone those either should not run or should run in some modified form. If you run them anyway, you get a failed test, and the result is a false alarm that gets in the way. In theory tests should either pass, and all is well, or fail for objective reasons. If they fail for reasons that are not objective, it hinders the process and information is lost.
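One way to decide whether a device-specific UI test should run is to classify the device by its smallest screen width in dp. The sketch assumes the common Android convention that a smallest width of 600 dp or more means a tablet (the `sw600dp` resource qualifier); the threshold and the `FormFactor` helper are assumptions of this example, not something the farm or the plugin provides.

```java
/** Minimal sketch: classify a screen as phone or tablet by smallest width. */
final class FormFactor {
    /** Assumed threshold, following the sw600dp resource qualifier convention. */
    static final int TABLET_MIN_SMALLEST_WIDTH_DP = 600;

    private FormFactor() {}

    /** Converts the shorter side from pixels to dp: dp = px * 160 / dpi. */
    static int smallestWidthDp(int widthPx, int heightPx, int densityDpi) {
        return Math.min(widthPx, heightPx) * 160 / densityDpi;
    }

    static boolean isTablet(int widthPx, int heightPx, int densityDpi) {
        return smallestWidthDp(widthPx, heightPx, densityDpi) >= TABLET_MIN_SMALLEST_WIDTH_DP;
    }

    /** A tablet-only test runs only on tablets, a phone-only test only on phones. */
    static boolean shouldRun(boolean tabletOnlyTest, int widthPx, int heightPx, int densityDpi) {
        return tabletOnlyTest == isTablet(widthPx, heightPx, densityDpi);
    }
}
```

A test that does not apply to the current device can then be skipped instead of failing, so the report contains only failures that carry information.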
The task is to separate them. This is what we ran into when we began integrating the whole process of working with the farm and all the automation described. We have both integration tests and UI tests.
There are several ways. The simplest ones are well known, and all of them work; choose whichever is more convenient.
You can write your own test runner that analyzes, for example, class names. It is a perfectly workable option: agree that test class names end in TestIntegration or TestUI, and the test runner sorts them out.
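The naming convention can be sketched as a small classifier that a custom runner could consult. Only the two suffixes come from the talk; the `TestKinds` class, the `OTHER` bucket, and the package names in the usage are hypothetical, and wiring this into an actual Android test runner is left out.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Minimal sketch: sort test classes by the agreed name suffix. */
final class TestKinds {
    enum Kind { INTEGRATION, UI, OTHER }

    private TestKinds() {}

    static Kind kindOf(String className) {
        if (className.endsWith("TestIntegration")) {
            return Kind.INTEGRATION;
        }
        if (className.endsWith("TestUI")) {
            return Kind.UI;
        }
        return Kind.OTHER;   // classes that do not follow the convention
    }

    /** Keeps only the class names of the requested kind, preserving order. */
    static List<String> only(Kind kind, List<String> classNames) {
        return classNames.stream()
                .filter(name -> kindOf(name) == kind)
                .collect(Collectors.toList());
    }
}
```

A run for integration tests would then pass `TestKinds.only(TestKinds.Kind.INTEGRATION, allClassNames)` to the runner and leave the UI tests for a separate, device-aware run.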
You can also tinker a little with Gradle.