Test automation: Acronis Kernel “drone”

( http://bp-la.ru/bespilotnyj-apparat-danem )

Build => Test => Failed => kilometers of logs scattered across different systems, and tens of minutes spent piecing things together in search of the cause of the failure. Sound familiar?

What if it went differently?

Build => Test => Failed => Ticket in JIRA, and the developer starts working on the bug right away, because the ticket already contains all the information needed.

While working in the Acronis Kernel team, I set out to create exactly this kind of autotest.
Read on for my story.

Introduction

Software testing is an investigation conducted to provide stakeholders (hereinafter Customers) with information about the quality of a product or service (from Wikipedia).
Different Customers look at test results in different ways:

The product manager looks at which product features are ready for release, and which ones will have to be simplified, postponed or dropped.
The test manager is interested in the details of the investigation: types of tests performed, code and requirements coverage, elapsed time, detailed results, as well as any failures, errors or difficulties that may compromise the objectivity of the test results.
The developer needs defects: clearly described, reproducible, and including all the information needed for a fix.
The tester receives the task, performs the test, analyzes the result and reports the bugs. Where possible, the tester expands test coverage by adding new tests or test environments.

The data should be available as soon as possible, ideally in real time, right after a new product build appears.

Now let us outline a workflow that satisfies these requirements:

Tests start as soon as a new product build is available;
The test execution time is determined by the development process adopted by the company, but must not exceed the average interval between new builds;
Detected errors are analyzed automatically: already known errors are attributed to existing defects, and the rest are recorded as new defects, within a few minutes of detection;
Test results are shown on the "Product Quality Card".
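
The execution-time requirement from the list above can be checked mechanically. Below is a minimal sketch, assuming the build timestamps are available as a list of datetimes. The function names and sample values are hypothetical illustrations, not part of the system described in this article.

```python
from datetime import datetime, timedelta


def average_build_interval(build_times):
    """Return the mean time between consecutive builds."""
    if len(build_times) < 2:
        raise ValueError("need at least two builds to compute an interval")
    ordered = sorted(build_times)
    # The sum of consecutive gaps equals last minus first.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def fits_between_builds(test_duration, build_times):
    """True if a test run does not exceed the average build interval."""
    return test_duration <= average_build_interval(build_times)


# Hypothetical sample: three builds during one day (gaps of 3 h and 5 h).
builds = [
    datetime(2016, 1, 11, 9, 0),
    datetime(2016, 1, 11, 12, 0),
    datetime(2016, 1, 11, 17, 0),
]
mean_gap = average_build_interval(builds)
short_run_ok = fits_between_builds(timedelta(hours=3), builds)
long_run_ok = fits_between_builds(timedelta(hours=5), builds)
```

With these sample values the mean gap is four hours, so a three-hour test run fits and a five-hour run does not.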

Here in the Acronis Kernel team we built such a process, though not right away, of course.
First I will tell you where we started.

Prehistoric

( http://spongebob.wikia.com/wiki/Primitive_Sponge )

Machinery

[bike] Control Center (CC) - a task scheduler written in Python; it also stores test plans and renders reports.
[bike] AutoTest Management System (ATMS) - a Java-based manager of virtual and physical resources
[bike] Custom DSL for setting up a test environment
[bike] Custom Python unittest-based framework for writing tests with XML configs. In places, test logic was embedded in the configs.
[bike] A customized version of TestLink - in theory, detailed test case descriptions and the results of their execution were supposed to live here. In practice, it was used mainly to obtain a unique ID for a scenario (a group of tests)
ESXi virtual machines

It all worked like this

A new build appears.
CC noticed the new build and, according to the test plan stored in it, created new tasks in TestLink.
ATMS found the tasks in TestLink and requested resources for them from the hypervisor. There was no task queue: whoever grabbed a resource first got it.
Having obtained the required set of VMs, ATMS configured the guest OS in them. The configuration recipe was written in the custom DSL.
Then control passed to the Python library, which finished configuring the environment, deployed the build and ran the tests.
Upon completion of the tests, ATMS collected the logs and test results and updated the task status in TestLink.
CC saw the completion of the task in TestLink, retrieved the results, updated its statistics database and sent a report on the test results by email. Later, the Control Center took over the functions of TestLink, and the tasks were created in its internal database, which emulated TestLink for its client, ATMS.

Tests ran for several hours and often gave a random (non-reproducible) result. Analyzing the failures was a whole quest: a visit to ATMS, CC and the network shares with logs, a detailed analysis of the logs, and a search for similar bugs in Jira, all done by hand.

The reported failures sometimes reproduced, but more often did not. For most of the bugs, the developers asked us to clarify the steps, provide a virtual machine or attach forgotten logs.

About once a week, ATMS crashed. If a test hung, or the resources were not released for some other reason, you had to manually delete the virtual machines, remove the task in ATMS and reset the host's busy counter.

Test results on different builds could be compared using static email reports with a Result Type / Build Number chart, or by selecting the results manually in CC. To compare the results of the same test on different operating systems, one had to go through the test logs from each OS by hand.

As a result, developers did not trust the autotests and relied more on running their own tests manually in their own environments. This "mechanization" did not suit me at all, and the situation had to be corrected.

Brave new world

( http://dkrack.wikispaces.com/Brave+New+World )

The architecture of the new autotest system was based on:

Jenkins - task scheduler, resource manager, keeper of test history and detailed results
ESXi virtual machines
Python, pytest - test selection by tags, parameterized runs, execution control and output of the results in junit.xml format (the standard format for Jenkins)
JIRA - test results in the form of bugs, project success metrics

0 (zero) bicycles, that is, no homegrown reinvented wheels.

Task path

After a successful build, the build server (also Jenkins) triggers the project on the test Jenkins, which puts the test in the queue.
The test Jenkins reserves resources (VM linked clones), downloads the latest test code from SVN, runs a CMD script to set up the environment, and calls pytest.
Pytest selects the test cases with its built-in test discovery and starts the test. The framework code runs on the Gate VM, the control machine, while the System Under Test (in our case, the kernel driver) is deployed on the Test VM, so that the results are not lost in case of a BSOD.
- The standard Python logging library writes the info log and the debug log to two separate files:
  a) The info log contains the test steps and meets two requirements: 1) it is in a human-readable format, 2) it has enough information to reproduce the failure.
  b) The debug log includes a timestamp, the location (line number) of the executing code and an extended message. This log lets you trace in detail the events that are not directly related to the essence of the test but affect its result: whether a connection was established, how long a reboot took, and so on.
- The test stops when the first failure is detected (an assert evaluates to False). Pytest writes the result and the trace to the junit xml.
Jenkins (JUnit Plugin) publishes the report and starts the Python script that reports bugs.
The script searches Jira for already known open bugs. If it finds one, it leaves a comment along the lines of "Reproduced in such-and-such place"; if not, it files a new bug. The error message (the pytest assert) goes into the summary, the steps from the info log go into the description, and the test logs and the drivers themselves are attached to the bug.
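
The two-file logging described in the steps above can be sketched with the standard logging module alone. This is a minimal sketch under stated assumptions, not the team's actual code. The logger name, file names, format strings and sample messages are hypothetical, and it assumes that the debug log also receives the info-level steps.

```python
import logging
import os
import tempfile


def make_test_logger(name, log_dir):
    """Create a logger that writes an info log and a debug log."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)   # let every record reach the handlers
    logger.propagate = False         # keep records out of the root logger

    # Info log: bare messages only, readable as a list of test steps.
    info_handler = logging.FileHandler(
        os.path.join(log_dir, "info.log"), encoding="utf-8")
    info_handler.setLevel(logging.INFO)
    info_handler.setFormatter(logging.Formatter("%(message)s"))

    # Debug log: timestamp, module and line number, level, message.
    debug_handler = logging.FileHandler(
        os.path.join(log_dir, "debug.log"), encoding="utf-8")
    debug_handler.setLevel(logging.DEBUG)
    debug_handler.setFormatter(logging.Formatter(
        "%(asctime)s %(module)s:%(lineno)d %(levelname)s %(message)s"))

    logger.addHandler(info_handler)
    logger.addHandler(debug_handler)
    return logger


# Hypothetical usage: test steps at INFO, side events at DEBUG.
log_dir = tempfile.mkdtemp()
log = make_test_logger("kernel_test_demo", log_dir)
log.info("Step 1: install the driver on the Test VM")
log.debug("connection to the Test VM established in 2.1 s")
log.info("Step 2: reboot the Test VM")
```

Because the info handler filters at INFO level and uses a bare message format, info.log can be pasted into a bug description as the reproduction steps, while debug.log keeps the full timeline.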

For clarity, here is the scheme:

The name of the bug is appended as a suffix to the name of the VM, so developers can easily find the machine if necessary. A machine on which an already known bug was reproduced is removed automatically after three days. A machine with a new bug is removed automatically once the developer moves the bug to Resolved status and the corresponding test passes without errors.
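
The reporting and VM retention rules above can be sketched as follows. This is a sketch under stated assumptions, not the actual script. `InMemoryTracker` is a hypothetical stand-in for Jira and does not use the Jira API. Matching a known bug by an identical summary, the `DEMO-` keys, the VM names and the use of the bug key as the VM suffix are all illustrative assumptions. Only the three-day lifetime and the Resolved-plus-passing-test rule come from the text.

```python
from dataclasses import dataclass, field
from datetime import timedelta

KNOWN_BUG_VM_LIFETIME = timedelta(days=3)  # retention stated in the article


@dataclass
class Bug:
    key: str
    summary: str
    description: str = ""
    status: str = "Open"
    comments: list = field(default_factory=list)


class InMemoryTracker:
    """Hypothetical stand-in for a bug tracker (not the Jira API)."""

    def __init__(self):
        self.bugs = []

    def find_open_bug(self, summary):
        # Assumption: a known bug is one with an identical summary.
        for bug in self.bugs:
            if bug.status == "Open" and bug.summary == summary:
                return bug
        return None

    def create_bug(self, summary, description):
        bug = Bug(f"DEMO-{len(self.bugs) + 1}", summary, description)
        self.bugs.append(bug)
        return bug


def report_failure(tracker, assert_message, info_log_steps, vm_name):
    """Comment on a known bug or file a new one; return bug, kind, VM name."""
    bug = tracker.find_open_bug(assert_message)
    if bug is not None:
        bug.comments.append(f"Reproduced on {vm_name}")
        kind = "upd"
    else:
        bug = tracker.create_bug(assert_message, "\n".join(info_log_steps))
        kind = "new"
    return bug, kind, f"{vm_name}-{bug.key}"


def vm_can_be_removed(kind, vm_age, bug_status, test_passes):
    """Apply the retention rules for VMs with known and new bugs."""
    if kind == "upd":
        return vm_age >= KNOWN_BUG_VM_LIFETIME
    return bug_status == "Resolved" and test_passes


# Hypothetical usage: the same failure seen on two machines.
tracker = InMemoryTracker()
message = "assert driver_loaded: driver failed to load"
steps = ["Install the driver", "Reboot the Test VM"]
first = report_failure(tracker, message, steps, "win10-x64")
second = report_failure(tracker, message, steps, "win7-x86")
```

Here the first call files a new bug and the second one only adds a comment to it, so the tracker ends up with a single bug and two renamed machines.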

An example of an automatically filed bug

Previously, the automation engineer had to spend 80-90% of the time manually analyzing test results. Now it is enough to look at the list of bugs in Jira. Product bugs go to the developers, and the automation engineer takes the test failures. If a bug report lacks information, there is no need to teach people to file bugs differently: just change the code.

An example of developer communication with an automatic bug reporter

Test maintenance has been reduced to handling, in code, the types of failures not yet accounted for. Corner cases will always exist. This should be accepted, and there is no point in aiming to eliminate 100% of the failures of the autotests and the test infrastructure. It is enough to turn these failures into specific action items (bugs in Jira, in our case) and fix them one by one.

Product Quality Card

A general overview of the state of the tested components can now be obtained by looking at the Jenkins dashboard:

The dashboard is implemented with the Dashboard View plugin: https://wiki.jenkins-ci.org/display/JENKINS/Dashboard+View .

Not all readers may be familiar with Jenkins, so I will explain what the columns mean:

S (Status) - the result of the last build (in our case, the test);
Name - the name of the test;
W (Weather) - an icon showing the build quality history over the last 5 builds. The sun means that all 5 builds succeeded; a thundercloud means that all 5 failed;
Build Parameters - in our case, a path that corresponds to the code branch and contains the build number;
Last Duration - the execution time of the last build, from the moment the job is placed in the queue until log collection from the last environment is complete and the test report is sent;
Build Description - here the autotest adds the numbers of the bugs it filed automatically in Jira, indicating whether each one is new (new) or already known (upd);
Last Success, Last Failure - how long ago the last successful / unsuccessful build was made.
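
Two of the columns above can be illustrated with a short sketch. It is not Jenkins or plugin code. The text only defines the two extreme weather icons, so every mixed history is labeled "intermediate" here rather than guessing the thresholds. The description format and the `DEMO-` bug keys are hypothetical.

```python
def weather(build_results):
    """Summarize the last 5 builds; True is a pass, newest result last."""
    recent = build_results[-5:]
    if not recent:
        return "no data"
    if all(recent):
        return "sun"
    if not any(recent):
        return "thundercloud"
    return "intermediate"  # mixed history; exact icon not modeled here


def build_description(reported_bugs):
    """Format (bug_key, kind) pairs, where kind is 'new' or 'upd'."""
    return ", ".join(f"{key} ({kind})" for key, kind in reported_bugs)


# Hypothetical histories and bug keys.
all_good = weather([False, True, True, True, True, True])
all_bad = weather([False] * 5)
mixed = weather([True, False, True, True, True])
description = build_description([("DEMO-1", "new"), ("DEMO-7", "upd")])
```

Only the five most recent results are considered, so an old failure outside that window does not affect the icon.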

Results

We had built and debugged the system described above by the end of last fall, and then actively added new test scenarios. In February 2016, I switched to another project full time.

During my absence (six months):

129 valid bugs were found and filed automatically, approximately one new bug every working day.
48 bugs came from other sources.

For six months the project lived and evolved through the efforts of developers alone, without a single tester. The developers added a new component on their own, creating Jenkins projects and Python code by analogy with the existing ones.

There were also quite a few invalid bugs during this time, mostly duplicates caused by the incorrect setup of a new test or by test server failures. However, that is a topic for a separate article.

Source: https://habr.com/ru/post/282682/
