Test it: how do we determine which tests to run on pull requests?

Hi, Habr! My name is Egor Danilenko. I develop the digital platform for the corporate Internet bank Sberbank Business Online, and today I want to tell you about the CI process we have adopted.

How does a developer's change get into the release branch? The developer makes changes locally and pushes them to our version control system. We use Bitbucket with a plugin we wrote ourselves (we wrote about this plugin earlier here). The push triggers a build and a test run (unit, integration, functional). If the build does not fail, all the tests pass and the review is approved, the pull request is merged into the main branch.

But over time the number of teams grew, and the number of tests grew proportionally. We understood that with this many teams we would soon run into the problem of a “slow pull request check”, and it would become impossible to develop the product. At the moment we have about 40 teams. Together with new features they bring new tests, which also have to be run on pull requests.

We thought it would be cool to know which tests to run when a specific piece of code changes.
And here is how we solved this problem.

Formulation of the problem

There is a project with tests, and we want to determine which tests to run when a particular file is changed.

We all know JaCoCo, the code coverage library from EclEmma. We took it as a basis.

A little bit about JaCoCo

JaCoCo is a library for measuring how much of the code is covered by tests. It works by analyzing bytecode. The agent collects execution information and dumps it on request or when the JVM stops.

There are three modes of data collection:

File system: when the JVM stops, the data is written to a file.
TCP Socket Server: external tools can connect to the JVM and get the data through a socket.
TCP Socket Client: at startup, the JaCoCo agent connects to a given TCP endpoint.

We chose the second option.

Solution

We need to be able to run both the applications and the tests themselves with the JaCoCo agent.

First of all, we add to Gradle the ability to run tests with the JaCoCo agent.

The Java agent is attached with a JVM argument of the following form:

-javaagent:[yourpath/]jacocoagent.jar=[option1]=[value1],[option2]=[value2]

Add a dependency to our project:

    dependencies {
        compile 'org.jacoco:org.jacoco.agent:0.8.0'
    }

We only need to run with the agent when collecting statistics, so we add a withJacoco flag, off by default, to gradle.properties. There we also specify the directory where the statistics are collected, and the agent's address and port.

Add the formation of the jvm argument with the agent to the test run:

    if (withJacoco.toBoolean()) {
        …
        jvmArgs "-javaagent:${tempPath}=${jacocoArgs.join(',')}".toString()
    }
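To make the shape of the resulting argument concrete, here is a minimal Java sketch of how such a string could be assembled for the TCP Socket Server mode. The class and method names are hypothetical and are not part of the project described here; `output`, `address` and `port` are standard JaCoCo agent options, while the jar path, host and port in the usage note are example values only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of the project described in the article. */
final class JacocoAgentArgs {

    private JacocoAgentArgs() {
    }

    /**
     * Builds "-javaagent:<agentJar>=<k1>=<v1>,<k2>=<v2>".
     * Options keep their insertion order when a LinkedHashMap is passed.
     */
    static String build(String agentJarPath, Map<String, String> options) {
        String joined = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
        // With no options the "=..." suffix is omitted entirely.
        return joined.isEmpty()
                ? "-javaagent:" + agentJarPath
                : "-javaagent:" + agentJarPath + "=" + joined;
    }

    /** Options for the TCP Socket Server mode; address and port are supplied by the caller. */
    static Map<String, String> tcpServerOptions(String address, int port) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("output", "tcpserver");
        options.put("address", address);
        options.put("port", Integer.toString(port));
        return options;
    }
}
```

For example, `JacocoAgentArgs.build("build/jacoco/jacocoagent.jar", JacocoAgentArgs.tcpServerOptions("localhost", 6300))` yields `-javaagent:build/jacoco/jacocoagent.jar=output=tcpserver,address=localhost,port=6300`.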

Now we need to collect the JaCoCo statistics after each successfully completed test. To do this, we write a TestNG listener.

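    import org.testng.ITestListener;
    import org.testng.ITestResult;

    // Overriding only two callbacks relies on the default methods that
    // ITestListener has in TestNG 7; with older versions, extend TestListenerAdapter instead.
    public class JacocoCoverageTestNGListener implements ITestListener {

        private static final IntegrationTestsCoverageReporter reporter = new IntegrationTestsCoverageReporter();
        private static final String TEST_NAME_PATTERN = "%s.%s";

        // Reset the coverage data before the test so that the next dump covers this test only.
        @Override
        public void onTestStart(ITestResult result) {
            reporter.resetCoverageDumpers(String.format(TEST_NAME_PATTERN,
                    result.getInstanceName(), result.getMethod().getMethodName()));
        }

        // Statistics are saved only for tests that passed.
        @Override
        public void onTestSuccess(ITestResult result) {
            reporter.report(String.format(TEST_NAME_PATTERN,
                    result.getInstanceName(), result.getMethod().getMethodName()));
        }
    }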

We add the listener to testng.xml and comment it out, since it is not needed for a regular test run.

Now we can run tests with the JaCoCo agent, and statistics are collected after every successful test.

A little more detail on how the reporter that collects the statistics is implemented.
When the reporter is initialized, it connects to the agents, creates the directory where the statistics will be stored, and sets up the statistics collection itself.

Add a report method:

    public void report(String test) {
        reportClassFiles(test);
        reportResources(test);
    }

The reportClassFiles method creates a jvm folder in the statistics directory, which stores the statistics collected from class files.

The reportResources method creates a resources folder, which stores collected statistics on resources (for all non-class files).

The reporter contains all the logic for connecting to the agent, reading data from the socket and writing it to a file. It is implemented with the tools that JaCoCo provides, such as org.jacoco.core.runtime.RemoteControlReader / RemoteControlWriter.

The reportClassFiles and reportResources functions use the common function dumpToFile.

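    // Needs java.io.*, org.jacoco.core.data.ExecutionData, org.jacoco.core.data.IExecutionDataVisitor
    // and org.jacoco.core.runtime.RemoteControlReader.
    public void dumpToFile(File file) throws IOException {
        try (Writer fileWriter = new BufferedWriter(new FileWriter(file))) {
            for (RemoteControlReader remoteControlReader : remoteControlReaders) {
                // Write the name of every class that has at least one executed probe.
                remoteControlReader.setExecutionDataVisitor(new IExecutionDataVisitor() {
                    @Override
                    public void visitClassExecution(ExecutionData data) {
                        if (data.hasHits()) {
                            String name = data.getName();
                            try {
                                fileWriter.write(name);
                                fileWriter.write('\n');
                            } catch (IOException e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                });
                // Not shown in this snippet: the visitor is only called once a dump
                // has been requested from the agent and read from the socket.
            }
        }
    }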

The result is a file listing the classes / resources that the given test touches.

So, after running all the tests, we have a directory with statistics on class files and resources.
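As an illustration of what such a directory can be turned into, here is a minimal Java sketch that loads it into a "test name to covered entries" map. It assumes one file per test, named after the test, with one class or resource name per line. The file naming and the class name `CoverageDirectoryReaderSketch` are assumptions made for the example, not details taken from our reporter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical sketch; file layout (one file per test, one entry per line) is assumed. */
final class CoverageDirectoryReaderSketch {

    private CoverageDirectoryReaderSketch() {
    }

    /** Reads every regular file in the directory into "file name -> set of non-empty lines". */
    static Map<String, Set<String>> read(Path statisticsDir) {
        Map<String, Set<String>> coveredByTest = new TreeMap<>();
        try (Stream<Path> files = Files.list(statisticsDir)) {
            files.filter(Files::isRegularFile).forEach(file -> {
                Set<String> entries = new TreeSet<>();
                try {
                    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                        String entry = line.trim();
                        if (!entry.isEmpty()) {
                            entries.add(entry);
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                // The file name is taken as the test name (an assumption of this sketch).
                coveredByTest.put(file.getFileName().toString(), entries);
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return coveredByTest;
    }
}
```

Sorted collections are used only to keep the output deterministic, which makes diffs between nightly runs easier to read.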

What remains is to write a pipeline for the daily collection of statistics and to add the pull request checks to the pipeline.

The project build stages are not interesting here, but let's take a closer look at the stage that publishes the statistics.

    stage('Agregate and parse result') {
        def inverterInJenkins = downloadMavenDependency(
            url: NEXUS_RELEASE_REPOSITORY,
            group: '',
            name: 'coverage-inverter',
            version: '0',
            type: 'jar',
            mavenHome: wsp
        )
        dir('coverage-mapping') {
            gitFullCheckoutRef '', '', 'coverage-mapping', "refs/heads/${params.targetBranch}-integration-tests"
            sh 'rm -rf *'
        }
        sh "ls -lRa ..//out/coverage/"
        def inverter = wsp + inverterInJenkins.substring(wsp.length())
        sh "java -jar ${inverter} " +
            "-d ..//out/coverage/jvm " +
            "-o coverage-mapping//jvm " +
            "-i coverage-config/jvm-include " +
            "-e coverage-config/jvm-exclude"
        sh "java -jar ${inverter} " +
            "-d ..//out/coverage/resources " +
            "-o coverage-mapping//resources " +
            "-i coverage-config/resources-include " +
            "-e coverage-config/resources-exclude"
        gitPush '', '', 'coverage-mapping', "${params.targetBranch}-integration-tests"
    }

In coverage-mapping we need to store, for each file name, the list of tests that have to be run. Since the collected statistics map each test name to the set of classes and resources it touches, we have to invert the whole thing and exclude unnecessary data (classes from third-party libraries).

We invert our statistics and push the result to our repository.

Statistics are collected every night. They are stored in a separate repository, for each release branch.
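The inversion itself is a small transformation. The following Java sketch shows the idea under simple assumptions: the statistics are already loaded as a map from test name to covered entries, and the include / exclude rules are reduced to a single predicate. The class `CoverageInverterSketch` is hypothetical and is not the coverage-inverter tool used in the pipeline.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical sketch of the inversion step; not the real coverage-inverter. */
final class CoverageInverterSketch {

    private CoverageInverterSketch() {
    }

    /**
     * Inverts "test -> covered entries" into "entry -> tests".
     * Entries rejected by the filter (for example third-party classes) are dropped.
     */
    static SortedMap<String, SortedSet<String>> invert(
            Map<String, Set<String>> coveredByTest,
            Predicate<String> keepEntry) {
        SortedMap<String, SortedSet<String>> testsByEntry = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : coveredByTest.entrySet()) {
            String test = e.getKey();
            for (String entry : e.getValue()) {
                if (keepEntry.test(entry)) {
                    // Create the test set on first sight of the entry, then add the test.
                    testsByEntry.computeIfAbsent(entry, k -> new TreeSet<>()).add(test);
                }
            }
        }
        return testsByEntry;
    }
}
```

A call such as `CoverageInverterSketch.invert(stats, name -> name.startsWith("com/example/"))` would keep only the project's own classes; the `com/example/` prefix is an example value.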

Bingo!

Now, when tests have to be run for a pull request, all that remains is to find the modified files and look up the tests that need to be run for them.
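A minimal Java sketch of that lookup is shown below. It assumes the mapping is already loaded into memory and that its keys have the same form as the names of the changed files. The fallback of running the full suite when a changed file has no mapping is an assumed policy for the example, and `TestSelectionSketch` is a hypothetical name.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch; the actual selection code is not shown in the article. */
final class TestSelectionSketch {

    private TestSelectionSketch() {
    }

    /**
     * Returns the union of the tests mapped to the changed files.
     * An empty Optional means that at least one changed file has no mapping,
     * so the caller should fall back to the full suite (an assumed policy).
     */
    static Optional<SortedSet<String>> selectTests(
            Map<String, ? extends Set<String>> testsByFile,
            Collection<String> changedFiles) {
        SortedSet<String> selected = new TreeSet<>();
        for (String file : changedFiles) {
            Set<String> tests = testsByFile.get(file);
            if (tests == null) {
                return Optional.empty();
            }
            selected.addAll(tests);
        }
        return Optional.of(selected);
    }
}
```

Returning an empty `Optional` rather than an empty set keeps "nothing to run" and "unknown file, run everything" apart, which matters for files that were added after the last statistics collection.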

Problems we encountered:

Since JaCoCo works only with bytecode, it cannot collect statistics for files such as .xml, .gradle and .sql out of the box. We therefore had to bolt on our own solutions.
The relevance of the statistics and the regularity of the nightly build have to be monitored constantly: if the nightly build fails for some reason, yesterday's statistics are used for the pull request checks.

Source: https://habr.com/ru/post/430270/
