
Takari: Maven on steroids

Background


I have been working for quite a long time on a project that uses the Maven build system. At first, while the project was still small, a full build took a reasonable amount of time and nobody complained. But over time the code base grew, the number of subprojects increased dramatically, and the average full build time rose to 6-10 minutes, which became a constant source of complaints from the developers.

It should also be noted that we did not use parallel builds, since they regularly caused various problems: sometimes artifacts in the local repository got corrupted, sometimes modules were simply built in the wrong order and old, not yet recompiled code ended up in the final WAR artifact. Of course, some developers used parallel builds at their own risk, but sooner or later they ended up in a situation where they could not figure out what was going on, and a plain single-threaded rebuild fixed it immediately.

This went on for quite a long time, until I came across the rather curious website of Takari, which offers ways to improve how you work with Maven.

Three things there are the most interesting: the Concurrent Safe Local Repository, the Takari Smart Builder, and the Takari Lifecycle.

They also have a Maven Wrapper on their GitHub (an analog of the Gradle wrapper).

Looking ahead, I will note that the tools described here not only fix Maven's incorrect behavior but also speed up the build considerably.

Concurrent Safe Local Repository


This extension is intended to solve the problem of corrupted artifacts in the local repository.

The point is that Maven's access to the local repository (really just a directory on the file system) is not thread-safe. That is, if projects being built in parallel start downloading the same dependency at the same time, the result is a corrupted file. This is exactly the problem this extension solves.

To use it, you have to modify the Maven installation itself:

 curl -O https://repo1.maven.org/maven2/io/takari/aether/takari-local-repository/0.10.4/takari-local-repository-0.10.4.jar
 mv takari-local-repository-0.10.4.jar $M2_HOME/lib/ext
 curl -O https://repo1.maven.org/maven2/io/takari/takari-filemanager/0.8.2/takari-filemanager-0.8.2.jar
 mv takari-filemanager-0.8.2.jar $M2_HOME/lib/ext

That's it; no further action is required. All operations on the local repository are now safe. On its own, this extension is really only useful on CI servers, where several builds run at the same time and you want them to share one repository to save space. For an ordinary developer it is more interesting in combination with Smart Builder, which assumes that this extension is already installed.

In our experience, the build runs a little slower with this extension, but more reliably.

Takari Smart Builder


This extension is installed in the same way as the previous one:

 curl -O https://repo1.maven.org/maven2/io/takari/maven/takari-smart-builder/0.4.0/takari-smart-builder-0.4.0.jar
 mv takari-smart-builder-0.4.0.jar $M2_HOME/lib/ext

It provides a more advanced algorithm for parallelizing the build of Maven projects. The standard Maven build scheduler and Smart Builder differ as follows.

Maven's standard parallelization strategy is simple and naive. It is based on dependency depth: Maven builds all projects of one level in parallel until none are left, and only then moves on to the next level.

Takari Smart Builder, in turn, uses a more advanced strategy. It computes the dependency chains, performs a topological sort, and only then decides the order in which the projects should be built.

Moreover, during the build it records each project's build time in the .mvn/timing.properties file and uses this as additional information to finish the next build as quickly as possible.
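The topological-sorting step can be illustrated with the POSIX tsort utility. This is only a rough sketch of the idea, not Smart Builder's actual code: the module names (core, api, util, web) are hypothetical, and tsort merely produces one valid build order without modelling threads or recorded build times.

```shell
#!/bin/sh
# Each input line is "dependency dependent": the left module must be built
# before the right one. The module names are hypothetical.
build_order() {
    tsort <<'EOF'
core api
core util
api web
util web
EOF
}

# Print one valid build order, one module per line.
build_order
```

In this graph api and util depend only on core, so a dependency-driven scheduler may start both as soon as core is finished, and web as soon as both of them are done.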

To use Smart Builder, you need to pass an additional option when starting Maven. For example:

 mvn clean install --builder smart -T1.0C 

Everything gets easier with Maven 3.3.1


Maven 3.3.1 introduced several innovations. First and foremost, Maven core extensions can now be declared right in the project. To do this, add the file .mvn/extensions.xml. For the extensions described above, this file may look like this:

 <?xml version="1.0" encoding="UTF-8"?>
 <extensions>
   <extension>
     <groupId>io.takari.maven</groupId>
     <artifactId>takari-smart-builder</artifactId>
     <version>0.4.1</version>
   </extension>
   <extension>
     <groupId>io.takari.aether</groupId>
     <artifactId>takari-local-repository</artifactId>
     <version>0.11.2</version>
   </extension>
 </extensions>

Now we don't need to add libraries directly to the Maven distribution, and we get the same result.

The extensions.xml file is not the only file that can live in the .mvn directory. There may be two more: jvm.config and maven.config.

jvm.config contains the JVM options for running the build of the current project. For example, this file might look like this:

 -Xmx2g
 -XX:+TieredCompilation
 -XX:TieredStopAtLevel=1

The first option sets the maximum heap size to 2 GB, and the next two tune the JVM for Maven's needs.

maven.config is another file with parameters, but this time for Maven itself. For example:

 --builder smart
 -T1.0C
 -e

Thus, we can make the smart builder the default, with the number of threads equal to the number of logical cores. That is, if we just run

 mvn clean install 

the build will run in multiple threads with all the extensions and optimizations. Moreover, even if we build a nested module, these settings still apply, since Maven looks for the .mvn directory not only in the current directory but also in the parent directories.

There is one nuance here, however. Since the build runs in several threads, they write to the build log concurrently. As a result, when problems arise it is not always clear what is happening, because the lines are interleaved. In that case, to run the build in one thread and find the cause of the trouble, you have to switch it to single-threaded mode manually:
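A rough sketch of that lookup as a shell function. This is not Maven's actual implementation, and find_mvn_basedir is a hypothetical helper name: it walks up from a starting directory until it finds a directory containing .mvn, and falls back to the starting directory otherwise.

```shell
#!/bin/sh
# Hypothetical helper: print the nearest directory, starting from $1 (default:
# the current directory) and moving upwards, that contains a ".mvn" folder.
# If no such directory exists, print the starting directory itself.
find_mvn_basedir() {
    fmb_start=$(CDPATH= cd "${1:-.}" && pwd) || return 1
    fmb_dir=$fmb_start
    while :; do
        if [ -d "$fmb_dir/.mvn" ]; then
            printf '%s\n' "$fmb_dir"
            return 0
        fi
        fmb_parent=$(dirname "$fmb_dir")
        # dirname of the filesystem root is the root itself: nothing left to check.
        if [ "$fmb_parent" = "$fmb_dir" ]; then
            break
        fi
        fmb_dir=$fmb_parent
    done
    printf '%s\n' "$fmb_start"
}

# Show which directory would supply the .mvn settings for the current location.
find_mvn_basedir "$PWD"
```

Run from a nested module, the function prints the project root that holds .mvn, which is why the options from maven.config and jvm.config apply there as well.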

 mvn -T1 clean install 

Takari Lifecycle

The Takari Lifecycle is an alternative to Maven's default lifecycle (building JAR files). Its distinctive feature is that instead of five separate plugins for the standard lifecycle it uses a single universal plugin with the same functionality but far fewer dependencies. The result is a much faster startup, more efficient operation, and lower resource consumption, which gives a significant performance boost when building complex projects with many modules.

To activate the new lifecycle, add takari-lifecycle-plugin as a build extension:

 <build>
   <plugins>
     <plugin>
       <groupId>io.takari.maven.plugins</groupId>
       <artifactId>takari-lifecycle-plugin</artifactId>
       <extensions>true</extensions>
     </plugin>
   </plugins>
 </build>

Then change the packaging of the JAR modules to takari-jar:

 <project>
   <modelVersion>4.0.0</modelVersion>
   <groupId>io.takari.lifecycle.its.basic</groupId>
   <artifactId>basic</artifactId>
   <version>1.0</version>
   <packaging>takari-jar</packaging>
   <!-- the rest of the POM is unchanged -->
 </project>

After that, all projects with pom packaging, as well as takari-jar projects, will be built using the new lifecycle.

You can also enable this lifecycle for all JAR modules (see the documentation), but in our case this led to conflicts with various Maven plugins. As a result, we decided to change the packaging only in modules where this could be done without affecting the build. In practice, this turned out to be more than enough.

It should also be noted that with the takari-lifecycle-plugin extension the location of various build settings changes: they move to the configuration section of this plugin. For example:

 <plugins>
   <plugin>
     <groupId>io.takari.maven.plugins</groupId>
     <artifactId>takari-lifecycle-plugin</artifactId>
     <configuration>
       <source>1.8</source>
       <target>1.8</target>
     </configuration>
   </plugin>
 </plugins>

More information can be found in the documentation.

Takari Maven Wrapper


Takari has another nice thing: Maven Wrapper. By analogy with the Gradle Wrapper, it lets you build a project right after cloning it, without installing and configuring Maven on your machine. In addition, it lets you pin the Maven version the project requires.

The easiest way to add the wrapper to your project is to run the wrapper plugin goal in the project root:

 mvn -N io.takari:maven:wrapper 

After that, two scripts will appear in the current directory: mvnw and mvnw.bat.

The wrapper itself and its configuration file will also appear in the .mvn/wrapper directory.

That's it. After that you can run:

 ./mvnw clean install 

And if you need a different version of Maven, you can set the required URL in the configuration file .mvn/wrapper/maven-wrapper.properties.

This is not without nuances either. Organizations with a closed network often use proxying Maven repositories such as Nexus or Artifactory. In that case every developer has to configure a Maven mirror for that repository individually, which somewhat contradicts the wrapper's philosophy of requiring no setup.

The way out is as follows: create a .mvn/settings.xml file in the project of the form
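A small sketch of how that URL could be inspected and changed from the shell. It assumes the file stores the URL in a distributionUrl property; the two function names and the example URLs are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Print the Maven distribution URL configured for the wrapper.
# $1: path to the properties file (defaults to the wrapper's usual location).
wrapper_distribution_url() {
    wdu_props=${1:-.mvn/wrapper/maven-wrapper.properties}
    sed -n 's/^distributionUrl=//p' "$wdu_props"
}

# Point the wrapper at a different Maven distribution.
# $1: new URL, $2: path to the properties file (same default as above).
set_wrapper_distribution_url() {
    swd_url=$1
    swd_props=${2:-.mvn/wrapper/maven-wrapper.properties}
    swd_tmp=$(mktemp) || return 1
    # Rewrite only the distributionUrl line; every other line is copied as is.
    awk -v url="$swd_url" '
        /^distributionUrl=/ { print "distributionUrl=" url; next }
        { print }
    ' "$swd_props" > "$swd_tmp" && cat "$swd_tmp" > "$swd_props"
    swd_status=$?
    rm -f "$swd_tmp"
    return "$swd_status"
}
```

For example, calling set_wrapper_distribution_url with the URL of a Maven binary zip on an internal mirror, followed by wrapper_distribution_url, should print the new URL; the next ./mvnw run then downloads that distribution.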

 <?xml version="1.0" encoding="UTF-8"?>
 <settings>
   <mirrors>
     <mirror>
       <id>nexus-m2</id>
       <mirrorOf>*</mirrorOf>
       <url>http://repo.org.ru/nexus/content/groups/repo-all-m2</url>
       <name>Nexus M2</name>
     </mirror>
   </mirrors>
 </settings>


and add a line to the .mvn/maven.config file

 --global-settings .mvn/settings.xml

As a result, the mirror will be picked up automatically.

Testing and results


None of the above would make sense if it did not noticeably speed up the project build. So as not to make unfounded claims, here are the results obtained on one of our work projects.

So we have:

Since “vanilla” Maven had problems with multi-threaded builds, we (almost) always used a single thread, which resulted in a build time of 5:32 (5 minutes 32 seconds) or more. After all the optimizations (parallel build + Takari Lifecycle) the build time dropped to 1:33, which is about 3.6 times faster.

All intermediate results are tabulated below:

Build mode                  Threads  Time
default                     1        5:32
default                     4        3:25
smart build                 4        3:18
smart build + takari-jar    1        3:23
smart build + takari-jar    4        1:33

Smart Builder was run twice and the second result was recorded, since after the first run the build order may be optimized (see the documentation).

Curiously, adding the Takari Lifecycle in single-threaded mode gives about the same performance boost as building in 4 threads on “vanilla” Maven.

Conclusion
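To put the measurements in relative terms, the speed-up of each configuration over the single-threaded default build (5:32) can be computed from the times above. The speedup helper below is a hypothetical name, written only for this calculation.

```shell
#!/bin/sh
# Speed-up factor between two build times given as m:ss (baseline first).
speedup() {
    LC_ALL=C awk -v before="$1" -v after="$2" 'BEGIN {
        split(before, b, ":"); split(after, a, ":")
        printf "%.2f\n", (b[1] * 60 + b[2]) / (a[1] * 60 + a[2])
    }'
}

# Baseline: default single-threaded build, 5:32.
for t in 3:25 3:18 3:23 1:33; do
    printf '%s -> %sx\n' "$t" "$(speedup 5:32 "$t")"
done
```

This gives factors of 1.62, 1.68, 1.64 and 3.57 respectively, so the two single-optimization setups are nearly equal and the combination is more than twice as fast as either.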



I discovered the tools described in this article only recently, so my experience with them is still very modest. Perhaps some pitfalls will surface over time. But in any case, such a radical speed-up was enough for us to take the risk of using them in our process. Time will tell.

I also want to note that Takari's GitHub repositories contain several more interesting projects. Describing them is beyond the scope of this article, but perhaps someone will find something else of interest there.

UPD


As already noted in the comments, feedback from developers has started to come in. It turned out that the mvnw.bat file does not work as intended. A quick fix was made that brought it into working order:

Corrected script
 @REM ----------------------------------------------------------------------------
 @REM Licensed to the Apache Software Foundation (ASF) under one
 @REM or more contributor license agreements. See the NOTICE file
 @REM distributed with this work for additional information
 @REM regarding copyright ownership. The ASF licenses this file
 @REM to you under the Apache License, Version 2.0 (the
 @REM "License"); you may not use this file except in compliance
 @REM with the License. You may obtain a copy of the License at
 @REM
 @REM http://www.apache.org/licenses/LICENSE-2.0
 @REM
 @REM Unless required by applicable law or agreed to in writing,
 @REM software distributed under the License is distributed on an
 @REM "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 @REM KIND, either express or implied. See the License for the
 @REM specific language governing permissions and limitations
 @REM under the License.
 @REM ----------------------------------------------------------------------------

 @REM ----------------------------------------------------------------------------
 @REM Maven2 Start Up Batch script
 @REM
 @REM Required ENV vars:
 @REM JAVA_HOME - location of a JDK home dir
 @REM
 @REM Optional ENV vars
 @REM M2_HOME - location of maven2's installed home dir
 @REM MAVEN_BATCH_ECHO - set to 'on' to enable the echoing of the batch commands
 @REM MAVEN_BATCH_PAUSE - set to 'on' to wait for a key stroke before ending
 @REM MAVEN_OPTS - parameters passed to the Java VM when running Maven
 @REM eg to debug Maven itself, use
 @REM set MAVEN_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
 @REM MAVEN_SKIP_RC - flag to disable loading of mavenrc files
 @REM ----------------------------------------------------------------------------

 @REM Begin all REM lines with '@' in case MAVEN_BATCH_ECHO is 'on'
 @echo off
 @REM enable echoing my setting MAVEN_BATCH_ECHO to 'on'
 @if "%MAVEN_BATCH_ECHO%" == "on" echo %MAVEN_BATCH_ECHO%

 @REM set %HOME% to equivalent of $HOME
 if "%HOME%" == "" (set "HOME=%HOMEDRIVE%%HOMEPATH%")

 @REM Execute a user defined script before this one
 if not "%MAVEN_SKIP_RC%" == "" goto skipRcPre
 @REM check for pre script, once with legacy .bat ending and once with .cmd ending
 if exist "%HOME%\mavenrc_pre.bat" call "%HOME%\mavenrc_pre.bat"
 if exist "%HOME%\mavenrc_pre.cmd" call "%HOME%\mavenrc_pre.cmd"
 :skipRcPre

 @setlocal

 set ERROR_CODE=0

 @REM To isolate internal variables from possible post scripts, we use another setlocal
 @setlocal

 @REM ==== START VALIDATION ====
 if not "%JAVA_HOME%" == "" goto OkJHome

 echo.
 echo Error: JAVA_HOME not found in your environment. >&2
 echo Please set the JAVA_HOME variable in your environment to match the >&2
 echo location of your Java installation. >&2
 echo.
 goto error

 :OkJHome
 if exist "%JAVA_HOME%\bin\java.exe" goto chkMHome

 echo.
 echo Error: JAVA_HOME is set to an invalid directory. >&2
 echo JAVA_HOME = "%JAVA_HOME%" >&2
 echo Please set the JAVA_HOME variable in your environment to match the >&2
 echo location of your Java installation. >&2
 echo.
 goto error

 :chkMHome
 if not "%M2_HOME%"=="" goto valMHome

 SET "M2_HOME=%~dp0.."
 if not "%M2_HOME%"=="" goto valMHome

 echo.
 echo Error: M2_HOME not found in your environment. >&2
 echo Please set the M2_HOME variable in your environment to match the >&2
 echo location of the Maven installation. >&2
 echo.
 goto error

 :valMHome

 :stripMHome
 if not "_%M2_HOME:~-1%"=="_\" goto checkMCmd
 set "M2_HOME=%M2_HOME:~0,-1%"
 goto stripMHome

 :checkMCmd
 @rem if exist "%M2_HOME%\bin\mvn.cmd"
 goto init

 echo.
 echo Error: M2_HOME is set to an invalid directory. >&2
 echo M2_HOME = "%M2_HOME%" >&2
 echo Please set the M2_HOME variable in your environment to match the >&2
 echo location of the Maven installation >&2
 echo.
 goto error
 @REM ==== END VALIDATION ====

 :init

 set MAVEN_CMD_LINE_ARGS=%*

 @REM Find the project base dir, ie the directory that contains the folder ".mvn".
 @REM Fallback to current working directory if not found.

 set MAVEN_PROJECTBASEDIR=%MAVEN_BASEDIR%
 IF NOT "%MAVEN_PROJECTBASEDIR%"=="" goto endDetectBaseDir

 set EXEC_DIR=%CD%
 set WDIR=%EXEC_DIR%
 :findBaseDir
 IF EXIST "%WDIR%"\.mvn goto baseDirFound
 cd ..
 IF "%WDIR%"=="%CD%" goto baseDirNotFound
 set WDIR=%CD%
 goto findBaseDir

 :baseDirFound
 set MAVEN_PROJECTBASEDIR=%WDIR%
 cd "%EXEC_DIR%"
 goto endDetectBaseDir

 :baseDirNotFound
 set MAVEN_PROJECTBASEDIR=%EXEC_DIR%
 cd "%EXEC_DIR%"

 :endDetectBaseDir

 IF NOT EXIST "%MAVEN_PROJECTBASEDIR%\.mvn\jvm.config" goto endReadAdditionalConfig

 @setlocal EnableExtensions EnableDelayedExpansion
 for /F "usebackq delims=" %%a in ("%MAVEN_PROJECTBASEDIR%\.mvn\jvm.config") do set JVM_CONFIG_MAVEN_PROPS=!JVM_CONFIG_MAVEN_PROPS! %%a
 @endlocal & set JVM_CONFIG_MAVEN_PROPS=%JVM_CONFIG_MAVEN_PROPS%

 :endReadAdditionalConfig

 SET MAVEN_JAVA_EXE="%JAVA_HOME%\bin\java.exe"

 @rem for %%i in ("%M2_HOME%"\boot\plexus-classworlds-*) do set CLASSWORLDS_JAR="%%i"
 set WRAPPER_JAR="".\.mvn\wrapper\maven-wrapper.jar""
 set WRAPPER_LAUNCHER=org.apache.maven.wrapper.MavenWrapperMain

 %MAVEN_JAVA_EXE% %JVM_CONFIG_MAVEN_PROPS% %MAVEN_OPTS% %MAVEN_DEBUG_OPTS% -classpath %WRAPPER_JAR% "-Dmaven.home=%M2_HOME%" "-Dmaven.multiModuleProjectDirectory=%MAVEN_PROJECTBASEDIR%" %WRAPPER_LAUNCHER% %MAVEN_CMD_LINE_ARGS%
 if ERRORLEVEL 1 goto error
 goto end

 :error
 set ERROR_CODE=1

 :end
 @endlocal & set ERROR_CODE=%ERROR_CODE%

 if not "%MAVEN_SKIP_RC%" == "" goto skipRcPost
 @REM check for post script, once with legacy .bat ending and once with .cmd ending
 if exist "%HOME%\mavenrc_post.bat" call "%HOME%\mavenrc_post.bat"
 if exist "%HOME%\mavenrc_post.cmd" call "%HOME%\mavenrc_post.cmd"
 :skipRcPost

 @REM pause the script if MAVEN_BATCH_PAUSE is set to 'on'
 if "%MAVEN_BATCH_PAUSE%" == "on" pause

 if "%MAVEN_TERMINATE_CMD%" == "on" exit %ERROR_CODE%

 exit /B %ERROR_CODE%


It also turned out that the overall build under Windows is much slower than under Linux. The reason is not yet clear.

UPD2


Another subtle point has surfaced: the SonarQube build conflicts with Smart Builder. Since the --builder smart option is now enabled by default, it is not enough to run

 mvn sonar:sonar 

You also need to switch back to the standard build strategy:

 mvn --builder multithreaded sonar:sonar 

or

 mvn --builder singlethreaded sonar:sonar 

depending on the situation.

Source: https://habr.com/ru/post/266011/

