Continuous Integration: The Case of Hudson

We all understand that testing is an integral part of the software development life cycle. The more often we test our code, the sooner we detect an error that has crept in during development, and the sooner it gets fixed. It is also highly desirable to test in an environment as close as possible to production (OS, software, hardware, load), so that we catch errors that do not show up on the development server but may show up in production. Combining these two ideas gives us the concept called Continuous Integration.

The essence of CI is to build and test the software continuously (for example, after each commit) in an environment as close as possible to production, in order to detect errors as early as possible and notify the developers about them. The idea of CI was popularized by Martin Fowler, who described it in detail in his article.

Ready-made solutions exist for automating continuous builds (Hudson, CruiseControl). In this article I will describe the integration of one of them, Hudson.

Task

So, let's say we have two projects: a Java service with its own database, and a PHP client for it, also with its own database. Both projects are distributed as deb packages. We need to set up the infrastructure for continuous integration of these projects.

Implementation

To get an idea of what we want to achieve, let us start from the end and consider the scheme we want to implement:

programmer's work machine - writing code,
SVN server - code storage,
Staging server - installation and testing of built projects,
Selenium server - testing the web interface,
Repo server - storage of built packages,
CI server - ties all the nodes of the system together.

The developer makes changes to the project on his machine and commits them to SVN. On the SVN server, a post-commit hook fires and starts the build of the corresponding project on the CI server. The CI server updates its working copy from SVN, compiles the project, runs the unit tests, and uploads the project to the staging server.

For projects without a web interface, integration tests are run; for projects with a web interface, Selenium tests are run. The CI server generates reports and, if any stage of the build fails, sends an email notification to the user.

Publishing project packages to the repository for the production servers is done manually by the developer at release time.

Hudson

The main and most interesting node in our system is the CI server. In our case it will be Hudson, one of the most popular and widespread free solutions.
First, install it. Hudson is available as a package, so installation is fairly simple. In addition, Hudson stores all of its configuration in files (/var/lib/hudson), so it does not need a database.

Hudson has a plugin-based architecture. In essence, Hudson's job comes down to storing project and plugin settings and building the project. A project build, in turn, consists of running, in a certain order, the installed plugins that are enabled in the project settings.
Plugins can be divided into several loose groups that form the project build cycle, also referred to as the "pipeline" (plugins are configured per project through the "Project Settings" menu):

source code management (getting / updating the project code from the repository),
build triggers (deciding when a project build starts),
build environment (setting up the project build environment, for example the JVM version),
build (the main stage: running the plugins that implement the build, integration and testing logic),
post-build operations (generating / publishing reports, notification).

Unfortunately, Hudson only lets you change the execution order of plugins in the build group (within the other groups, the order is determined by the values of the @Execution annotation in the plugin code). Therefore, if you need to implement a build scenario of your own for which the standard plugins of the build group are not enough, there are three ways to go:

call an external executable script that implements the scenario (the "Execute Shell" item in the "Add build step" menu),
connect the plugin for a build system (Phing, Ant, Maven) and specify the required target,
write your own plugin.

By default, Hudson ships with plugins for SVN and Maven already installed. That would be quite enough if we only had Java projects. However, we also need to work with PHP projects. For those it makes more sense to build with Phing, whose plugin has to be installed separately. This is done in the "Settings / Managing Plugins / Available Updates" section.

Please note that some plugins require running Hudson under Java 6. You can change the path to the JVM (like other startup options) in the /etc/default/hudson file. All other configuration parameters relating to Hudson itself can be edited in the browser through the web interface.

Regarding plugin settings, it is also worth mentioning that a plugin has both global settings ("Setup / System Configuration") and per-project settings ("Settings / Project Name / Set Up Project").

Now that all the necessary plugins are installed, we can create a new project (job), specifying its name and the settings for the corresponding plugins: the SVN repository URL and the build command.

Note that a build does not have to be started on a schedule or by polling the repository for changes; it can be started directly by a commit to SVN. Hudson has a "Remote Access API" that, among other things, lets you start a project build with a GET request, so you can easily add a suitable post-commit hook (for example, using svnlook) to your project.
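As an illustration, a post-commit hook along these lines could start the build. This is a minimal sketch in Python, not the hook from the setup described here: the Hudson address, the job names and the repository path prefixes are assumptions made up for the example, and the /job/<name>/build address should be checked against the Remote Access API page of your own Hudson installation.

```python
#!/usr/bin/env python3
"""Sketch of an SVN post-commit hook that asks Hudson to build changed projects."""
import subprocess
import sys
import urllib.parse
import urllib.request

# Assumptions: the Hudson base URL and the mapping of repository path
# prefixes to Hudson job names are placeholders for this example.
HUDSON_URL = "http://ci.example.com:8080"
JOBS_BY_PREFIX = {
    "service/trunk/": "java-service",
    "client/trunk/": "php-client",
}


def changed_paths(repos, rev):
    """Return the paths touched by revision `rev`, as reported by `svnlook changed`."""
    out = subprocess.run(
        ["svnlook", "changed", "-r", str(rev), repos],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each line is a status column followed by the path, e.g. "U   client/trunk/a.php".
    return [line.split(None, 1)[1] for line in out.splitlines() if line.strip()]


def jobs_for(paths):
    """Map changed paths to the sorted list of Hudson jobs that should be rebuilt."""
    return sorted({job for path in paths
                   for prefix, job in JOBS_BY_PREFIX.items()
                   if path.startswith(prefix)})


def build_url(job):
    """URL whose retrieval asks Hudson to schedule a build of `job`."""
    return f"{HUDSON_URL}/job/{urllib.parse.quote(job)}/build"


if __name__ == "__main__":
    repos, rev = sys.argv[1], sys.argv[2]  # Subversion passes these to the hook
    for job in jobs_for(changed_paths(repos, rev)):
        urllib.request.urlopen(build_url(job), timeout=10).close()
```

Mapping paths to jobs inside the hook means that a commit touching only the client does not rebuild the service, which keeps the feedback loop short.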

Consider the build phase:

At the moment, the package build consists of fetching the code from the repository and running a Phing target (the package build). In principle, running the unit tests and deploying the project to the staging server can be added here. However, a few points deserve attention.

First, the application config for the staging server may differ from the production config. The obvious solution is to keep a staging config in the project and substitute it for the original one during the build (a separate build target in the case of Phing, or a profile for Maven).

Second, installing the package on the staging server with the SCP and SSH plugins will not work (incidentally, for these plugins to work at all, the PasswordAuthentication parameter in the sshd config must be set to yes and the staging server host must be added to the known hosts), because the SSH plugin belongs to the build stage, while the SCP plugin belongs to the post-build operations. The package would therefore be copied only after the install commands had already run, so deployment to the staging server has to be handled by Phing or by Maven + AntRun.
For our build script to perform actions on remote servers, we need to generate SSH keys: keep the private key on the CI server and distribute the public key to all the servers it will interact with (staging, repo, svn), adding those servers to the list of known hosts (known_hosts). In addition, for Hudson to be able to install the package on a remote server, a corresponding user (hudson) has to be created on that server and given sudo rights.
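Once key-based access is in place, the copy-then-install sequence can be driven by a small script. The sketch below is in Python rather than Phing or AntRun and only illustrates the order of operations; the host name and the upload directory are assumptions for the example.

```python
import shlex
import subprocess

# Assumptions: remote user/host and upload directory are placeholders.
STAGING = "hudson@staging.example.com"
REMOTE_DIR = "/tmp"


def deploy_commands(package_path):
    """Build the scp and ssh command lines that copy a .deb and then install it."""
    remote_path = f"{REMOTE_DIR}/{package_path.rsplit('/', 1)[-1]}"
    return [
        # BatchMode makes scp/ssh fail instead of prompting for a password,
        # so a missing key breaks the build rather than hanging it.
        ["scp", "-o", "BatchMode=yes", package_path, f"{STAGING}:{remote_path}"],
        ["ssh", "-o", "BatchMode=yes", STAGING,
         f"sudo dpkg -i {shlex.quote(remote_path)}"],
    ]


def deploy(package_path):
    """Run the commands in order; a non-zero exit raises and fails the build."""
    for cmd in deploy_commands(package_path):
        subprocess.run(cmd, check=True)
```

Keeping the command construction separate from execution makes the order (copy first, install second) easy to check without touching a real server.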

Third, to build Java applications with Maven successfully, you will need to provide the Maven settings for the hudson user on the CI server (that is, the ~/.m2 directory).

The next step after installing the package on the staging server is to run the integration tests. They can be run on the CI server itself; however, it is preferable to run them on the staging server. In the first case everything is quite simple: call the appropriate Phing / Maven target or configure the SeleniumHQ plugin.
The question remains, however: what if you want to start the tests on an external server, for example a Selenium RC server? The answer is simple: Selenium RC has an HTTP API, so the most trivial solution is to write a small script, in any language you like, that starts the test run and periodically polls the remote server until the run completes. This script is then hooked into the build via "Execute Shell". I will add that Hudson decides whether the step succeeded or failed based on the return code of your script.
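The shape of such a script might look like the following Python sketch. It deliberately leaves out the actual HTTP calls: start and poll are hypothetical callables that you would implement against your own test server, and the status strings are assumptions for the example. What the sketch shows is the polling loop and the mapping of the result to a return code.

```python
import time


def wait_for_tests(start, poll, interval=5.0, timeout=1800.0, sleep=time.sleep):
    """Start a remote test run and poll until it finishes.

    `start()` returns a run id; `poll(run_id)` returns "running", "passed"
    or "failed". Both are hypothetical hooks supplied by the caller.
    Returns 0 on success and 1 on failure or timeout.
    """
    run_id = start()
    waited = 0.0
    while waited <= timeout:
        status = poll(run_id)
        if status == "passed":
            return 0
        if status == "failed":
            return 1
        sleep(interval)
        waited += interval
    return 1  # the run never finished within the timeout


# In a real script the result would become the process exit code, e.g.:
#   sys.exit(wait_for_tests(start_remote_run, remote_status))
# where start_remote_run and remote_status are your own functions.
```

The timeout matters: without it, a hung test server would block the build executor indefinitely instead of failing the build.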

Having set up the build process, let's not forget the most important part: notifying the developer of the build results. Hudson lets you configure email notifications both for specific recipients and for the authors of the commits that broke the build.

In addition, I recommend that anyone who is going to use Hudson for PHP projects read the relevant articles in the Hudson Wiki.

Staging server

Package installation

So, as mentioned above, our infrastructure should include a server whose configuration is as close as possible to production. On this server, Hudson will install the latest project packages built from trunk. This will let us:

conduct integration testing in conditions as close as possible to production,
have a platform for demonstrating the latest functionality.

One of the main problems to solve when setting up this server is the "silent" installation of packages. For our packages to install without unnecessary dialogs (so that they can be installed by scripts or Hudson plugins), debconf has to be reconfigured (dpkg-reconfigure debconf) with a question priority threshold higher than the priority of the questions asked by the installation scripts of your deb package.
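A per-command alternative to reconfiguring debconf globally is to pass debconf's DEBIAN_FRONTEND and DEBIAN_PRIORITY environment variables with the install command. This is a different technique from the one described above, shown only as a sketch; the function name is made up for the example.

```python
def silent_install_command(package_path):
    """Build a dpkg command line that runs with debconf dialogs suppressed.

    The variables are set through `env` after `sudo`, because sudo usually
    resets the caller's environment.
    """
    return [
        "sudo", "env",
        "DEBIAN_FRONTEND=noninteractive",  # do not open any dialog
        "DEBIAN_PRIORITY=critical",        # only show questions of critical priority
        "dpkg", "-i", package_path,
    ]
```

This keeps the staging server's debconf settings untouched, at the cost of having to remember the variables in every install script.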

In addition, packages installed on the staging server may depend on one another. For example, the "client" project depends on the "server" project. In this case we must make sure that the required server package gets installed on the staging server when the client package is installed.

At first glance, the obvious solution, given that the projects are distributed as deb packages, is to let the packaging system manage the dependencies by adding the server package to the dependencies in the control file of the "client" package.
In this case we also have to set up a separate Debian repository that receives all the packages from the CI server, and extend the build script with a command that copies the package to the repository. In addition, we will need a mechanism for refreshing the repository index when a new package is added, access to the repository pool (for example, by setting up a web server), and an entry for this repository in the sources.list on the staging server. If the index is refreshed on demand (by running the package scanner right after the package is uploaded), the new package can be installed through apt; if the repository is refreshed on a schedule, you have to resort to tricks like dpkg -i package; apt-get -f install. Learn more about setting up your own Debian repository here.

However, this approach has several disadvantages. First, dependencies can only be installed on the same server. Second, it complicates the whole system quite significantly, which contradicts the KISS principle.

In my opinion, the best solution is to use the repository only for delivering packages to the production server. Packages should then be published to the repository not automatically but by decision of the developer. As for the staging server, the build from trunk of the main package and of all its dependencies will be installed on it, which significantly reduces the complexity of the CI system while still giving us the latest versions of the packages on the staging server.
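Without a repository, the build script itself has to install dependencies before the packages that need them. One way to compute that order is a topological sort, sketched here in Python with the standard graphlib module (Python 3.9 or later). The dependency map is a hypothetical stand-in for whatever your projects declare.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: package -> packages it depends on.
DEPENDS = {"client": {"server"}, "server": set()}


def install_order(target, depends=DEPENDS):
    """Return `target` and its transitive dependencies, dependencies first."""
    needed, stack = {}, [target]
    while stack:
        pkg = stack.pop()
        if pkg not in needed:
            needed[pkg] = set(depends.get(pkg, ()))
            stack.extend(needed[pkg])
    # static_order() yields every node after all of its predecessors
    # and raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(needed).static_order())
```

For the two projects in this article, install_order("client") puts the server package first, so the client is never installed against a stale service.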

Work with DB

Our packages can use a database. In this case the database is also installed on the staging server, and its structure / data is updated with the dbdeploy utility.
There are two ways to integrate dbdeploy into a project:

each database gets a separate project in SVN and, consequently, in Hudson, with its own build script triggered by the SVN hook (this option makes sense when the database is used by several projects),
the dbdeploy file structure becomes part of the main project, and the dbdeploy script that updates the database version is called from the postinst script of the package.
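For readers unfamiliar with the approach, the idea behind this kind of database versioning can be sketched in a few lines of Python: numbered change scripts are applied in order, and a changelog table records which ones have already run. This is an illustration of the concept on SQLite, not dbdeploy's own code; the table and column names are chosen for the example.

```python
import sqlite3


def apply_deltas(conn, deltas):
    """Apply numbered SQL deltas that are not yet recorded in the changelog.

    `deltas` maps a change number to an SQL script. Returns the list of
    change numbers applied by this call, in order.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (change_number INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_number FROM changelog")}
    newly_applied = []
    for number in sorted(deltas):
        if number in applied:
            continue  # this change already ran on an earlier deployment
        conn.executescript(deltas[number])
        conn.execute("INSERT INTO changelog (change_number) VALUES (?)", (number,))
        conn.commit()
        newly_applied.append(number)
    return newly_applied
```

Because applied changes are skipped, the same call is safe to run on every package installation, which is what makes it suitable for a postinst script.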

A separate question is what to do about changes to the database data during testing. Clearly, in unit tests we do not work with the database but use mock objects (I like Mockito, for example).
But what about integration tests, which have to run in "real" conditions? With xUnit tests, we can run each test inside a database transaction that is rolled back at the end. In my opinion this approach is preferable: since the database is versioned through dbdeploy, we always know what data it holds at any given moment and can safely rely on that data in our tests. However, when testing a web interface (for example, with Selenium), we cannot run each test inside a transaction.
Therefore, in my opinion, there are two options: either fully reinitialize the database from the available patches before the web interface tests start, or write the tests so that they do not depend on any specific data in the database (for example, they create the data they need through the web interface themselves) and, where possible, leave no "garbage" behind.
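The transaction-per-test approach for xUnit tests can be illustrated with Python's unittest and an in-memory SQLite database. This is a sketch of the pattern only; the table and the baseline row are assumptions standing in for data that would normally come from the versioned patches.

```python
import sqlite3
import unittest


def make_db():
    """Create an in-memory database holding known baseline data."""
    # isolation_level=None leaves transaction control to explicit BEGIN/ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('baseline')")
    return conn


class TransactionalTest(unittest.TestCase):
    conn = make_db()  # one connection shared by all tests in the class

    def setUp(self):
        self.conn.execute("BEGIN")     # every test starts its own transaction

    def tearDown(self):
        self.conn.execute("ROLLBACK")  # and discards whatever it changed

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_1_insert_is_visible_inside_the_test(self):
        self.conn.execute("INSERT INTO users VALUES ('temp')")
        self.assertEqual(self.count(), 2)

    def test_2_baseline_is_untouched_afterwards(self):
        self.assertEqual(self.count(), 1)
```

The second test passes only because the first one was rolled back, which is exactly the guarantee that lets tests rely on the known baseline data.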

Selenium server

When the application has no web interface, integration testing on the staging server, as I wrote above, may well consist of running xUnit tests. If there is a user interface, however, it is extremely convenient to test the whole chain from HTML to the database with Selenium.

Selenium is a powerful system for testing web applications. It can be divided into two parts:

Selenium IDE - a tool for developing and running test scripts in the browser (available as a Firefox plugin),
Selenium RC - a distributed system consisting of a Selenium server and its subordinate clients, on which tests are run in different browsers.

For obvious reasons, we are interested in the second option. Since installing and configuring Selenium is a big topic, I see no reason to cover it in this article: all the information is in the documentation.

Remarks

It should be noted that CI can also be done by hand, compiling and testing the code before every commit. However, automating the process with a CI server makes much more sense. It is also important to understand that CI and nightly builds are not the same thing. Nightly builds do detect bugs, but with a long delay, whereas the goal of CI is to detect errors as early as possible. In my opinion, nightly builds can serve as a partial replacement for CI only when building and testing the project takes a fairly long time. In addition, if the project has both unit and integration tests, the build can be split into two parts: the first (with unit tests) runs on every commit, the second (with integration tests) runs once an hour or once a day.

Conclusion

The solution described above works and pays off. However, as we all know, theory unfortunately does not always match practice. Implementing the CI system required solving a number of problems, not all of which were solved perfectly.

The probability that someone will give you a staging server with resources comparable to production is extremely small. Most likely it will be a mid-range virtual machine on a half-abandoned host, which fundamentally undermines one of the principles of CI: testing in a similar environment. This in turn means that integration tests may start taking much longer than originally planned. In my case, "continuity" had to be compromised: I began running the tests on a schedule rather than from SVN hooks.

In general, if your team has a certain development culture (I mean the understanding that CI is not a panacea but only a tool that, handled with skill, can improve the quality of the team's work), introducing CI is fully justified.

And probably the most important thing: as practice has shown, integrating a CI system is a team task. Solving it requires the work of developers, testers and administrators.

Source: https://habr.com/ru/post/108928/
