
Building a development process and a CI pipeline, or How a developer can become DevOps for QA

Given:

  1. a large Java project with an Angular front end,
  2. developed by a small team (~15 people),
  3. with a heap of feature branches (about 40 in parallel),
  4. in Git repositories;
  5. several virtual servers in a private Amazon cloud that can be used for development tasks;
  6. a developer who is a bit tired of Java and wants to do something genuinely useful for the processes.

Required:

  1. give the QA engineering team the ability to test each feature branch, both manually and automatically, on a dedicated stand that does not interfere with the others.


The stand as QA's spacecraft control console
So you come to work at a small startup with American roots ...

It is still a small startup, but with a promising product and big plans to conquer the market.

At first, while the development team is still tiny (up to 10 people), the code base lives in a common repository on GitHub Enterprise, with small features quickly branched from master and quick release cycles merging those features straight back into the same master. The team lead can still keep track of who commits what, and every commit is not only read but actually understood: correct or not. So pull requests are opened and quickly merged by the developer himself, with the lead's verbal approval (or rejected by him).

To protect the integrity of the code base, the team relies on unit and integration tests, of which a couple of thousand have already been written, covering about 60% (at least on the backend). The development lead also runs the full test cycle on master before each release.

The process looks something like this:

COMMITS » MANUAL TESTS » RELEASE

A couple of months pass. The startup proves viable, and investment allows the development team to grow to 15 people. Mostly front-end developers arrive, and they rapidly expand the facade that end users see and touch. The front-enders test that facade right on their work Macs; they write some Selenium cases, but the development lead has no time to run them before releases, because Selenium is famously slow.

And then two disasters happen, literally one after the other.

First, one of the backenders accidentally force-pushes to master (the poor guy had a cold and wasn't thinking straight), after which two weeks of the whole team's work had to be reconstructed, piece by piece, from the local copies that miraculously survived — everyone was used to just pulling.

Then one of the big features, developed by the front-enders for about two months in a separate branch and green on all the UI tests, turns master bright red right after the merge and brings down the whole product. They had slipped breaking changes into their own API, and the tests failed to catch it. It happens. But what a mess.

So the startup faces, at full height, the question of establishing a QA team — and, indeed, of rules for working with feature branches and a general development methodology, discipline included. It also becomes obvious that code in a pull request should be reviewed not only by the development lead (he has enough on his plate already) but by the other colleagues too. A normal growing pain, all in all.

And here we arrive at the "Given:" item.

No, I never planned to become a build engineer. But after I successfully demonstrated to the development lead a project build with the unit tests running on TeamCity, installed on a spare developer server in the corner, somebody had to set it all up for production use. And I happened to have free time between features.

Well, let's get started.

First, we set up the TeamCity head node in the Amazon cloud (plus two free agents) and point it at the GitHub repository. For every PR, GitHub creates a virtual HEAD ref, so listening for changes is very simple, and automatic builds with the unit-test run practically take care of themselves: someone commits, and five minutes later a build is in the queue. Convenient.
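The virtual refs GitHub publishes for pull requests can be seen with plain git. A minimal sketch, simulated in a throwaway local repository so the mechanism is visible without network access (against a real remote you would run `git ls-remote origin 'refs/pull/*/head'`):

```shell
# GitHub exposes every pull request as a virtual ref, refs/pull/<N>/head,
# which a CI server like TeamCity can poll just like an ordinary branch.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m init
git update-ref refs/pull/42/head HEAD       # what GitHub does for PR #42
refs=$(git ls-remote . 'refs/pull/*/head')  # what the CI server polls
echo "$refs"
```

Each output line is a commit SHA followed by the PR ref, so a watcher only needs to notice when a SHA changes.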

COMMITS » PULL REQUEST » BUILD + TESTS » RELEASE

But not enough.

At that time GitHub still had a rather unpleasant interface for viewing pull requests, and leaving comments there was no joy either: you had to scroll through painfully long screenfuls. In other words, we could take away the team members' right to merge, but providing a proper code review without third-party services — no way. On top of that, I wanted sane integration with Jira, so that features could be linked to tasks and tasks to features.

Fortunately, Atlassian has a product for exactly this. It is called Bitbucket Server, and at the time it was called Stash. It does all the integration magic with the other Atlassian products and is very convenient for code review. We decided to migrate.

But this wonderful product, unlike GitHub, does not create virtual HEADs for each PR, so after the migration TeamCity had nothing to listen to. Post-commit hooks did not work out either, for lack of time to deal with them properly.

So at first the Stash–TeamCity integration was done through a crooked crutch. An overseas colleague, worn out by the hooks, instead of using the built-in REST API for viewing pull requests, desperately scribbled a bash script that runs tail -f on the Stash log, greps it for changes of the right shape, and then calls the TeamCity REST API. Not the most logical approach — some builds started running twice — but what can you do, there was no time.

Looking ahead: when the time came, I managed to rewrite stash-watcher.sh humanely — fetching changes through the regular REST API and parsing the JSON responses with the great and mighty jq (a mega-tool for any devops doing tool integration!) — poking TeamCity only when really necessary. I also registered the script as a system daemon, so it starts by itself on reboot. Not that Amazon instances need restarting often.
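The core of the rewritten watcher is one REST call plus one jq expression. A sketch under assumptions: the URL in the comment follows the standard Bitbucket Server REST shape with placeholder host, project, and repo, and the HTTP call is replaced here by a canned response so the parsing step stands on its own:

```shell
# Real call would look roughly like:
# curl -s -u "$USER:$PASS" \
#   "https://stash.example.com/rest/api/1.0/projects/PROJ/repos/app/pull-requests?state=OPEN"
response='{"values":[{"id":7,"fromRef":{"displayId":"feature-123-login"}},{"id":9,"fromRef":{"displayId":"bugfix-456-crash"}}]}'
# jq pulls the source branch of every open PR out of the paged response:
branches=$(echo "$response" | jq -r '.values[].fromRef.displayId')
echo "$branches"
```

Comparing this list (and the head commits alongside it) against the previous poll tells the script exactly which TeamCity builds to trigger.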

With that, two pieces of the puzzle are in place.

COMMITS » PULL REQUEST » CODE REVIEW || BUILD + TESTS » RELEASE

By this time, QA engineers had joined the team.

Poor things! Switching locally between five feature branches a day, building and running each of them by hand!? You wouldn't wish that on an enemy!

I'll admit it honestly: I sincerely love QA engineers (especially the girls). And I'm not alone in this — even the colleagues from NY, who originally put their sacred faith in unit tests, turned out to love them too. They just had no idea of it yet when they formulated the vague task: "we need to somehow look into automatically launching an application instance somewhere in the cloud for every branch, so the business can see with its own eyes what exactly is going on with the feature in development. Would you?"

“Okay,” I said (well, who else? Whoever once dips into DevOps is in it for good), and so the “Required:” item arrived.

An interesting task. After all, if we manage to set up automatic deployment of build results, we can satisfy both the business and our poor QA in one stroke: instead of suffering through local builds, they will go to the cloud for a ready-made copy.

Here I should mention that the application consists of several WARs that run under Apache Tomcat. A WAR, as you know, is an ordinary ZIP archive with a special directory structure and a manifest inside. When the application is built, its configuration (the database path, the REST endpoints of the other WARs, and so on) gets baked somewhere into the resources. And to feed a WAR to Tomcat, you have to specify in the configs where to take it from, under which URL, and on which port to deploy it.

And what if we want to run many instances of the same WAR at once? Reconfigure Tomcat on the fly to scatter them across different ports and URLs? And what do we do with the configs baked into the WAR resources?

An unpleasant set of questions.

So we go another way. IDEA, for example, when launching a WAR in the debugger, points Tomcat via the -Dcatalina.base command-line key at a copy of the $TOMCAT_PATH/conf directory, and deploys the WAR not in one piece but in exploded form — that is, unarchived — so that files with fresh bytecode can be substituted.
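The same catalina.base trick can be sketched in shell. A minimal sketch, assuming a stub server.xml in place of the real one; the paths and the port are illustrative:

```shell
skel=$(mktemp -d)                 # stands in for $TOMCAT_PATH/conf
echo '<Connector port="8080" protocol="HTTP/1.1"/>' > "$skel/server.xml"
base=$(mktemp -d)/d123            # the instance's own catalina.base
mkdir -p "$base/conf" "$base/logs" "$base/temp" "$base/work"
cp "$skel"/* "$base/conf/"
# give this copy its own HTTP port:
sed -i 's/port="8080"/port="8123"/' "$base/conf/server.xml"
patched=$(cat "$base/conf/server.xml")
echo "$patched"
# Tomcat would then be started with -Dcatalina.base="$base"
```

Since only conf (and the work directories) are duplicated, the Tomcat binaries stay shared between all instances.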

Having watched what IDEA does, we try to repeat and improve the algorithm. To begin with, we spin up a hefty virtual instance in the Amazon cloud with hundreds of gigabytes of disk space (in exploded form our application is rather fat) and plenty of RAM.

We raise nginx there, because in nginx it is quite simple to write a rule that redirects HTTP requests from #####.dev..///REST/endpoint to localhost:#####///REST/endpoint and back, where ##### is a specific port number configured in the Tomcat configuration. And there is no point even trying to run all the feature branches under a single Tomcat; instead, for each of them we create a separate $TOMCAT_PATH/conf directory and run its own Tomcat. That is many times simpler and more reliable, and there are no problems with parallelism.
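Such a redirect rule can be sketched in nginx configuration. A sketch under assumptions — the internal domain, the three-digit scheme, and the port prefix are placeholders, not the startup's actual values:

```nginx
# The instance number captured from the hostname picks the local Tomcat
# port, so one server block covers every deployed branch:
server {
    listen 80;
    server_name ~^d(?<num>\d{3})\.dev\.;
    location / {
        proxy_pass http://127.0.0.1:8$num;
        proxy_set_header Host $host;
    }
}
```

The named regex capture in server_name becomes an ordinary nginx variable, so adding or removing instances needs no nginx changes at all.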

Now, where do we get digits that won't collide between instances? The build number? No — QA would lose track of which instance belongs to which feature. The Git revision number falls away for the same reason. Nothing for it: we oblige all developers to name branches so that they include the Jira task number, following the pattern feature-#####-- or bugfix-#####--. The last three digits of that number go into the port number. And it's beautiful.
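Deriving the port from such a branch name takes one line of sed and a little arithmetic. A sketch: the branch name follows the convention from the text, while the 8000 base offset is an assumption for illustration:

```shell
branch="feature-12345-new-login"
# extract the Jira task number from the branch name:
num=$(echo "$branch" | sed -E 's/^(feature|bugfix)-([0-9]+)-.*/\2/')
last3=${num: -3}                 # last three digits of the task number
port=$((8000 + 10#$last3))       # 10# guards against octal (e.g. "045")
echo "$port"
```

The 10# prefix matters: without it a suffix like "045" would be read as octal by bash arithmetic.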

In the TeamCity builds that assemble the WARs, we add an extra build step which copies them over SSH to that fat Amazon instance and then, also over SSH, invokes a bash script that does the following:

  1. unpacks the WARs into the /deployments/d### directory,
  2. makes a copy of the /deployments/skel conf directory for Tomcat,
  3. rolls out a separate database instance from the dump (the dump lives in the source tree, so it is always at hand),
  4. with awk, sed, grep, find, and a certain amount of swearing, fixes up the Tomcat configs in the conf copy, as well as the configs in the unpacked WAR resources, so that they get the right ports, database paths, REST endpoints, and everything else.

After that, all that remains is to start Tomcat with the -Dcatalina.base=/deployments/d### key, and that's it.

COMMITS » PULL REQUEST » CODE REVIEW || BUILD + TESTS » DEPLOY QA STAND » QA ENGINEERS TESTING » RELEASE

So, wait a minute — will our beloved QA engineers now manually SSH into the cloud and launch instances from the command line? Not great. We could start everything automatically, but that is wasteful: there are already close to 60 feature branches, and even the fat instance's memory is not made of rubber. It would crawl.

Think, head, and I'll buy you a hat. BUT! Since everything lives under /deployments/d###, we can write a console to manage the instances: walk the subdirectories and print, say, a start/stop link for each one.

nginx is already up; we just need to configure classic CGI in it — a piece of cake. What is classic CGI? The web server feeds the HTTP request, headers and all, to a binary's standard input, sets a few environment variables, and takes the HTTP response, also with all its headers, from the binary's standard output. Simpler than a steamed turnip; you can literally do all of it by hand.
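The contract really is small enough to demonstrate in a few lines of bash. A minimal sketch: the server would export variables like REQUEST_URI, and a temp directory stands in for /deployments so the sketch is self-contained:

```shell
DEPLOY=$(mktemp -d)               # stands in for /deployments
mkdir -p "$DEPLOY/d123" "$DEPLOY/d456"
out=$(
  # headers, blank line, body — that is the whole CGI response:
  printf 'Content-Type: text/plain\r\n\r\n'
  echo "request: ${REQUEST_URI:-/}"
  for d in "$DEPLOY"/d*/; do
      basename "$d"               # one line per deployed instance
  done
)
echo "$out"
```

Wired up behind nginx (e.g. via fcgiwrap or a similar CGI gateway), each request would run the script and stream exactly this output back to the browser.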

By hand? So why don't I write a handler for the /deployments directory in bash? Because I certainly can. I'll hang it at list.dev.. (available only from the startup's internal network, like all the instances)... Sometimes you want something not just useful but slightly abnormal. Such as a minimal HTTP request handler in bash.

So I wrote it. A bash script that, with awk, sed, grep, find, and a certain amount of swearing, walks the /deployments subdirectories and works out what lies where. The build number, the Git revision, the feature branch name — all of that had, just in case, already been passed along from TC together with the WARs.

It worked almost on the first kick. One drawback: parsing input commands of the list.dev../refresh?start=d### kind with plain bash and *nix utilities is not very convenient. But that is my own fault — I invented global slash-commands plus a question-mark action syntax for instances. Also, external utilities were being called hundreds of times across the 60 subdirectories, which made the console anything but fast.

On the other hand, whether a particular instance is running can be determined from standard ps output (the same grep to the rescue), and you can also call, say, netstat or mysql -e "SHOW DATABASES" without leaving the till, and dump the result to standard output, lightly edited with sed or awk for readability. Very convenient for diagnostics.
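The "is this instance running?" check can be sketched like this. The per-instance -Dcatalina.base path is a convenient marker to grep for in ps output; the path is illustrative, and nothing is actually running here:

```shell
marker='catalina.base=/deployments/d123'
# grep -v grep drops the pipeline's own processes from the listing:
if ps -ef 2>/dev/null | grep -F "$marker" | grep -qv grep; then
    status="RUNNING"
else
    status="stopped"
fi
echo "d123: $status"
```

A pgrep -f would do the same job in one call, which matters once the check runs for 60 subdirectories in a row.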

The appetite comes with eating, and soon the console grows commands for killall -9 java (sometimes you want to start the week with a clean slate), for uptime, and a few other useful things. The most important one is the ability to delete an application instance together with its database. A cron job cleans the /deployments directory every two weeks anyway (that was provided from the start), but sometimes you want to remove the build of a PR the lead has rejected right away, just to get it out of sight.

A little more time passes, and the set of test cases grows to the point where QA engineers have to build up quite a lot of entities in an instance's database to complete a full regression cycle for a big feature. That takes more than a day. And if during that time the developer commits something to the branch after a code review, the instance's database gets re-deployed after the build, and all those entities are lost. Oops.

We add the ability to take a snapshot of a deployed instance. It is keyed to the Git revision number (those digits proved, by experiment, unique enough) and lands in /deployments/s### (a different prefix letter, so that copies and snapshots live in separate namespaces). The deployment uses roughly the same script as TeamCity does, except the database is copied from the live instance rather than rolled out from the dump.

So QA engineers get the chance to test a specific revision until they are blue in the face, while the developer commits as many new revisions to the branch as he likes. Then, before release, only those incremental changes are checked on the main instance.

COMMITS » PULL REQUEST » CODE REVIEW || BUILD + TESTS » DEPLOY QA COPY || SNAPSHOT QA COPY » QA ENGINEERS TESTING » RELEASE

Wow! In just six months we went from a chaotic process, where developers committed features every which way, to a logical, coherent continuous-integration pipeline in which every step is regulated and every tool is automated as far as possible.

As soon as a developer creates a PR, the test-instance deployment process is already under way, and within an hour (if you're lucky — with the team's growth the number of parallel feature branches soon approached a hundred, and we had to spin up seven agents under TC) the feature is ready for QA to test. Whether driven manually or with scripts through the REST API — and, when needed, diagnosed and debugged through the test-instance management console.

Well, after that, the epilogue. After a while everyone got tired of the console's sluggishness, and I had to set my bash nostalgia aside and rewrite it (alas, the whole abnormality of this little project evaporated at once) in plain boring PHP (still better than Java for tasks like this). One of the front-enders was honored to remake the UI from old-school plain HTML into a thoroughly modern Angular application. I insisted, however, on keeping the nineties look and feel, just for fun. We added the ability to view Tomcat's stdout and stderr, made a small CLI for calling the REST API right on the spot, and a handful of other little useful things.

A terribly handy thing it turned out to be. *


Just look at the happy faces of the QA engineering team!

* Want one like this?
Write to me. I'm happy to consider job offers from places that need experienced (10+ years) specialists with Primary Skill == Java, plus the occasional chance to do this kind of abnormal programming. Or to steer processes. Or both at once.

Only I can't relocate to Moscow. But I'd be happy to work remotely.

Source: https://habr.com/ru/post/330366/

