Why we are confident in what we deployed

It often happens that something does not work. And nobody wants it to be their fault. In large infrastructures and distributed applications, a configuration error can be fatal.

In this article I will show how to properly test an application's environment and which tools to use, and I will give examples of successful, worthwhile testing.

The article will be of interest to teams that practice DevOps or SRE, to responsible developers, and to other good people.

It should be said that Infrastructure as Code can be implemented without testing, and it will even work. But is there a limit to perfection?
Testing came to us from the world of software development. It is a huge topic in its own right and includes the following levels:

Unit tests (test code) - there is a function that takes one parameter as input, performs a calculation and returns the result. Unit tests check that the result is as expected, roughly speaking, 1 == 1.
Integration tests (test the interconnection of components) - there is an application that needs a database to work correctly, and the database has a `users` table. This test verifies that the database exists, that the table exists, and that everything is ready for work.
System tests (test the application as a whole) - there is an application that should convert a file. The test checks that the application converts it and that the result suits us.
Smoke tests (imitate user behavior) - test that a user can open the website in a browser, log in, upload an image, and convert it.
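
To make the first two levels concrete, here is a minimal Python sketch. The function `add_one`, the helper `users_table_exists`, and the in-memory SQLite database are hypothetical stand-ins chosen for illustration; the real application and its database are not shown in this article.

```python
import sqlite3


def add_one(value: int) -> int:
    """Hypothetical function under test: one parameter in, one result out."""
    return value + 1


def users_table_exists(conn: sqlite3.Connection) -> bool:
    """Return True if the database behind `conn` has a table named `users`."""
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'users'"
    ).fetchone()
    return row is not None


def test_unit() -> None:
    # Unit level: check only the function's result.
    assert add_one(1) == 2


def test_integration() -> None:
    # Integration level: check that the database and the `users` table exist.
    # An in-memory SQLite database stands in for the real one (assumption).
    conn = sqlite3.connect(":memory:")
    try:
        assert not users_table_exists(conn)
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        assert users_table_exists(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    test_unit()
    test_integration()
    print("ok")
```

The unit test knows nothing about the outside world, while the integration test only makes sense when the component it depends on is present.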

At this point a false impression may arise that in our world of Ansible / Chef / Puppet / Salt everything is very primitive and simple, and that nothing similar exists in the ecosystem of our tools.

But that is not so.

All of this exists, and it has been used successfully for a long time.

What to test with

I will clarify right away that we will not consider high-level programming for configuration management. For example, if you have written your own Ansible module or Chef LWRP, you already know enough to understand how to properly test your creation.

In fact, all the testing frameworks are very similar; they differ in their set of ready-made test cases and in syntax.

Throughout, we will solve one problem as an example: we must make sure that the package 'apache2' is installed and that the service 'apache2' is enabled and running. Finally, we check that something is listening on port 80.
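
Stripped of any framework, the last of these checks can be approximated by trying to open a TCP connection. The Python sketch below does only that; the function name `is_port_listening` is hypothetical, and the frameworks discussed in this article may determine the same thing differently, for example by inspecting the host's socket table instead of connecting.

```python
import socket


def is_port_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 80 on localhost matches the Apache example in this article.
    print(is_port_listening("127.0.0.1", 80))
```

A framework adds value on top of this by bundling many such checks behind one consistent syntax and by running them on remote hosts.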

Serverspec

The first testing framework we’ll look at is Serverspec.

Usage example:

    require 'spec_helper'

    describe package('apache2') do
      it { should be_installed }
    end

    describe service('apache2') do
      it { should be_enabled }
      it { should be_running }
    end

    describe port(80) do
      it { should be_listening }
    end

InSpec

The second framework is InSpec.
For this example, the syntax is the same as in Serverspec, except that the require 'spec_helper' line is not needed.

Testinfra

The third framework is Testinfra.

    # Recent Testinfra releases expose modules through the `host` fixture;
    # older releases injected Package, Service and Socket as separate fixtures.
    def test_apache_is_installed(host):
        apache = host.package("apache2")
        assert apache.is_installed


    def test_apache_running_and_enabled(host):
        apache = host.service("apache2")
        assert apache.is_running
        assert apache.is_enabled


    def test_apache_port(host):
        sock = host.socket("tcp://80")
        assert sock.is_listening

Goss

The fourth framework is Goss.

    package:
      apache2:
        installed: true
    service:
      apache2:
        enabled: true
        running: true
    port:
      tcp:80:
        listening: true

Serverspec / InSpec / Testinfra / Goss?

As the syntax shows, there is plenty to choose from. Use whichever you like best. For example, if you use Chef, InSpec or Serverspec will be a great fit. If you use Ansible or Salt, Testinfra or Goss works well.

The only real difference is the presence or absence of ready-made modules. So the first step is to write down your requirements for the framework and look for the one that has everything you need (or almost everything).

Okay, we have chosen. What exactly should we test?

What to test

A few weeks ago, we discussed in our DevOps community what needs to be tested, whether anything needs to be tested at all, how many years it takes to fly to Trappist-1, and so on.

For example, ctrlok thinks that testing declarative things is the wrong approach. And I agree with him.

To be precise, you do not need to test:

Various declarative things (package presence, package version, file presence) - your configuration management will take care of all this.

What you need to test:

Things that broke once or recently
Things that could potentially break
Things that someone (a colleague) can break out of ignorance
Things without which nothing will work at all

We have many applications within a single product, and they all require fine-tuning of the environment to work properly. One of the most troublesome bottlenecks was an application that converts between various document formats. As you might guess, it uses dozens of packages and hundreds of libraries, and all of this can suddenly stop working if any package it depends on is updated.

To properly evaluate the application and prepare it for testing, you need to think like a tester. Or, for example, consult the QA team.

During that conversation, think through the main test cases and try to cover all of them.

Practical example (in our application):

Convert from pdf to html
Convert from html to pdf
Convert from doc / docx to pdf

In our case, the individual items are separate modules of the application, and that is how we will test them.

Let's test!

I will show a small example of integration and system tests for the first module of the application. To clarify: we have a role that should correctly install the console utility pdf2htmlEX, with all dependencies and required pieces explicitly described. In other words, we assume that the environment was deployed successfully and that the infrastructure testing stage has arrived.

To start, I suggest making sure that the binary is available, that it can be run, and that it returns something:

    # Validate that the binary exists and works
    describe bash('pdf2htmlEX --version') do
      its('stderr') { should match /pdf2htmlEX version 0\.14\.6/ }
      its('exit_status') { should eq 0 }
    end

Now let's test that this binary can actually convert a test document and that the result suits us:

    # Try to convert a sample and validate the resulting HTML
    describe bash('pdf2htmlEX /opt/test/pdf-sample.pdf /tmp/pdf-sample.html') do
      its('stderr') { should match /Working/ }
      its('exit_status') { should eq 0 }
    end

    describe file('/tmp/pdf-sample.html') do
      it { should exist }
      its('content') { should match /<!DOCTYPE html>/ }
      its('size') { should eq 267189 }
    end

If the test passes, we can state with confidence that the application module responsible for converting PDF to HTML works. Moreover, it works correctly: we are satisfied with the result of its work.
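
The same "run the tool, then inspect what it produced" pattern can be sketched in plain Python for teams that prefer it. This is only an illustration under stated assumptions: the helper `run_and_check_output` is hypothetical, it checks for a non-empty file rather than an exact size, and the demo uses a stand-in command so that it runs without pdf2htmlEX installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def run_and_check_output(
    command: Sequence[str], output_file: Path, expected_marker: str
) -> bool:
    """Run `command`, then verify its exit status and the produced file.

    Returns True only if the command exits with status 0 and the output
    file exists, is non-empty, and contains `expected_marker`.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return False
    if not output_file.is_file() or output_file.stat().st_size == 0:
        return False
    return expected_marker in output_file.read_text(errors="replace")


if __name__ == "__main__":
    # Stand-in "converter" (assumption): a Python one-liner that writes an
    # HTML file, so the sketch runs on any machine.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "sample.html"
        fake_converter = [
            sys.executable,
            "-c",
            "import sys; open(sys.argv[1], 'w').write('<!DOCTYPE html>')",
            str(out),
        ]
        print(run_and_check_output(fake_converter, out, "<!DOCTYPE html>"))
```

On a real host, the command would be the pdf2htmlEX invocation from the test above, with /tmp/pdf-sample.html as the output file. Checking for a marker instead of an exact byte count is a looser choice that survives small changes in the converter's output.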

Is it difficult or time-consuming to write such a test? I do not think so.

How much money does it save the business? Everyone has their own cost of error.

Benefits

It should be said that we did not start testing the environment right away. Why waste time on such things? Isn't it obvious that we are all supermen and write configuration management code without errors on the first try?

And this approach does work when the DevOps team consists of 1-2 people. In that case, every engineer knows how the application should work, how to prepare the environment for it, and how to “poke” it to see that it will work on the host that has just been prepared.

It all ends when everything should work in production but does not. Debugging by hand on prod begins, or rollbacks, and every minute is very expensive for the business.

That is why infrastructure testing is an investment in the future.
Now, when something breaks, we are absolutely sure that our part of the work has been done in full and that all questions on our side are closed.
That is why we are confident in what we have just deployed.

Conclusions

We all work with applications that change very often in a very dynamic world, thanks to our properly configured CI and CD processes. We all take the applications themselves very seriously and test them from every side. At the same time, it is not very reasonable to treat the infrastructure that these applications run on carelessly. Moreover, doing so is fraught with serious consequences.

For ourselves, we decided that we would test in the following cases:

if the system module has complex logic (complexity)
if the module already breaks down frequently (availability)
if you need to transfer specific knowledge (communication)
if the error costs us more than X money (business value)
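
The four criteria above can be written down as a simple decision rule. The sketch below is hypothetical: the `ModuleProfile` fields and the `cost_threshold` parameter (the "X" from the list) are names invented for illustration, and the inputs would have to come from your own estimates.

```python
from dataclasses import dataclass


@dataclass
class ModuleProfile:
    """Hypothetical description of one infrastructure module."""

    has_complex_logic: bool         # complexity
    breaks_frequently: bool         # availability
    needs_knowledge_transfer: bool  # communication
    error_cost: float               # estimated cost of one failure (business value)


def should_test(module: ModuleProfile, cost_threshold: float) -> bool:
    """Return True if at least one of the four criteria applies."""
    return (
        module.has_complex_logic
        or module.breaks_frequently
        or module.needs_knowledge_transfer
        or module.error_cost > cost_threshold
    )


if __name__ == "__main__":
    converter = ModuleProfile(
        has_complex_logic=True,
        breaks_frequently=True,
        needs_knowledge_transfer=False,
        error_cost=0.0,
    )
    print(should_test(converter, cost_threshold=1000.0))
```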

The main goal was to convey the importance of infrastructure testing and to encourage the use of these tools, if not right now then in the near future.

And what percentage of your infrastructure is covered by tests?

P.S. If this information was useful to you and you want to develop in this direction, subscribe to my personal Telegram channel: https://goo.gl/1MnG9v
You can always unsubscribe. But what if you like it?

Source: https://habr.com/ru/post/323472/
