📜 ⬆️ ⬇️

Zabbix for DevOps: how we implemented the monitoring system in the development and testing processes



We continue the cycle of publications on the implementation of DevOps tools and practices in our company. Recently, we talked about how we analyze vulnerabilities using neural networks and fuzzy logic , and today we will talk about the implementation of the Zabbix monitoring system in the development and testing processes.

Typical testing problems


One of the serious problems that we wanted to solve with the help of the implementation of DevOps practices was the organization of monitoring of IT systems. Prior to that, it was not actively implemented. This led to a number of difficulties. For example, we could lately notice that the place on the disk of the necessary machine ran out, problems were signaled by interruptions in the work of services and applications. All these are acute problems for the development and testing process, since due to such errors the product may be unstable, and it becomes more difficult to catch bugs in it.
')
When testing a product, it is necessary to analyze a variety of indicators: logs, server load, availability of internal and external services, etc. In addition, from version to version, the list of observed parameters may vary.

To solve all these problems, we decided to integrate the Zabbix monitoring system into the development and testing processes. That's how we did it, and what happened in the end.

How we implemented Zabbix


In general, monitoring systems are divided by function and type. The division by function can be described as:


Monitoring by type is divided into:


The mechanism we use allows us to “close” all these functions and carry out system monitoring in full, and application monitoring in part.

To optimize the testing process, we decided to delimit the areas of responsibility between the DevOps team, developers and testers, wrote our own tools for interacting with Zabbix (zabbixtools) and switched to using the “monitoring as code” approach.

The division of responsibility looks like this - if we have a host that needs to be monitored, then the tester assigns him a certain role. Each host can have only one role in which there can be several profiles - they are provided by the DevOps team. Profiles can be different - for monitoring processes, services, API, etc.



The zabbixtools package includes Python and Powershell scripts ( project repository on GitHub ), which extend the functionality of the Zabbix system in the following parts:


The monitoring system should be flexible and easily configurable, which cannot be said about the Zabbix web interface. Therefore, to simplify the configuration, it was decided to implement a system in which Zabbix had to take as much data as possible about the observed indicators from the target server. To simplify setup, zabbixtools uses Low Level Discovery (LLD) functionality.

This allows you to make any monitoring settings on the monitored server itself, and not in the Zabbix admin panel.

The approximate scheme of the implementation of this function is presented in the diagram:



Here's how it works:


Thus, to set up a new host with certain rules, the teams do not need to learn the administrative interface; it is enough just to add the necessary set of profiles to the role that will be associated with this server. The web interface will only be used to view current status / historical information.

An example of embedding a monitoring system in the development / testing processes
The problem of monitoring during the development and testing of a product is a list of observed parameters that varies from version to version.

For example, it is possible that for two releases we need to monitor the following indicators:

1. Release N.0


2. Release N+1.0


And today, at the SERVER-1 stand, one version of the product will be deployed, and tomorrow - the other should be deployed. In monitoring systems, any changes are made manually in the web interface with static preset parameters.

zabbixtools with LLD allows you to solve this problem in two possible ways:

1. The monitoring configuration (yaml files) is deployed when the product is deployed (files are taken from the same repository as the source code of the product).


2. The monitoring configuration (yaml files) is deployed when testing the product (files are taken from the same repository as the code for the test).




This scheme allows you to customize monitoring depending on the product release and keep it under version control. We can also implement SMOKE testing using the monitoring system itself. In addition, this configuration allows for the division of responsibilities between the DevOps, development and testing teams - in particular, programmers and testers do not need to deal with the work of Zabbix and its API.

As a result, the participation of different teams is required only at certain intervals. For example, when deploying a product, developers are responsible for maintaining the relevance of the monitoring configuration, the monitoring configuration is different from branch to branch and from feature to feature, and SMOKE testing is performed during deployment even without running tests.

When testing a product, testers are responsible for maintaining the relevance of the monitoring configuration, SMOKE tests can be performed by the monitoring system.

Work examples


Below are sample configurations for monitoring sites, services, processes and folder sizes.

Consider a situation in which we need to monitor the availability of internal and external sites - from them we should receive an HTTP 200 OK response. To do this, simply add the lines to the websites.yaml file. If it is required to monitor the work of services (services should be included), they should be added to services.yaml .



In case you need to monitor the memory consumption of certain processes, you can do this with a few lines in process.yaml.

Another useful feature of this tool is the ability to set different threshold values ​​for metrics. If, for example, it is important for us that the size of the Cache folder is small, and the logs can grow in size, then you can simply add different CriticalSize folders in foldersize.yaml .



PS The story about our experience of integrating monitoring with the help of Zabbix into the development and testing processes was presented as part of the DevOps mitap, which took place in the fall of 2016 in Moscow.

Video:



Slides



The link presents presentations of 16 reports presented during the event. All presentations and video presentations are added to the table at the end of this topic .

Author : Alexey Burov

Source: https://habr.com/ru/post/325276/


All Articles