At Atlas we like to have everything under control. That includes our entire server infrastructure, which over the years has grown into a living organism of numerous virtual machines and services. We needed to monitor the vital signs of the IT side of our business: watch the production server, track system resource usage on the database virtual machines, follow the progress of business processes, and so on. The question was how to achieve this and, most importantly, with which tools. We started by looking for ready-made solutions and tried a bunch of paid and free services that were supposed to give us the "most valuable" information about the state of our system. In the end it all came down to incomprehensible charts, diagrams, and numbers that had no real value for us.
So we concluded that we had to assemble something ourselves. As a basis we chose Nagios, the most flexible and advanced system we found, which can be configured to monitor anything in whatever way you like. We installed it, configured it, and it works. Cool! It's a pity the interface of this miracle is stuck somewhere in the mid-90s, because we wanted the visual side to be up to the same standard.
A short search showed that Grafana is the leader among tools for building beautiful dashboards. So we decided to put all of our Nagios monitoring on screens as beautiful graphs in Grafana. Only one question remained: how do we get the two to work together?
The task: monitor the entire infrastructure through Nagios, set up alerts about problems in the system via Slack, and feed system performance data into Grafana's graphical shell for real-time monitoring.
Nagios collects statistics from the various virtual machines of the whole system. We need to save this data to a database in a certain format and at a certain interval so that Grafana can display it. Grafana works with several formats, but the most convenient for us is Graphite. Graphite is essentially a graphical shell too, but its interface was apparently made by the same people as the Nagios interface. Under the hood it has Whisper, a database that stores the statistics, and Carbon, a layer that processes that data. Nagios cannot talk to Graphite directly, so smart people wrote an add-on plugin, Graphios, that takes the current readings from Nagios and sends them to Carbon. So our task is to tie together six different technologies. Go!
Immediately a small disclaimer:
Install and configure Carbon:
apt-get install graphite-carbon
sudo nano /etc/default/graphite-carbon
Set the parameter value to true:
CARBON_CACHE_ENABLED=true
Save, exit.
sudo nano /etc/carbon/storage-schemas.conf
This file contains directives that define how the statistics are stored: how often a data point is saved and how long it is kept. We use roughly the following directive:
[atlas]
pattern = .*
retentions = 60s:1y
This means that one data point is stored per minute and kept for a year. Adjust the values to fit your needs.
It is also important that the storage interval is not shorter than the interval at which Nagios itself produces data, otherwise we will add duplicate values to the database. Out of the box, Nagios checks all services and hosts once every 10 minutes, so if you want results as close to real time as possible, you also need to change the check intervals on the Nagios side.
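To get a feel for what a retention rule costs on disk, here is a rough sketch that counts the data points in a rule such as 60s:1y and estimates the size of one Whisper file. The byte sizes (16-byte header, 12 bytes per archive entry, 12 bytes per point) and the 365-day year are my understanding of the Whisper format, and the helper names are made up for this illustration. Only the "step:duration" form with a unit suffix is handled.

```python
import re

# Seconds per unit; Whisper is assumed to treat a year as 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

# Assumed on-disk sizes of the Whisper file format, in bytes.
METADATA_SIZE = 16      # file header
ARCHIVE_INFO_SIZE = 12  # one entry per archive
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double


def to_seconds(token):
    """Convert a token such as '60s' or '1y' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", token)
    if match is None:
        raise ValueError("unsupported retention token: %r" % token)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def whisper_size(retentions):
    """Return (points per archive, estimated file size in bytes)."""
    points = []
    for archive in retentions.split(","):
        step, duration = archive.strip().split(":")
        points.append(to_seconds(duration) // to_seconds(step))
    size = METADATA_SIZE + len(points) * ARCHIVE_INFO_SIZE + sum(points) * POINT_SIZE
    return points, size


points, size = whisper_size("60s:1y")
print(points, size)
```

Under these assumptions every metric with the rule above takes about 6 MB, so the number of monitored services multiplied by the number of performance values per service gives a quick estimate of the disk space needed.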
Copy the last config into place and start Carbon:
sudo cp /usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example /etc/carbon/storage-aggregation.conf
sudo service carbon-cache start
Next, prepare a database for everything that follows. We prefer PostgreSQL, but Graphite supports other databases as well.
apt-get install postgresql libpq-dev python-psycopg2
sudo -u postgres psql
Create the new user and database:
CREATE USER graphite WITH PASSWORD 'password';
CREATE DATABASE graphite WITH OWNER graphite;
\q
Keep the database password; we will need it later.
Install Python, Django, and then Graphios itself:
apt-get install -y python2.6 python-pip python-cairo python-django python-django-tagging
pip install graphios
Edit the /etc/graphios/graphios.cfg file:
debug = False
enable_carbon = True
Create a spool directory for the exported statistics:
mkdir /var/spool/nagios/graphios/
chown -R nagios:nagios /var/spool/nagios
Add a test line (the _graphiteprefix variable) to a Nagios service definition:
define service {
    use                  generic-service
    host_name            DB
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
    _graphiteprefix      monitoring.nagios01.pingto
}
Run Graphios in test mode:
/usr/local/bin/graphios.py --spool-directory /var/spool/nagios/graphios --log-file /tmp/graphios.log --backend carbon --server 127.0.0.1:2004 --test
The output should contain records like:
monitoring.nagios01.pingto.DB.rta 0.248000 1461427743
monitoring.nagios01.pingto.DB.pl 0 1461427743
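Each record above has the form "metric path, value, Unix timestamp", and the path is assembled from the _graphiteprefix value, the host name, and the label of a performance-data item. The sketch below reproduces that mapping for a check_ping style performance string. It is a simplified illustration, not Graphios's actual code: the function name and the sample performance string are assumptions, and quoted labels containing spaces are not handled.

```python
import re

# Leading numeric part of a perfdata value, e.g. "0.248000" in "0.248000ms".
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def perfdata_to_metrics(prefix, host, perfdata, timestamp):
    """Turn 'label=value[unit];warn;crit;min;max ...' into metric lines."""
    lines = []
    for item in perfdata.split():
        label, separator, rest = item.partition("=")
        if not separator:
            continue  # not a label=value pair
        match = NUMBER.match(rest.split(";")[0])
        if match is None:
            continue  # value is missing or not numeric
        path = ".".join([prefix, host, label])
        lines.append("%s %s %d" % (path, match.group(0), timestamp))
    return lines


# Hypothetical perfdata string in the style of check_ping.
sample = "rta=0.248000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0"
metrics = perfdata_to_metrics("monitoring.nagios01.pingto", "DB", sample, 1461427743)
for line in metrics:
    print(line)
```

With the sample string this prints the same two records as the test run above, which makes it easy to predict the metric names that will later appear in Graphite and Grafana.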
If everything is OK, run the graphios daemon:
service graphios start
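Port 2004 in the test command is Carbon's default pickle receiver port. As far as the Graphite documentation describes it, a pickle message is a list of (path, (timestamp, value)) tuples, pickled with protocol 2 and prefixed with a 4-byte big-endian length. The sketch below builds such a message by hand, which can be useful for pushing a test value without Nagios. The function names are made up for this illustration, and the send function assumes Carbon listens on 127.0.0.1:2004.

```python
import pickle
import socket
import struct


def frame_pickle(metrics):
    """Build a Carbon pickle message from (path, value, timestamp) triples."""
    tuples = [(path, (timestamp, value)) for path, value, timestamp in metrics]
    payload = pickle.dumps(tuples, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte big-endian length
    return header + payload


def send_to_carbon(message, host="127.0.0.1", port=2004):
    """Send a framed message to Carbon (assumed address, not called here)."""
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(message)


message = frame_pickle([("monitoring.nagios01.pingto.DB.rta", 0.248, 1461427743)])
print(len(message))
```

Calling send_to_carbon(message) on the monitoring host should make the metric show up in Whisper after Carbon flushes it.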
Graphite must be installed strictly after Carbon, otherwise Nagios and Graphios will not be able to send data correctly.
apt-get install -y libapache2-mod-wsgi python-twisted python-memcache python-pysqlite2 python-simplejson
pip install whisper
pip install carbon
pip install graphite-web
pip install pytz
pip install pyparsing
wget https://raw.github.com/tmm1/graphite/master/examples/example-graphite-vhost.conf -O /etc/apache2/sites-available/graphite
Next, you need to slightly fix the new Apache2 config:
nano /etc/apache2/sites-available/graphite
Change "WSGISocketPrefix /etc/httpd/wsgi/" to:
WSGISocketPrefix /var/run/apache2/wsgi
Add another alias after the line "Alias /content/ /opt/graphite/webapp/content/":
Alias /static/ "/opt/graphite/static/"
Save, exit.
cd /opt/graphite/webapp/graphite
cp local_settings.py.example local_settings.py
nano local_settings.py
In the file that opens, enable the following lines and set their values:
Make up your own SECRET_KEY; the values for the DATABASES directive come from the database created earlier.
The WHISPER_DIR value can be found with the "locate whisper" command.
The values of the CARBONLINK_HOSTS directive must match the output of the "lsof -i -P | grep carbon" command.
SECRET_KEY = 'some_secret_key'
TIME_ZONE = 'Europe/Moscow'
WHISPER_DIR = '/var/lib/graphite/whisper'
USE_REMOTE_USER_AUTHENTICATION = True
DATABASES = {
    'default': {
        'NAME': 'graphite',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'graphite',
        'PASSWORD': 'password',
        'HOST': '127.0.0.1',
        'PORT': ''
    }
}
CARBONLINK_HOSTS = ["127.0.0.1:2003","127.0.0.1:2004","127.0.0.1:7002"]
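If you would rather not invent the SECRET_KEY by hand, one option is to generate it. This is a minimal sketch that assumes Python 3.6 or newer (for the secrets module), so it may need to run on a different machine than the Python 2 setup above. The alphabet and the length of 50 characters are arbitrary choices for this example.

```python
import secrets
import string

# Characters to draw from; quotes and backslashes are left out on purpose
# so the key can be pasted between single quotes without escaping.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*(-_=+)"


def make_secret_key(length=50):
    """Return a random key built with a cryptographically secure generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


key = make_secret_key()
print("SECRET_KEY = '%s'" % key)
```

Paste the printed line into local_settings.py in place of the placeholder value.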
During setup (the syncdb step), the system will ask you to create a superuser. Enter new credentials and remember them.
cd /opt/graphite/conf/
cp graphite.wsgi.example graphite.wsgi
cd /opt/graphite/webapp/graphite
python manage.py syncdb
chown -R www-data:www-data /opt/graphite/storage/
a2enmod wsgi
a2ensite graphite
python manage.py collectstatic --pythonpath=/opt/graphite/webapp
chown -R www-data:www-data /opt/graphite/static
/etc/init.d/apache2 restart
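Once Apache is back up, a quick way to check that Graphite actually serves data is its render API, which returns a series as JSON when called with format=json. The sketch below builds such a URL and picks the latest non-empty value from a response. The base address http://127.0.0.1 and the helper names are assumptions for this example. The fetch function needs a running Graphite and is not called here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base, target, period="-1h"):
    """Build a render API URL that returns JSON for one metric."""
    query = urlencode({"target": target, "from": period, "format": "json"})
    return "%s/render?%s" % (base.rstrip("/"), query)


def latest_value(series):
    """Return the newest (value, timestamp) pair that is not null."""
    for value, timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value, timestamp
    return None


def fetch(base, target):
    """Download and decode the series list (requires a running Graphite)."""
    with urlopen(render_url(base, target)) as response:
        return json.loads(response.read().decode("utf-8"))


url = render_url("http://127.0.0.1/", "monitoring.nagios01.pingto.DB.rta")
print(url)
```

An empty list in the response usually means that no data has reached Whisper yet under that metric name.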
The simplest part: if Graphite and Carbon are configured correctly, all that remains is to add a new data source of type Graphite and set up a dashboard to display the data. Grafana will do the rest!
wget https://grafanarel.s3.amazonaws.com/builds/grafana_3.0.0-beta51460725904_amd64.deb
sudo apt-get install -y adduser libfontconfig
sudo dpkg -i grafana_3.0.0-beta51460725904_amd64.deb
sudo service grafana-server start
sudo update-rc.d grafana-server defaults 95 10
The interface will be available on port 3000. The default login and password are both admin.
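The Graphite data source can be added through the web interface, but it can also be scripted through Grafana's HTTP API. The sketch below prepares a POST request to /api/datasources, assuming Grafana runs on 127.0.0.1:3000 with the default admin credentials and Graphite is served by Apache on 127.0.0.1. The data source name and the "proxy" access mode are illustrative choices, and the exact fields accepted may differ between Grafana versions.

```python
import base64
import json
from urllib.request import Request, urlopen


def datasource_request(grafana_url, graphite_url, user="admin", password="admin"):
    """Prepare (but do not send) a request that registers Graphite in Grafana."""
    body = {
        "name": "graphite",   # illustrative data source name
        "type": "graphite",
        "url": graphite_url,
        "access": "proxy",    # Grafana's backend queries Graphite itself
    }
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


request = datasource_request("http://127.0.0.1:3000", "http://127.0.0.1")
print(request.get_method(), request.full_url)
# To actually create the data source, send it with: urlopen(request)
```

Change the admin password after the first login, since the script above relies on the defaults only for the initial setup.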
As an alternative to visualization and passive emails, let's also send Nagios alerts to Slack.
1) Create a new channel in Slack, for example #alerts
2) Go to the Slack apps page
3) Find the Nagios app
4) Follow the instructions to download the script:
wget https://raw.github.com/tinyspeck/services-examples/master/nagios.pl
cp nagios.pl /usr/local/bin/slack_nagios.pl
chmod 755 /usr/local/bin/slack_nagios.pl
5) Copy the token and the Slack domain and paste them into the new script /usr/local/bin/slack_nagios.pl
6) Copy the Nagios directives and paste them into the appropriate places (the commands and the new contact):
define contactgroup {
    contactgroup_name  admins
    alias              Nagios Administrators
    members            root,slack
}

define contact {
    contact_name                   slack
    alias                          Slack
    service_notification_period    24x7
    host_notification_period       24x7
    service_notification_options   w,u,c,r
    host_notification_options      d,r
    service_notification_commands  notify-service-by-slack
    host_notification_commands     notify-host-by-slack
}

define command {
    command_name  notify-service-by-slack
    command_line  /usr/local/bin/slack_nagios.pl -field slack_channel=#alerts -field HOSTALIAS="$HOSTADDRESS$" -field SERVICEDESC="$SERVICEDESC$" -field SERVICESTATE="$SERVICESTATE$" -field SERVICEOUTPUT="$SERVICEOUTPUT$ ($LONGDATETIME$)" -field NOTIFICATIONTYPE="$NOTIFICATIONTYPE$"
}

define command {
    command_name  notify-host-by-slack
    command_line  /usr/local/bin/slack_nagios.pl -field slack_channel=#alerts -field HOSTALIAS="$HOSTADDRESS$" -field HOSTSTATE="$HOSTSTATE$" -field HOSTOUTPUT="$HOSTOUTPUT$ ($LONGDATETIME$)" -field NOTIFICATIONTYPE="$NOTIFICATIONTYPE$"
}
7) Save, reload Nagios, check.
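The command_line values in the directives are templates: before running the script, Nagios replaces each $MACRO$ with the current value for the host or service. The sketch below imitates that substitution so you can preview what the Slack script will receive. It is an illustration only, not how Nagios is implemented. The sample address and state are made-up values, and unknown macros are simply left untouched here.

```python
import re

# A macro is an upper-case name wrapped in dollar signs, e.g. $HOSTSTATE$.
MACRO = re.compile(r"\$([A-Z0-9_]+)\$")


def expand_macros(command_line, macros):
    """Replace every known $NAME$ in the template with its value."""
    return MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), command_line)


template = (
    '/usr/local/bin/slack_nagios.pl -field slack_channel=#alerts '
    '-field HOSTALIAS="$HOSTADDRESS$" -field HOSTSTATE="$HOSTSTATE$"'
)
# Hypothetical values that Nagios would supply at notification time.
command = expand_macros(template, {"HOSTADDRESS": "10.0.0.5", "HOSTSTATE": "DOWN"})
print(command)
```

Running the expanded command by hand as the nagios user is a simple way to check the token and channel settings before waiting for a real alert.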
» How To Configure StatsD to Collect Arbitrary Stats for Graphite on Ubuntu 14.04
» How To Install and Use Graphite on an Ubuntu 14.04 Server
» https://github.com/shawn-sterling/graphios
» http://grafana.org/features/#graphite
Source: https://habr.com/ru/post/308750/