
SMS notifications from Nagios via clickatell.com, and a site monitor in Bash


Good afternoon, comrades.


Like everyone who deals with remote systems, I needed to monitor a lot of machines and the services running on them. After going through the descriptions and manuals of several programs, I settled on Nagios. The many articles and examples about it and its very rich configuration turned out to be just what I needed. So I decided to share a few points about implementing and configuring a homemade plug-in written in Bash and an SMS notification system that uses clickatell.com.

Why clickatell.com? Simply because it is already used in our organization, so I decided to use it for this task as well.
There are plenty of articles on how to configure Nagios, so I will not dwell on that and will go straight to what I wrote and how it works.

Message sending service


After looking at the clickatell.com API, it became much clearer how everything is implemented there, and since Bash is the only language I know, I decided to write the sender in it.
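Bash code for sending an SMS
#!/bin/bash
# $1 is the message text, $2 is the recipient's phone number
text1=$1
nombertel=$2
# Authenticate and save the response, which carries the session id
wget --post-data "api_id=111111&user=User1&password=pass1" api.clickatell.com/http/auth --quiet -O - > /tmp/sms_tmp_id
# The session id is the second space-separated field of the response
session_id=$(cut -f2 -d " " /tmp/sms_tmp_id)
# Send the message using the session id
wget --post-data "session_id=$session_id&to=$nombertel&text=$text1" api.clickatell.com/http/sendmsg --quiet -O -
rm /tmp/sms_tmp_id
exit 0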


The first wget --post-data call sends the data needed to generate a session id: api_id is the id of the service used for sending SMS, user is the clickatell.com login, and password is, accordingly, the password. The clickatell.com API has a way to check an existing session_id, but I took the simple route and just generate a new one every time. Alas, the clickatell.com API documentation can be viewed only after registering on their website.
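
The script above takes the second field of the auth response without checking whether authentication succeeded. As a rough sketch, and assuming the response looks like `OK: <session id>` on success and `ERR: <details>` on failure (the `cut` call suggests the first form; the helper name and the sample strings are made up for illustration), the check could look like this:

```shell
#!/bin/bash
# Hypothetical helper: extract the session id from an auth response.
# Assumption: success looks like "OK: <session id>", anything else is an error.
parse_session_id() {
    local response=$1
    case $response in
        "OK: "*)
            # Strip the "OK: " prefix and print what is left
            printf '%s\n' "${response#OK: }"
            ;;
        *)
            printf 'auth failed: %s\n' "$response" >&2
            return 1
            ;;
    esac
}

# Canned responses, so no network access is needed to try it
parse_session_id "OK: 0123456789abcdef"
parse_session_id "ERR: 001, Authentication failed" || echo "no session"
```

With a check like this the sending step can be skipped when no session id was obtained, instead of posting an empty session_id.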

The second wget --post-data call sends the text to the specified number: session_id is the session identifier, to is the phone number the message is sent to, and text is the message text. Testing showed that everything works, but there is one catch: the maximum message length is 70 characters. So, unlike the email notifications, we send only the most important information.
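
To stay within that limit regardless of what Nagios puts into the message, the text can be cut before sending. This is a minimal sketch; the function name is hypothetical, and the count is in characters only when the shell runs in a locale that matches the text encoding:

```shell
#!/bin/bash
# Hypothetical helper: cut a message to at most $2 characters (70 by default).
truncate_sms() {
    local text=$1 limit=${2:-70}
    # Substring expansion: start at offset 0, keep at most $limit characters
    printf '%s\n' "${text:0:limit}"
}

# A 100-character test message, shortened to the limit
long_message=$(printf 'x%.0s' {1..100})
short_message=$(truncate_sms "$long_message")
echo "${#long_message} -> ${#short_message}"
```

In the sending script this would be applied to the first argument before it goes into the text parameter.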

Here are the commands responsible for sending messages from Nagios (one is responsible for hosts, the other for services).

nagios sms-host, sms-services
define command {
    command_name sms-host
    command_line /usr/bin/sms_get_number.sh "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTADDRESS1$
}
define command {
    command_name sms-services
    command_line /usr/bin/sms_get_number.sh "$SERVICEDESC$ Host: $HOSTALIAS$ $SERVICEOUTPUT$" $CONTACTADDRESS1$
}
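
The messages built from these macros contain spaces, asterisks and colons, and the plugin output may contain `&` or `=`, which would break a form-encoded POST body. Below is a sketch of percent-encoding the text in plain Bash before it is passed to wget. The function name is hypothetical, and only ASCII input is considered here:

```shell
#!/bin/bash
# Hypothetical helper: percent-encode a string for a form-encoded POST body.
# ASCII-only sketch; non-ASCII text is not handled here.
urlencode() {
    local LC_ALL=C          # make the ranges below match ASCII characters only
    local s=$1 out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;                   # unreserved, keep as is
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;     # everything else as %XX
        esac
    done
    printf '%s\n' "$out"
}

urlencode "** PROBLEM Host Alert: web1 is DOWN **"
```

The encoded value would then replace $text1 in the second wget call.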


And this is the config of the contact who receives the notifications.

nagios contact_name sms
define contact {
    contact_name mobile
    alias mobile
    service_notification_period noworktime
    host_notification_period noworktime
    service_notification_options w,u,c
    host_notification_options d
    service_notification_commands sms-services ; name of the command described above
    host_notification_commands sms-host ; name of the command described above
    address1 +79111111111
}


With this setup we receive notifications about failed services and hosts during the time period described as noworktime.

In case anyone is interested in the noworktime config: it covers non-working hours and weekends, so that the phone is not flooded with messages needlessly, because on a working day everything is sent by email anyway.

nagios timeperiod
define timeperiod {
    timeperiod_name noworktime
    alias no work region SPB
    sunday 00:00-24:00
    monday 00:00-05:30,14:30-24:00
    tuesday 00:00-05:30,14:30-24:00
    wednesday 00:00-05:30,14:30-24:00
    thursday 00:00-05:30,14:30-24:00
    friday 00:00-05:30,14:30-24:00
    saturday 00:00-24:00
}
# The time on the server is offset by a couple of hours, hence these odd numbers.
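
To double-check which moments fall inside noworktime, the same ranges can be written out as a small Bash function. This is only an illustration of the schedule, not something Nagios needs; the function name is hypothetical, and the range ends are treated as exclusive, which is an assumption:

```shell
#!/bin/bash
# Hypothetical helper mirroring the noworktime ranges.
# $1 = weekday name in lowercase, $2 = server time as HH:MM
in_noworktime() {
    local day=$1 hhmm=$2
    # Minutes since midnight; 10# forces base 10 so "08" is not read as octal
    local minutes=$(( 10#${hhmm%%:*} * 60 + 10#${hhmm##*:} ))
    case $day in
        saturday|sunday)
            return 0                                   # 00:00-24:00
            ;;
        monday|tuesday|wednesday|thursday|friday)
            # 00:00-05:30 (330 min) or 14:30-24:00 (870 min and later)
            (( minutes < 330 || minutes >= 870 ))
            ;;
        *)
            return 2                                   # unknown day name
            ;;
    esac
}

in_noworktime monday 03:15 && echo "monday 03:15: SMS"
in_noworktime monday 09:00 || echo "monday 09:00: email only"
in_noworktime sunday 12:00 && echo "sunday 12:00: SMS"
```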


And yes, I almost forgot: I also wanted to share the script that reports errors from a given web resource.

Web site error monitoring service


This service was needed to react quickly when a site goes down. Nagios has a web server check, but it does not suit me because it checks the server as a whole and does not return error codes. A single server can host many resources, so we write our own check.

So here is the web resource diagnostic script itself, /usr/bin/check_http_error.sh.
It takes a specific site as its argument (a plain IP address works too).

script check http


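#!/bin/bash
# $1 is the site to check (URL or IP address)
site1=$1
exitservice=3   # UNKNOWN until the check has run
# Fetch the page; wget writes its log, including any error line, to stderr
wget "$site1" -O /tmp/check_site_error 2> /tmp/error_service.tmp
if [ $? -ne 0 ]
then
    # Report the "ERROR ..." text from the wget log and exit as CRITICAL
    outmes=$(grep -o "ERROR.*" /tmp/error_service.tmp)
    echo "$outmes $site1"
    exitservice=2
else
    outmes="OK $site1"
    echo "$outmes"
    exitservice=0
fi
rm /tmp/error_service.tmp
rm /tmp/check_site_error
exit $exitservice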
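
The grep step can be tried without touching the network. The sketch below feeds a made-up wget log into the same `ERROR.*` pattern and adds a fallback text for failures where wget's log contains no "ERROR" fragment, such as a host name that cannot be resolved. The function name, the sample log and the fallback text are all illustrative:

```shell
#!/bin/bash
# Made-up wget log for a missing page, for illustration only
sample_log='--2012-12-10 12:00:00--  http://example.com/missing
HTTP request sent, awaiting response... 404 Not Found
2012-12-10 12:00:00 ERROR 404: Not Found.'

# Hypothetical helper: read a wget log on stdin and print the first
# "ERROR ..." fragment, or a fallback text if there is none.
extract_error() {
    local line
    line=$(grep -o 'ERROR.*' | head -n 1)
    printf '%s\n' "${line:-ERROR (no details in wget output)}"
}

extract_error <<< "$sample_log"
extract_error <<< "wget: unable to resolve host address"
```

Without such a fallback the notification for that kind of failure would carry only the site name.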


And this is the Nagios command definition that uses the script /usr/bin/check_http_error.sh.

nagios check http
define command {
    command_name check_site
    command_line /usr/bin/check_http_error.sh $ARG1$
}


And here is the description of the service itself.
Alas, each site has to be described as a separate service because of its specifics, and attached to localhost.

nagios service
define service {
    name check_site
    use base-service
    service_description check_site
    check_command check_site!http://mai.ru
    host_name localhost
}
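
Since every site needs its own block, the definitions can be generated rather than typed by hand. This is a sketch only: the function name and the second URL are made up, and putting the URL into service_description to keep the descriptions unique on localhost is my assumption, not part of the original config:

```shell
#!/bin/bash
# Hypothetical generator: print one service definition per site URL.
make_service() {
    local url=$1
    cat <<EOF
define service {
    use base-service
    service_description check_site $url
    check_command check_site!$url
    host_name localhost
}
EOF
}

# Example list of sites; the output can be redirected into a Nagios config file
for url in http://mai.ru http://example.com; do
    make_service "$url"
done
```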


Well, that is basically it: if there is a problem with a site, you receive the error code and the name of the affected resource. It helps me spot DDoS attacks and intrusions in time.

And yes, if there is interest, I can also publish an article about a Nagios service that reports free disk space as a percentage via SNMP.

That is all from me, thank you for your attention.

Source: https://habr.com/ru/post/162091/

