Some time ago my colleague and I were handed an essentially simple task: put about five hundred identical Linux terminals, scattered across the country, under Zabbix monitoring. The terminals all belonged to one network, 10.0.0.0/8. The problem looked completely trivial: knock together a template, start autodiscovery, have every discovered host added to a group automatically, and roll the template out to them. As easy as brewing tea from a bag. We rolled up our sleeves and got down to business...
The birth of the problem
Naturally, nothing worked (in full accordance with the laws of the genre). We made a template, started autodiscovery, and went off to enjoy the weekend, because it was Friday. When we returned on Monday, we found that over the weekend our Zabbix had "autodiscovered" exactly one host. Having grieved over this, we began to think about the reasons. And then a simple thought struck us: how big is a class A subnet, actually?
The calculator helpfully told us that a class A subnet contains almost 17 million addresses. Our electronic accountant further clarified that if Zabbix spends 3 seconds dealing with a single address (network lag, the address being polled obviously not belonging to one of our terminals, and so on), then one pass over the class A subnet will take it about a year and a half. One pass, I emphasize. The process can be run in parallel, for example by checking addresses in batches of 20. But even then one pass over the class A subnet takes about a month. And one pass is obviously not enough: a terminal may be switched off or unreachable at the very moment Zabbix tries to poll it, so to be sure the network would have to be scanned three or four more times. In short, we suddenly realized that the standard Zabbix autodiscovery tools work as designed, but they are absolutely not meant for surveying large networks.
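The arithmetic behind these estimates is easy to reproduce. The sketch below assumes, as the text does, 3 seconds per address and 20 parallel checks; the constant names are illustrative only.

```python
# Back-of-the-envelope estimate of one discovery pass over 10.0.0.0/8.
import ipaddress

SECONDS_PER_ADDRESS = 3   # pessimistic cost per address, taken from the text
PARALLEL_CHECKS = 20      # addresses checked at the same time, taken from the text
SECONDS_PER_DAY = 86_400

network = ipaddress.ip_network("10.0.0.0/8")
addresses = network.num_addresses  # 2**24 addresses in a /8

sequential_days = addresses * SECONDS_PER_ADDRESS / SECONDS_PER_DAY
parallel_days = sequential_days / PARALLEL_CHECKS

print(f"addresses:  {addresses:,}")
print(f"sequential: {sequential_days:.0f} days ({sequential_days / 365:.1f} years)")
print(f"parallel:   {parallel_days:.0f} days")
```

One sequential pass comes out at roughly 580 days, and a pass with 20 parallel checks at roughly 29 days, which matches the "year and a half" and "about a month" figures.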
What to do? © Chernyshevsky
Add the hosts to Zabbix by hand? That would mean a shameful surrender in the face of an essentially simple problem.
The Internet told us that we were not the first to face such a task.
Someone had asked a similar question on the official Zabbix forum and received the answer that is, alas, traditional for forums of this kind: a proposal of the most complex and inefficient way. Since we had no time to study the Zabbix API, we decided to try our luck with the database once again.
The solution algorithm
The Zabbix database has a table, hosts, that lists all monitored hosts with their parameters. Each host in this table has a unique identifier, an integer. Only a few parameters are needed to put a host under monitoring: its identifier in the database, its IP address, and its name in Zabbix.
The approach is simple. First, we find out the database index of the most recently added host:
select hostid from zabbix.hosts order by hostid desc limit 1;
Here and below the queries are given for MySQL, which does not affect the essence. Increase the resulting value by one, and the database index for the new host is ready. We know the host name and its IP address, so we can add the host to Zabbix:
insert into zabbix.hosts (hostid,host,useip,ip) values ($hostid,'$hostname',1,'$IP');
where:
- $hostid is the new database index obtained earlier;
- $hostname is the name of the host;
- $IP is its IP address.
The query above assumes that we monitor hosts by IP address rather than by DNS name. For the second option the query does not change fundamentally; only one more field is added.
So the host is added, but that is not enough: every host must belong to some group. This can also be done directly in the database. Suppose, for definiteness, that we want to put our terminals into the Linux servers group.
All groups are stored in the groups table (trivial, isn't it?). We need the index of our group, which a simple query returns:
select groupid from zabbix.groups where name='Linux servers';
Host membership in groups is recorded in a dedicated table, hosts_groups. It stores simple mappings: the host with such-and-such index belongs to the group with such-and-such index. Each host has exactly as many mappings as the number of groups it belongs to. And each mapping, as usual, has its own unique index, an integer. Find the index of the last mapping:
select hostgroupid from zabbix.hosts_groups order by hostgroupid desc limit 1;
Increase the resulting value by one, and this is the new mapping index for our newly added host. We know the host index and we have found the group index, so what stops us from adding the host to the group?
insert into zabbix.hosts_groups (hostgroupid,hostid,groupid) values ($hostgroupid,$hostid,$groupid);
where:
- $hostgroupid is the new mapping index;
- $hostid is the index of our new host (we used it above to add the host itself);
- $groupid is the index of the group we need (we found it above).
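The two inserts can be generated for a whole list of terminals at once. The following is a minimal sketch rather than the script we actually used: the function names `sql_quote` and `build_statements` and all the sample values are hypothetical, and the last host index, the last mapping index, and the group index are assumed to have been read beforehand with the select queries shown above.

```python
def sql_quote(value: str) -> str:
    """Quote a string as a MySQL literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_statements(terminals, last_hostid, last_hostgroupid, groupid):
    """Return the insert statements for a list of (hostname, ip) pairs.

    Indexes are assigned as "last index + 1, + 2, ...", as in the text.
    """
    statements = []
    for offset, (hostname, ip) in enumerate(terminals, start=1):
        hostid = last_hostid + offset
        hostgroupid = last_hostgroupid + offset
        statements.append(
            "insert into zabbix.hosts (hostid,host,useip,ip) "
            f"values ({hostid},{sql_quote(hostname)},1,{sql_quote(ip)});"
        )
        statements.append(
            "insert into zabbix.hosts_groups (hostgroupid,hostid,groupid) "
            f"values ({hostgroupid},{hostid},{groupid});"
        )
    return statements


# Hypothetical sample data: two terminals and made-up index values.
sample_terminals = [("term-001", "10.1.2.3"), ("term-002", "10.7.0.15")]
for statement in build_statements(sample_terminals, 10050, 120, 2):
    print(statement)
```

The output is a plain SQL script that can be reviewed by eye before it is fed to the mysql client, which is safer than writing to the Zabbix database straight from the program.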
After all these manipulations we can open the Zabbix web interface and find our host in the Linux servers group. Looking at its parameters, we see that no data items, triggers, or graphs are associated with the host: the host exists and Zabbix knows about it, but does not monitor it, because no data item has been defined for it. We apply the previously created template to the host, and it's in the bag! Zabbix itself attaches the data items and triggers to our host and starts monitoring.
As a result, our problem is solved by a combined approach:
- database manipulations add all our terminals to Zabbix;
- using the host mass update function, we attach the corresponding template to the added terminals.
P.S. Throughout the article, the host name means the name under which the host appears in Zabbix. It is not tied to the host's DNS name (although nothing prevents them from matching). If you use the monitoring agent, the host name is the value of the Hostname parameter in its configuration file.
Instead of an epilogue
The failure of previous generations to manipulate the database evidently comes from the fact that linking a template to a host in the database is not the same as applying that template from the Zabbix interface. Apparently, applying a template creates mappings between the host and its data items, the host and its triggers, and so on, and the structure of those tables is already too complex.
I am not giving a ready-made script here. First, because I don't have one: the practical part was done by my colleague in Perl, by parsing a text file with the names and IP addresses of the terminals and generating an SQL script from it. I would most likely have done everything in Ruby. Another colleague of mine said that an ordinary shell script would be more than enough here. In any case, the algorithm is simple and can be implemented however you please.
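For illustration, here is how the parsing step might look in Python. The file format is an assumption (one "name ip" pair per line, with "#" starting a comment), since the layout of the original text file is not described, and `parse_terminals` is a hypothetical name.

```python
import ipaddress


def parse_terminals(lines):
    """Parse lines of the assumed form 'name ip' into (name, ip) tuples.

    Blank lines and '#' comments are skipped; a malformed line or an
    invalid IP address raises ValueError.
    """
    terminals = []
    for number, raw in enumerate(lines, start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 'name ip', got {raw!r}")
        name, ip = parts
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
        terminals.append((name, ip))
    return terminals


# Hypothetical sample input.
sample = """\
# name      ip
term-001    10.1.2.3
term-002    10.7.0.15
"""
print(parse_terminals(sample.splitlines()))
```

Validating the addresses up front means a typo in the list is caught before anything is written to the database.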
And finally... When we added our terminals, we overlooked one thing: for Zabbix, the number of monitored data items suddenly jumped from 80 to 4000. The poor animal was stunned by this turn of events and collapsed. So I recommend disabling the hosts in Zabbix before such a mass addition, if only out of humanity. And, of course, after the hosts are added and the template is applied, Zabbix will need some time to take in and digest the heap of objects now under its control, so be patient. Still, that time is orders of magnitude shorter than the time needed to scan a class A subnet.