NetXMS as a monitoring system for the lazy ... and a bit of comparison with Zabbix

0. Intro

I looked for articles about NetXMS on Habr and found none. That is the only reason I decided to write this piece: to draw some attention to the system.

This is part tutorial, part how-to, and part superficial overview of the system's capabilities. I did not dig deep into its features, for a number of reasons, and a description of all the functionality would take far more than one article.

Since Zabbix is more or less the only system I have worked with, the article will often refer to it for comparison. Besides, it is customary to compare everything with something generally accepted.

1. What and why?

A monitoring system, as the name implies, is a system that lets you obtain up-to-date information at any time about any network node, machine, router, or whatever else you need.

Why this is needed is just as obvious. To be aware of what is happening.

A monitoring system can often provide very comprehensive information about the state of the infrastructure as a whole and of its individual parts. Add alerts (without which any monitoring is worthless), and you get a serious tool that lets you not only respond to emergencies in time but, in most cases, prevent them.

2. Why NetXMS?

A bit of off-topic for those who like walls of text

When I was young and inexperienced... Isn't that how all fairy tales begin? In short, back when I was a maximalist, I studied infrastructure monitoring, and those “studies” led me to conclude that nothing beats Zabbix. At the time I really liked it, because it can do a great deal and its functionality can be extended with hand-written scripts, so you could say it can do anything if your hands grow from the right place. But Zabbix is a rather complicated tool. It took me almost two weeks just to get to grips with it, and that was only the rollout stage. Then came another couple of weeks of setting up monitoring for everything I needed back then. As for using it to watch port states on routers, I read and read, got scared, and decided I did not really need it... No, really, it is scary for an unprepared mind fresh out of university...

Now, a few years later and somewhat more mature, I decided it was time to refresh my knowledge of monitoring systems and finally set up proper, stable network monitoring, since the problem had become pressing with the growth of network segments, the connection of a branch office, and other such trifles. Everything flows, everything changes. So I began testing all sorts of tools: install one system, poke at it, then another. Everywhere something was missing or something put me off: an inconvenient interface here, endless tuning there, something else again. And then NetXMS turned up. By the way, where Zabbix took me two weeks to figure out, with NetXMS everything was up and running in a couple of hours. Perhaps my experience with Zabbix helped, or perhaps it is down to NetXMS itself... hard to say.

NetXMS was pretty easy to install and configure. It has server and administration console builds for both Windows and Linux, graphing, a built-in “browser” that lets you open the HTTP/HTTPS interfaces of your routers from within the administration console, an intuitive interface, and an agent for any OS. In short, the system is simple and convenient. At first sight.

Their documentation, by the way, rates about 8 out of 10. It covers the main things, but to understand the subtleties you need to dig deeper.

The main advantages indicated by the developer

Deep network monitoring
- Automatic discovery, visualization, and search of connected components at layers 2 and 3
- Full SNMPv3 support
- Active discovery with scanning probes
- Passive discovery based on information from monitored devices: ARP and routing tables, interfaces

Monitoring applications and servers
- All the basic metrics you expect: CPU, file systems, I/O, memory, traffic
- JMX bridge for monitoring Java applications
- Application-specific extensions: Oracle, MySQL, PostgreSQL, MongoDB, DB2, Tuxedo, and many others
- Integration API for custom applications

Designed for large networks
- One server can track hundreds of metrics across thousands of devices
- Full support for distributed monitoring and horizontal scaling
- Monitoring of overlapping IP subnets
- Flexible access control for operators and customers

3. Digging deeper

I will not dwell on installation: there is a whole admin manual on the developer's website, and nothing about it is complicated. The hardest part is connecting the system to the database... The only differences come from the choice of DBMS and server OS. I chose Windows (so everything below refers to a Windows setup) and MSSQL, since I already had it.

3.1. Enable Autodiscover

Automatic node discovery can be enabled during installation. If for some reason it was not, you can always turn it on under “Configuration-Network Discovery” or by clicking the icon of the same name. The discovery properties then appear in the workspace. Enable discovery by switching it to passive or active mode, and set the other properties, for example the subnets to poll. You can also set up filters here: either pick and tune the built-in ones or write your own scripts. Honestly, I did not need filters, but the settings look friendly and understandable.
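To make the idea of a polled subnet plus a filter more concrete, here is a minimal Python sketch. It only illustrates which addresses an active scan of a subnet would cover; the function name `discovery_targets` and the exclusion-list style of filter are assumptions made for this example and are not taken from NetXMS.

```python
import ipaddress


def discovery_targets(subnet, exclude=()):
    """List the host addresses of `subnet`, skipping excluded ranges.

    `exclude` is a hypothetical filter: an iterable of networks
    (in CIDR notation) whose addresses should not be probed.
    """
    net = ipaddress.ip_network(subnet)
    skipped = [ipaddress.ip_network(e) for e in exclude]
    return [
        str(host)
        for host in net.hosts()
        if not any(host in s for s in skipped)
    ]
```

For instance, `discovery_targets("192.168.1.0/24", exclude=["192.168.1.128/25"])` would leave only the lower half of the subnet to be probed.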

You can also specify the default SNMP community and credentials here (for read access).

3.2. Adding Nodes

When autodiscovery is enabled, end nodes (PCs, printers) will be added and distributed automatically across subnets. With active network equipment, things are a bit more complicated.

Start by enabling SNMP on the network equipment and setting up the community. If the default SNMP community was not changed in the discovery properties, the system uses the public community; this setting can also be changed per node in the properties of the node being added.

For convenience, you can group devices by any principle you like: create containers in the Infrastructure Services branch of the tree and bind nodes from Entire Network to them.

Once SNMP is enabled and configured on the routers and they are added to the list of monitored equipment, the system contacts them, polls them, determines what the devices are, applies templates, starts collecting data, and shows a front panel for each router. By itself. Out of the box. When I was setting up Zabbix, I could only dream of such a thing.

I added all my network hardware manually, because for some reason it did not show up on its own. It is simple. Create a container in Infrastructure Services with any convenient name (for example, switches), right-click it and create a node, then specify the node's display name and IP address. It is best to check the "disable NetXMS agent use" box, since the agent cannot be installed on network equipment and there is no point in sending needless packets.

Built-in browser. You can simply right-click the desired router and open the device's web interface. I found this more convenient than digging through my notes for the right IP every time. True, the browser itself is so-so.

3.3. Network mapping

First of all, you need to create a network map as an object. As usual: right-click the Network Maps branch, choose create, select a map type, set the parameters, and the system will do the rest on its own.

Map types are as follows:

- Custom. This map is built manually, as the user sees fit.
- Layer 2 topology. A map built automatically for the user at OSI layer 2 (in effect, based on the MAC addresses of devices).
- IP topology, also known as Layer 3. The same as Layer 2, except that the map is built at OSI layer 3 (IP addresses).

When choosing manual mapping, you need to drag the nodes from the tree onto the map layer, connect them, etc. Normal drawing.

In the other two cases, you need to choose a starting point from which the system will begin building the map, for example some router's node. The system reads from it the data about connected devices and the ports they are connected to, then reads the data from the next router, maps it, and so on. After analyzing all the devices, NetXMS itself draws the links between network devices and labels the ports they connect through. You can also have the system place end devices (printers, PCs, servers) on the map by ticking the appropriate item in the map properties (right click - properties - map properties). The same page lets you set a "discovery radius" if desired or necessary.
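Building a map outward from a starting node can be pictured as a breadth-first walk over neighbor tables. The Python sketch below illustrates that idea only and is not NetXMS's actual algorithm; the `NEIGHBORS` dictionary, the device names, and the `build_map` function are all hypothetical stand-ins for the port and neighbor data a real system would read from each device.

```python
from collections import deque

# Hypothetical neighbor data: node -> {local port: neighbor node}.
NEIGHBORS = {
    "core-sw": {"gi0/1": "floor1-sw", "gi0/2": "floor2-sw"},
    "floor1-sw": {"gi0/24": "core-sw", "gi0/1": "printer-1"},
    "floor2-sw": {"gi0/24": "core-sw", "gi0/1": "pc-17"},
    "printer-1": {},
    "pc-17": {},
}


def build_map(seed, radius=None):
    """Return (nodes, links) reachable from `seed` within `radius` hops.

    radius=None means no limit. Links are (node, port, neighbor) tuples.
    """
    depth = {seed: 0}
    links = []
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if radius is not None and depth[node] >= radius:
            continue  # nodes on the edge of the radius are not expanded
        for port, peer in sorted(NEIGHBORS.get(node, {}).items()):
            links.append((node, port, peer))
            if peer not in depth:
                depth[peer] = depth[node] + 1
                queue.append(peer)
    return sorted(depth), links
```

With `radius=1` only the starting switch and its direct neighbors end up on the map; without a radius the walk continues until every reachable device has been visited.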

A small recommendation: turn on the Always fit layout to screen property so that you do not have to scroll around the workspace and fiddle with the zoom every time the network map is updated.

With a geographically distributed network, you can overlay objects on a terrain map. You can also use, for example, a photo of an equipment rack in the server room as the background and place the relevant objects on it, so that the status of any device in the rack can be seen at a glance. In my opinion, this is extremely useful when you need to hand things over to a successor or explain the setup to a newly arrived colleague. Visual, convenient, self-explanatory.

3.4. Agents

Agents exist for both Windows and Linux. They work on the same principle as Zabbix agents: the agent is installed on the machine you want to collect metrics from and transmits the data to the monitoring server. It can be installed either before or after the node is added to the system. Some time after the agent appears on the node, the system starts receiving data from it, applying the standard template. You can add monitored parameters either to a single node or by attaching one or more predefined templates to it. Working with templates is recommended, of course, since that setup is more flexible.

The list of monitored parameters for agents is quite extensive. The full list can be viewed on the NetXMS Wiki.

As soon as the system determines that an agent is functioning on the node, it will immediately attach this node to one of the default templates - Windows, Linux, HP-UX, Generic UNIX, AIX , depending on the node's operating system.

[Screenshot: standard template for Windows hosts]

You can add a parameter to monitor on a specific node by choosing "right-click - Data Collection Parameter", then "right-click in the workspace - New parameter", and selecting the desired one from the list. Triggers for the monitored parameter can also be configured here.

Importantly, the agents, like Zabbix's, can run custom scripts.

How to install?

This is off-topic, since everyone does it... hmm... however they like.

The question here is whether to deploy agents only to servers or to every machine on the network. If it is servers only and there are few of them, doing it by hand is easiest. If there are many, use domain policies, KIX, PowerShell, anything. The agent installer has a /SILENT key that lets you “inject” it invisibly to the user (if we are talking about deployment to every machine on the network). I prefer scripts, since you can set a condition and create a flag, unlike policies, which will install the program every time the PC is turned on. So we write a script along the lines of “check whether the flag file exists; if not, run the installer with the /SILENT key and, after a successful installation, create the flag file”. Profit.
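The flag-file script described above could look roughly like the Python sketch below. Only the /SILENT key comes from the article; the function name `install_once`, the installer path, and the flag location are placeholders invented for this example, and a real deployment would more likely use whatever scripting tool is already in place.

```python
import subprocess
from pathlib import Path


def install_once(installer, flag_file, run=subprocess.run):
    """Run the installer silently unless the flag file already exists.

    Returns True if the installer ran successfully, False if it was skipped.
    `run` is injectable so the logic can be tried without a real installer.
    """
    flag = Path(flag_file)
    if flag.exists():
        return False  # the agent was already installed on this machine
    # /SILENT is the key mentioned in the article; the path is a placeholder.
    result = run([str(installer), "/SILENT"], check=False)
    if result.returncode != 0:
        raise RuntimeError(f"installer exited with code {result.returncode}")
    # Create the flag only after a successful installation.
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.write_text("installed\n")
    return True
```

A logon script might then call something like `install_once(r"\\fileserver\share\agent-setup.exe", r"C:\ProgramData\agent.flag")`, where both paths are hypothetical.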

3.5. Triggers

Triggers can be configured to respond to specific events in the collected data, for example the ping response time of a router exceeding a certain value. The alert method is also configured per trigger (for some an icon in the console is enough, while others should send alerts through every available channel). The reaction to the event is chosen from those set up in Action Configuration.

To create a trigger, open the Data Collection Configuration of a node or template, double-click the parameter of interest (create it first if it does not exist), go to the Thresholds tab in its properties, and click the add button.

"Function" - which value or values are used to evaluate the condition. It can be one of the following (the number of values used is set in the Samples field):
- Last polled value: the last received value
- Average value: the average of the last values
- Mean deviation: the mean deviation
- Diff with previous value: the difference from the previous value
- Data collection error: an error while getting data
- Sum of values: the sum of the values
- Script: a value produced by a user script

"Operation" - the condition that fires the trigger. It can be one of the following (the value to compare against is set in the Value field):
- <: less than
- <=: less than or equal to
- ==: equal to
- >=: greater than or equal to
- >: greater than
- !=: not equal to
- like: similar to
- not like: not similar to
As for the last two operations, I am not sure how they work, so I will not even guess what bounds are built into them.

In short, everything is the same as in Zabbix, except that you do not need to type formulas by hand, afraid of making a mistake or a typo. For trigger recovery (the deactivation event), the system fills in the value itself. (In Zabbix, again, you have to type it in without typos.)
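The combination of Function, Samples, Operation, and Value can be read as "apply the function to the last few values, then compare the result with the value". Here is a minimal Python sketch of that reading; the names `FUNCTIONS`, `OPERATIONS`, and `threshold_reached` are made up for the example, it is not how NetXMS implements thresholds, and it covers only some of the options (mean deviation, data collection error, script, like, and not like are left out).

```python
import operator
from statistics import mean

# Simplified stand-ins for some of the "Function" choices.
FUNCTIONS = {
    "last": lambda values: values[-1],
    "average": mean,
    "sum": sum,
    "diff": lambda values: values[-1] - values[-2],
}

# The comparison operators from the "Operation" list.
OPERATIONS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
    "!=": operator.ne,
}


def threshold_reached(history, function, samples, operation, value):
    """Apply `function` to the newest values and compare with `value`.

    `samples` must be at least 1; "diff" always uses the last two values.
    """
    needed = 2 if function == "diff" else samples
    if len(history) < needed:
        return False  # not enough data collected yet
    window = history[-needed:]
    return OPERATIONS[operation](FUNCTIONS[function](window), value)
```

For example, with ping times `[12, 15, 240, 260, 250]`, the condition "average of 3 samples > 200" holds, while "average of 5 samples > 200" does not, which is why a larger Samples value smooths out short spikes.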

3.6. Templates

Right-click, create a template, and enter its name. Then right-click the template and choose Data Collection Configuration.

Some theory

The collected information can come from several sources: a NetXMS agent running on the machine, a simple request made by the server itself ("Internal", for example ping), "SNMP", "Check Point SNMP", "Push", "Windows Performance Counters", "SM-CLP", or a custom "script". For each item you can select the data type the system should expect. For example, ping returns an integer, the amount of free disk space in bytes is also an integer, but a temperature is no longer an integer but a real (floating-point) number.

For collecting information via SNMP, NetXMS has a huge list of predefined parameters: instead of typing in a specific OID you can look it up in the MIB tree, although that is actually quite a problem in practice... In the end you have to dig through the manufacturer's documentation and look up OID values for each device separately. You can burn a lot of time setting up data collection from devices, running into the UNSUPPORTED message from the system, only to find out that this particular device simply cannot provide the requested information. All that remains is to study the manuals ~~and hope~~.

For example, to collect a node's response time (ping), add a new parameter (right-click - new parameter), select Internal in the Origin field, press the Select button, and choose Ping Time. There is a search here, of the kind I could only dream of in Zabbix. It also makes sense to change the polling interval (the default is 60 seconds, which means that if the router stops responding right after being polled, almost a full minute will pass before the system realizes something is wrong and sounds the alarm).
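The arithmetic behind that "almost a full minute" is easy to sketch. The Python helper below is a back-of-the-envelope estimate only; the function name is made up, and the extension to several consecutive samples is an assumption that follows from the Samples setting of a threshold rather than something stated by NetXMS.

```python
def worst_case_detection_delay(poll_interval_s, samples=1):
    """Seconds until `samples` consecutive bad polls have been seen,
    assuming the failure happens right after a successful poll."""
    if poll_interval_s <= 0 or samples < 1:
        raise ValueError("interval must be positive and samples at least 1")
    return poll_interval_s * samples
```

With the default 60-second interval and a single sample, the estimate is the full 60 seconds; shortening the interval to 10 seconds while requiring 3 bad samples gives 30 seconds and filters out one-off spikes.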

[Screenshot: list of Internal data that can be processed]

Once all the necessary parameters have been added to the template, right-click it, choose apply, and select all the nodes the template should be applied to (Ctrl + left-click). That is all. A branch then appears under the template in the tree; open it to see all the nodes the template is applied to.

When a template is detached from a node, its data items keep being collected. To stop collection, you have to go to the Data Collection Configuration of each node and delete everything manually, or first remove the parameters from the template and only then detach the node from it. I do not know why it was done this way, but it strikes me as extremely strange. But... it is what it is.

In a template, you can select the data items you need and copy them to another template. In bulk. No copy-pasting from one template to another by hand... ~~I am already starting to hate Zabbix..~~

3.7. Dashboards

Oddly enough, the NetXMS Wiki has only a single YouTube link on this subject, Creating a Dashboard; there is not even a list and description of the elements that can be added to a dashboard. In principle, though, everything is intuitive. Label is just... well... a label: a text field used to name and separate elements, since the elements themselves carry no titles. Line Chart is a line chart. Bar Chart is a bar chart. Alarm Viewer is a list of alerts for the selected node. In short, the names speak for themselves.

Next, open the dashboard properties and set up the view. The tab of interest is Dashboard Elements. Number of columns sets how many columns of elements are laid out horizontally. Each element has its own properties, and I would recommend unchecking Layout - Grab excessive vertical space to keep each element as small as possible.

3.8. Action Configuration

(Configuration-Actions Configuration)
In essence, these are the actions performed in response to triggers firing.

NetXMS can send an e-mail, SMS, or XMPP message in response to a trigger. It is all trivial: just fill in the required fields.

You can also:

- Send the alert to another NetXMS server
- Run a command on the NetXMS administration server
- Run a command on the node through the NetXMS agent
- Run an NXSL script (NXSL is NetXMS's own scripting language)

After creating an action, you can set an execution delay for it and use macros in it. The delay is there so as not to bother the operator if the problem resolves itself (for example, the ping response time exceeded the specified value but may be back to normal at the next check, so why fuss over such trifles?).
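The effect of the delay can be shown on a simulated timeline. The Python sketch below is an illustration under assumptions, not NetXMS code: the `actions_to_run` function and the event format are invented for the example, and real actions would be driven by the server's event processing rather than by a list of tuples.

```python
def actions_to_run(events, delay_s):
    """Return the times at which a delayed action actually fires.

    `events` is a time-sorted list of (timestamp_s, kind) tuples, where
    kind is "problem" or "recovery". An action scheduled by a problem is
    dropped if a recovery arrives before the delay runs out.
    """
    fired = []
    pending = None  # time at which the scheduled action is due
    for ts, kind in events:
        if pending is not None and ts >= pending:
            fired.append(pending)  # the delay ran out before this event
            pending = None
        if kind == "problem" and pending is None:
            pending = ts + delay_s
        elif kind == "recovery":
            pending = None  # the problem resolved itself, stay quiet
    if pending is not None:
        fired.append(pending)  # no recovery ever arrived
    return fired
```

With a 120-second delay, a problem at 0 s that recovers at 60 s produces no notification, while a problem at 300 s that never recovers fires the action at 420 s.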

3.9. Access rights

Each object can have its own access rights, and so can each object nested inside another object. This allows duties and monitoring capabilities to be distributed very flexibly among different groups of users. That is all I wanted to say.

4. Conclusion

I would like to refute the title of my own article. NetXMS is not for the lazy. It is just that its developers did not focus on functionality alone, ~~like some,~~ but also took care of the interface.

Let's do it this way. I have used both Zabbix and NetXMS, so I will simply describe my impressions of the two systems.

So.

I see NetXMS as a very user-friendly system. Well... admin-friendly... yes, admin-friendly. It has what is called an intuitive interface, and that is great. You do not need to write your own formulas or read a pile of documentation just to enable a ping check for a switch or some other trivial thing. But in Zabbix that trivial thing is not much different from something really difficult; once you have mastered the one, you can do the other in the same way. NetXMS has hints, and most parameters can be found in the interface, picked with the mouse, and configured. In Zabbix you need to know whether the system can do this or that and, if so, how the item is spelled and what it returns. Zabbix wins on community and on tons of templates for every occasion... if you can find what you need in that mountain of rubbish... On the other hand, NetXMS also has plenty of fans and their add-ons for the system. And out of the box the system itself can monitor the most popular applications, as stated on the project's main page.

For me, the main advantage of NetXMS over Zabbix was the ability to run a script on the problem node in response to a trigger. The CPU temperature has risen above the threshold, and the system shuts the host down itself; we can work out what the problem was later. Perhaps Zabbix can do this too, but I never managed it. The same goes for port monitoring on routers. Here it just works by itself... Well, sort of by itself... You do have to learn the scripting language, and that is it...

Access rights in NetXMS allow the capabilities of different users to be separated very flexibly. But setting all this up takes a fair amount of tinkering.

On the whole, the systems are quite similar, and their capabilities seem, at first glance, at least comparable. It does feel as if Zabbix can do more, but those capabilities lie deep under the hood and are out of reach for the mere mortal “all-in-one” admin. NetXMS, on the other hand, seems simple and clear, friendly and pleasant.

Zabbix has excellent documentation, a wiki, how-tos, FAQs, best practices, and all sorts of forums where you can find answers to almost any question. And if you cannot, you can ask your own.

In summary, I would recommend NetXMS to people who are just starting out and to those who do not need to dive deep into monitoring everything and everyone. I am sure that if you keep working with it, you can set up very, very fine-grained monitoring in it.

But still, the main advantage I see is a very nice interface and easy setup.

5. Materials used

Tomas Kirnak's YouTube channel
NetXMS Wiki
Google search

Source: https://habr.com/ru/post/443938/
