
Icinga in action. Monitoring of the Large Hadron Collider at CERN, Switzerland / France

CERN and Icinga

CERN is the European Organization for Nuclear Research, and to many it also means particle collisions at a rate of 40 MHz and 11,000 circuits of the collider per second. CERN's Large Hadron Collider (LHC) is the largest and most powerful particle accelerator in the world. Icinga is a free, open source enterprise monitoring system. For its part, Icinga supports the stable operation of the LHC equipment at three of the four detector sites. This equipment looks for differences between matter and antimatter, seeks further confirmation of the existence of the Higgs boson, and tests the models of modern physics as we know it today.


CERN is one of the largest and most respected research centers in the world. It is engaged in fundamental physics: the search for the basic principles of the Universe and the laws that govern it. At CERN, the largest and most complex scientific instruments are used to study the basic constituents of matter. Particle accelerators bring beams of particles up to high energies before they collide with each other or with stationary targets. Detectors observe and record the results of these collisions. Founded in 1954, the CERN laboratory is located on the Franco-Swiss border near Geneva. It was one of the first European joint ventures and currently has 20 member states.

For more information about CERN's activities and the experiment equipment, see the article by Mgrin, "CERN: what kind of organization is it for $900 million".
At a depth of 100 m under the Franco-Swiss border lies a 27 km ring, better known as the Large Hadron Collider (LHC), which collides subatomic particles at an energy of 14 TeV. Detectors at four sites, weighing up to 12,000 tons, record data from experiments that try to uncover the origins of matter and antimatter and test, among other things, the existence of the Higgs boson and of additional dimensions of space. To keep these processes orderly and understood, Icinga monitors three of these sites: LHCb, CMS and ATLAS (Fig. 1).



Matter against antimatter: monitoring

The equipment of the LHCb (Large Hadron Collider beauty) experiment is 21 m long, 13 m wide and 10 m high. It produces a 60 GB/s data stream that carries information about the origin of matter and antimatter. The control system and the data acquisition chains form the information backbone of the experiment, which runs on Windows and Linux machines as well as on embedded processors.

Initially, monitoring was handled by a single Nagios instance. However, when the CERN IT team tried to scale the solution, problems surfaced: the average service check latency of 328 seconds was too high. A new solution was needed, and the administrators turned to Icinga and its active community.

Because the configuration format is compatible, migrating from Nagios was relatively straightforward. However, to make the solution easier to maintain in the future, the configuration files were reorganized so that groups and inheritance between hosts are used to the full. As a result, adding a new monitored object to an existing category (DBMS server, compute node, storage system and so on) means changing only one configuration file.
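The "one file per new host" idea can be sketched as follows. This is only an illustration, assuming the Nagios/Icinga 1.x object syntax, in which a host inherits its settings from a template through the `use` directive. The helper `render_host`, the template name `dbms-server` and the address are hypothetical and are not taken from CERN's actual configuration.

```python
# Hypothetical helper: renders a Nagios/Icinga 1.x style host definition
# that takes everything except its identity from a category template.

def render_host(name: str, address: str, template: str) -> str:
    """Return a host object definition that inherits from `template`."""
    directives = [
        ("use", template),       # inherit checks, contacts, groups from the template
        ("host_name", name),     # the only host-specific values are name and address
        ("address", address),
    ]
    body = "\n".join(f"    {key:<12}{value}" for key, value in directives)
    return "define host {\n" + body + "\n}\n"


if __name__ == "__main__":
    # Template name and address are made up for illustration.
    print(render_host("db-node-17", "10.0.0.17", "dbms-server"))
```

With such a layout, the category template holds the shared check settings, so the stanza for a new host stays at three directives.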

The LHCb experiment is now monitored by one Icinga instance configured in failover mode. It works together with Mod-Gearman workers, remote NRPE agents and NSClient++. In addition to SNMP checks and specialized performance measurements, several dedicated checks have been added, such as GPFS and file system checks.

The central Icinga server schedules the checks; 60 distributed Mod-Gearman workers pull them from their queues, execute them, and place the results in another queue (Fig. 2). In the new installation, a single Icinga instance can track a large environment of over 2000 hosts and 40,000 services. The service check latency has dropped from 328 seconds to less than one second.
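The scheduler/worker/result-queue pattern described here can be illustrated with a minimal sketch. It uses only Python's standard `queue` and `threading` modules and is not the Mod-Gearman API; the function `run_checks` and the placeholder checks are assumptions made for the example.

```python
import queue
import threading


def run_checks(checks, worker_count=4):
    """Distribute check callables over worker threads; return {name: result}."""
    jobs = queue.Queue()      # scheduler -> workers
    results = queue.Queue()   # workers -> scheduler

    def worker():
        while True:
            item = jobs.get()
            if item is None:              # sentinel: no more work for this worker
                break
            name, check = item
            results.put((name, check()))  # execute the check, queue its result

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for item in checks.items():           # the "scheduler" enqueues every check
        jobs.put(item)
    for _ in threads:                     # one sentinel per worker
        jobs.put(None)
    for t in threads:
        t.join()

    collected = {}
    while not results.empty():            # safe: all workers have finished
        name, status = results.get()
        collected[name] = status
    return collected


if __name__ == "__main__":
    # Placeholder checks; real ones would run plugins and report their status.
    demo = {"disk": lambda: "OK", "load": lambda: "WARNING"}
    print(run_checks(demo, worker_count=2))
```

Because the scheduler only enqueues jobs and reads results, adding capacity is a matter of starting more workers, which is presumably what lets a single central instance keep check latency low.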


How to check the Higgs boson


The second and third sites host the detectors of the CMS (Compact Muon Solenoid) and ATLAS (A Toroidal LHC ApparatuS) experiments, which are trying to confirm the existence of the Higgs boson and to find other dimensions of space and dark matter.

In the CMS experiment, Icinga monitors the status of 3000 hosts and 70 switches from one centralized monitoring server. It relies on one Mod-Gearman worker process, NRPE and check_multi. With their help, Icinga processes the results of 90,000 checks every 2 minutes. The checks vary widely, from network utilization, errors and free disk space to the status of RAID arrays, equipment temperature and other special services, so Icinga looks after the entire range of installed equipment.
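To put the CMS figure in perspective, the quoted rate can be converted into checks per second. The small helper below is hypothetical and simply divides the numbers given in the text.

```python
def checks_per_second(total_checks: int, interval_seconds: float) -> float:
    """Average number of check results processed per second."""
    return total_checks / interval_seconds


if __name__ == "__main__":
    # 90,000 check results every 2 minutes (120 seconds), as quoted for CMS.
    print(checks_per_second(90_000, 120))  # 750.0
```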

In the ATLAS experiment, two Icinga instances are deployed; they run on virtual machines and work side by side with Nagios. Out of a total of 3000 hosts, the Icinga servers monitor 90 critical nodes on both networks. Monitoring helps ATLAS make the most of its beam time at the collider and collect as much data as possible for the physicists.

Extensions for the future


There are already plans to migrate the ATLAS monitoring system completely to Icinga, Mod-Gearman and Ganglia, which will make it possible to monitor 3000 hosts and perform 100,000 checks at a time. The checks will include hardware monitoring via IPMI and will most likely run on a single central monitoring installation with Mod-Gearman workers, like the other Icinga installations.

Icinga monitoring in CMS is also being extended. The plan is to create more dedicated services to monitor the software currently being added, on which the experiment depends. As it widens the scope of Icinga monitoring, the CERN IT team can be confident of getting the best monitoring performance for the LHC, and that the experiments can get on with real science. A curious fact: Icinga monitoring already played its part behind the scenes when the Higgs boson was discovered. And as the LHC and its equipment continue to collide particles and gather data without hindrance, Icinga will keep working for science and the discoveries to come.

Source: https://habr.com/ru/post/170135/

