
Collecting NetFlow on the cheap

TL;DR: The author has assembled a NetFlow / sFlow collector from GoFlow, Kafka, ClickHouse, Grafana and a bit of glue code in Go.


Hi, I work in operations and I love knowing what is going on in the infrastructure. I also love poking my nose into other people's business, and this time I poked it into the network.


Suppose you have your own network equipment and a bag of things exposed to the Internet: monoliths, microservices, and microservice monoliths, along with their dependencies such as databases, caches and FTP servers. And sometimes some inhabitants of this bag start misbehaving on the network.


Here are just a few examples of such misbehavior:



SNMP counters from switch ports or VMs give only an approximate picture of what is happening, while I want accuracy and fast problem analysis. This is where the NetFlow / IPFIX and sFlow protocols come to the rescue: they produce rich traffic information directly on the network equipment. All that remains is to store it somewhere and process it somehow.


Of the available NetFlow collectors, the following were considered:



The option chosen was described in Louis Poinsignon's presentation at RIPE 75. The general scheme of a simple collector is as follows:



GoFlow parses NetFlow / sFlow packets and writes them to a local Kafka in protobuf format. A home-grown "shovel", goflow2ch, pulls the messages from Kafka and inserts them into ClickHouse in batches for better performance. The scheme does not address high availability at all, but for each component there are either standard or reasonably simple external ways to provide it.


Tests showed that parsing and storing 5000 flows per second costs about a quarter of a CPU core, and that disk usage averages 11-14 bytes per slightly trimmed flow record.
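To get a feel for what these figures mean for disk sizing, here is a back-of-the-envelope calculation. It assumes a constant rate of 5000 flows per second around the clock, which is a simplification; the dailyBytes helper and the 30-day retention period are illustrative assumptions, not something from the tests.

```go
package main

import "fmt"

const secondsPerDay = 86400

// dailyBytes estimates the bytes written per day for a given
// flow rate and average on-disk size of one flow record.
func dailyBytes(flowsPerSecond, bytesPerFlow uint64) uint64 {
	return flowsPerSecond * secondsPerDay * bytesPerFlow
}

func main() {
	// 11 and 14 bytes per flow are the bounds quoted in the text.
	for _, b := range []uint64{11, 14} {
		d := dailyBytes(5000, b)
		fmt.Printf("%d B/flow: %.2f GB/day, %.1f GB per 30 days\n",
			b, float64(d)/1e9, float64(d)*30/1e9)
	}
}
```

At this rate the collector writes 432 million flow records per day, which comes to roughly 4.8 to 6.0 GB per day on disk.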


To display the data, you can use either Tabix, a web UI for ClickHouse, or the ClickHouse plugin for Grafana.


Advantages of the scheme:



The drawbacks are substantial too:



Quirks encountered along the way:



As a result, a tool for monitoring the network, both in more or less real time and in historical perspective, was built from open source components and blue electrical tape. Despite being a knocked-together hack, it has already cut the resolution time of several incidents severalfold.



Source: https://habr.com/ru/post/424321/

