📜 ⬆️ ⬇️

Automation IP-network. Part 3 - Monitoring TCP Anomalies

image alt In the previous article, the monitoring of the speed of opening Web resources was considered. As a parallel process in measuring speed, it would be interesting to measure TCP anomalies in order to gain a deeper understanding of the possible reasons for the low speed of opening Web pages. We will try to solve this problem in this article.

TCP anomalies will be considered packets, which indicate a loss of information in the transmission process. Perhaps the most popular tool for in-depth analysis of network traffic is Wireshark and its console version tshark. Therefore, we will consider it as the initial analyzer.
The following will be referred to TCP anomalies:


For packet analysis using tshark we will use the following expression:
')
tshark -i bce0 -t ad -qz io,stat,5,"(ip.addr==1.1.1.1) && tcp","COUNT(tcp.analysis.retransmission)(ip.addr==1.1.1.1) && tcp.analysis.retransmission","COUNT(tcp.analysis.duplicate_ack)(ip.addr==1.1.1.1) && tcp.analysis.duplicate_ack","COUNT(tcp.analysis.lost_segment)(ip.addr==1.1.1.1) && tcp.analysis.lost_segment","COUNT(tcp.analysis.fast_retransmission)(ip.addr==1.1.1.1) && tcp.analysis.fast_retransmission","COUNT(tcp.analysis.lost_segment)(ip.addr==1.1.1.1) && tcp.analysis.ack_lost_segment") 

where
bce0 is the name of the interface on which the packet analysis will be performed. This name in linux / Unix system can be seen with the command ifconfig.
1.1.1.1 - IP address of the studied resource

As a result of the output of the command, we obtain a table, which we will subsequently process and load into the database for plotting graphs:

Table
 ====================================================================================================== | IO Statistics | | | | Duration: 5. 40977 secs | | Interval: 5 secs | | | | Col 1: (ip.addr==1.1.1.1) && tcp | | 2: COUNT(tcp.analysis.retransmission)(ip.addr==1.1.1.1) && tcp.analysis.retransmission | | 3: COUNT(tcp.analysis.duplicate_ack)(ip.addr==1.1.1.1) && tcp.analysis.duplicate_ack | | 4: COUNT(tcp.analysis.lost_segment)(ip.addr==1.1.1.1) && tcp.analysis.lost_segment | | 5: COUNT(tcp.analysis.fast_retransmission)(ip.addr==1.1.1.1) && | | tcp.analysis.fast_retransmission | | 6: COUNT(tcp.analysis.lost_segment)(ip.addr==1.1.1.1) && tcp.analysis.ack_lost_segment | |----------------------------------------------------------------------------------------------------| | |1 |2 |3 |4 |5 |6 | | | Date and time | Frames | Bytes | COUNT | COUNT | COUNT | COUNT | COUNT | | |-------------------------------------------------------------------------------| | | 2017-07-10 15:00:45 | 507 | 481496 | 1 | 0 | 2 | 0 | 0 | | | 2017-07-10 15:00:50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | ====================================================================================================== 

In the quality of tools as in the previous article, we will use Cacti and Python3. We are upgrading the script from the previous article, to measure the speed and TCP anomalies:

Python code
 #!/usr/bin/env python # -*- coding: utf-8 -*- import datetime import re import os import subprocess import argparse import time import signal parser = argparse.ArgumentParser() parser.add_argument("-h_page", "--hostname_page", dest = "hostname_page") args = parser.parse_args() curent_time=str(datetime.datetime.now().strftime("%Y-%m-%d_%H:%M:%S_")) pid=os.getpid() ##########start Table parser############### def parser_tshark_output(open_time, parser_file): table_data = {'date':[],'1 frame':[],'2 frame':[],'retr':[],'dup_ack':[],'lost_seg':[],'fast_retr':[],'ack_lost_seg':[]} lines = open(parser_file, 'r').readlines() lookup = 'tcp.analysis.ack_lost_segment ' number = 0 for num in lines: if lookup in num: number+=lines.index(num) try: del (lines[-1]) except: pass L=open("/usr/TEST/TMP_FILES/test_tshark_temp_"+str(os.getpid())+".txt", 'w') L.writelines(lines) L.close() with open("/usr/TEST/TMP_FILES/test_tshark_temp_"+str(os.getpid())+".txt", 'r') as table: if number != 0: for _ in range(int(number+5)): next(table) #skip header for row in table: row=row.strip('\n').split('|') values = [r.strip() for r in row if r != ''] table_data['date'].append(values[0]) table_data['1 frame'].append(int(values[1])) table_data['2 frame'].append(int(values[2])) table_data['retr'].append(int(values[3])) table_data['dup_ack'].append(int(values[4])) table_data['lost_seg'].append(int(values[5])) table_data['fast_retr'].append(int(values[6])) table_data['ack_lost_seg'].append(int(values[7])) else: pass if number !=0: frames=sum(table_data['2 frame']) print ('start-frames_' + str.format("{0:.2f}", frames)+'_end-frames') tcp_errors=sum(table_data['retr'])+sum(table_data['dup_ack'])++sum(table_data['fast_retr'])+sum(table_data['ack_lost_seg']) print ('start-tcp_errors_' + str(tcp_errors)+'_end-tcp_errors') if open_time != 'open_error' and frames != 0: k=tcp_errors else: k= 'no data' else: k= 'no data' os.remove("/usr/TEST/TMP_FILES/test_tshark_temp_"+str(os.getpid())+".txt") return (k) ###########end table parser############### ###########start tshark part 1########### resault_temp=subprocess.Popen(['nslookup '+str(args.hostname_page)], bufsize=0, shell=True, stdout = subprocess.PIPE, stderr=subprocess.PIPE) data=resault_temp.communicate() ip_adr_temp=re.findall(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})', str(re.findall(r'Address: (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})', str(data)))) if ip_adr_temp != []: pass else: ip_adr_temp = ['1.1.1.1'] ip_adr='' for z in ip_adr_temp: if len(ip_adr_temp)-ip_adr_temp.index(z)==1: ip_adr+='ip.addr=='+str(z)+')' else: ip_adr+='ip.addr=='+str(z)+' or ' ftcp=open('/usr/TEST/TMP_FILES/1tshark_temp'+curent_time+str(pid)+'.txt', 'w') tcp=subprocess.Popen(['timeout 180 tshark -i bce0 -t ad -qz io,stat,5,"('+str(ip_adr)+' && tcp","COUNT(tcp.analysis.retransmission)('+str(ip_adr)+' && tcp.analysis.retransmission","COUNT(tcp.analysis.duplicate_ack)('\ +str(ip_adr)+' && tcp.analysis.duplicate_ack","COUNT(tcp.analysis.lost_segment)('+str(ip_adr)+' && tcp.analysis.lost_segment","COUNT(tcp.analysis.fast_retransmission)('+str(ip_adr)+\ ' && tcp.analysis.fast_retransmission","COUNT(tcp.analysis.lost_segment)('+str(ip_adr)+' && tcp.analysis.ack_lost_segment"']\ , bufsize=0, shell=True, stdout=(ftcp))#stdout = subprocess.PIPE, stderr=subprocess.PIPE) time.sleep(2) ############start wget ################ fweb=open('/usr/TEST/TMP_FILES/web_temp'+curent_time+str(pid)+'.txt', 'w') web=subprocess.call(["timeout 120 wget -E -H -p -Q300K --user-agent=Mozilla --no-cache --no-cookies --delete-after --timeout=15 --tries=2 "+args.hostname_page+" 2>&1 | grep '\([0-9.]\+ [KM]B/s\)'"], bufsize=0, shell=True, stdout=(fweb)) fweb.close() fweb=open('/usr/TEST/TMP_FILES/web_temp'+curent_time+str(pid)+'.txt', 'r') data=fweb.read() os.remove('/usr/TEST/TMP_FILES/web_temp'+curent_time+str(pid)+'.txt') speed_temp=re.findall(r's \((.*?)B/s', str(data))#[KM]B/s', str(data))) speed_temp_si=re.findall(r's \((.*?) [KM]B/s', str(data)) try: if re.findall(r'M', str(speed_temp))==[] and re.findall(r'K', str(speed_temp))==[]: speed_="{0:.3f}".format(float(speed_temp_si[0])*0.001*8) elif re.findall(r'M', str(speed_temp))!=[]: speed_="{0:.3f}".format(float(speed_temp_si[0])*1000*8) elif re.findall(r'K', str(speed_temp))!=[]: speed_="{0:.3f}".format(float(speed_temp_si[0])*1*8) except: speed_='no_data' ##############stop wget############## ##############start tshark part2####### os.kill(tcp.pid, signal.SIGINT) ftcp.close() time.sleep(0.3) tcp_error=parser_tshark_output('1', '/usr/TEST/TMP_FILES/1tshark_temp'+curent_time+str(pid)+'.txt') os.remove('/usr/TEST/TMP_FILES/1tshark_temp'+curent_time+str(pid)+'.txt') #########resault to DB########### print ('web_speed_test:'+str(speed_)+' tcp_error:'+str(tcp_error)) 


Running the script should show the following (tshark, nslookup, wget utilities must be installed on the operating system):

 $python3.3 web_open.py -h_page habrahabr.ru web_speed_test:10960.000 tcp_error:2.0 

Next, brief instructions on how to upgrade the data in Cacti to get two graphs from one RRA:

Cacti
1. Add additional output Fields to Data Input Methods:



2. Add an additional area to Data Temlate tcp_error



3. Copy the Graph template and add additional information to it; when setting up new graphs, initially you need to use this Template:



4. Set up a graph as shown in the previous article using the Template above, duplicate the graph:



5. The result should be the following:



6. Change the data in the first graph to the right. First, change the Template, click save, then check that the Data Source is correct, if necessary, select the correct one:



7. Change the data for the second graph:




If everything is done correctly, you should get the following graphs





That's all, further according to the plan with positive feedback, in the following articles we will consider the modernization of this statistics.


Thank you for your time.

Source: https://habr.com/ru/post/333768/


All Articles