
QoS in Linux: manipulating traffic

In the previous article I talked about the U32 filter. This article focuses on the so-called tc actions: operations that can be performed on traffic. With them you can, for example, build a firewall without iptables / netfilter, change individual bytes in packets, or redirect and mirror traffic to other interfaces. We will learn all this through examples.


What are tc actions?

A Traffic Control action (hereinafter simply “action”) is an extension of filters in the traffic control subsystem. Actions serve a variety of purposes, from simply dropping packets to modifying the traffic itself. An action is attached to an individual filter, so the manipulation is performed only on the traffic that the filter selects, which adds flexibility. In addition, actions can be combined into chains with the pipe keyword (much like piping data between programs in the console). Manipulations can be performed on both incoming and outgoing traffic.

First of all, we need to add a classful or classless qdisc (queueing discipline) to the interface; filters with actions are then attached to it. If we want to manipulate incoming traffic, we need to add the ingress qdisc. Its distinguishing features are that its handle is always “ffff:” and that it is always classless.
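As a minimal sketch of that first step, the script below attaches, shows and removes the ingress qdisc. The interface name eth0 is taken from the later examples; the function names and the TC/DEV variables are hypothetical helpers introduced only for illustration. By default the commands are only printed, so the script can be tried without root.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with TC=tc (as root) to apply them for real.
TC=${TC:-echo tc}
DEV=${DEV:-eth0}   # assumption: interface name used in the article

# Attach the ingress qdisc; its handle is always ffff:
ingress_setup() {
    $TC qdisc add dev "$DEV" ingress
}

# List the qdiscs currently attached to the interface
ingress_show() {
    $TC qdisc show dev "$DEV"
}

# Remove the ingress qdisc together with all filters attached to it
ingress_teardown() {
    $TC qdisc del dev "$DEV" ingress
}

ingress_setup
ingress_show
```

Filters for incoming traffic are then added with "parent ffff:", as in the policer example further down.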
Naturally, the corresponding modules must be enabled in the kernel. They are located in the Networking support - Networking options - QoS and/or fair queueing branch. You need the Actions option enabled, plus the modules for the actions you are going to use. Distribution kernels usually have everything enabled already.
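One way to check this without opening the kernel configuration menu is to grep the config file of the running kernel. This is only a sketch: the path /boot/config-$(uname -r) is an assumption that holds on many distributions but not all, and list_act_options is a hypothetical helper name. CONFIG_NET_CLS_ACT is the Actions option itself; the CONFIG_NET_ACT_* options are the individual actions.

```shell
#!/bin/sh
# Print the action-related options from a kernel config file.
list_act_options() {
    grep -E '^CONFIG_NET_(CLS_ACT|ACT_[A-Z0-9_]+)=' "$1"
}

# Assumption: the config of the running kernel lives in /boot.
cfg=${1:-/boot/config-$(uname -r)}
if [ -r "$cfg" ]; then
    list_act_options "$cfg" || echo "no action options enabled in $cfg"
else
    echo "kernel config not found at $cfg" >&2
fi
```

A value of "y" means the action is built in, "m" means it is available as a module.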

The simplest example of using actions


To simplify the construction of filters, we will select traffic for manipulation using packet marks. This method is only suitable for outgoing traffic. Why is that? Let's look at this picture, which shows the path of a packet through the Linux network stack. As you can see, the queueing discipline and classification of incoming packets are processed much earlier than any netfilter hook, so there is simply no place to mark the packet beforehand. In this case it makes sense to build filters on other criteria, for example with U32. Another way around the problem is to redirect the traffic to another interface.
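For outgoing traffic, the mark-based approach can be sketched as follows. Everything specific here is an assumption made for illustration: the mark value 10, the port 10500, the prio root qdisc with handle 1: and the helper name mark_and_police. The commands are only printed by default.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with TC=tc IPT=iptables (as root) to apply them for real.
TC=${TC:-echo tc}
IPT=${IPT:-echo iptables}
DEV=${DEV:-eth0}
MARK=10   # hypothetical mark value

mark_and_police() {
    # 1. netfilter marks outgoing TCP packets to port 10500
    $IPT -t mangle -A POSTROUTING -o "$DEV" -p tcp --dport 10500 \
        -j MARK --set-mark "$MARK"

    # 2. a root qdisc to attach the filter to (prio is just an example)
    $TC qdisc add dev "$DEV" root handle 1: prio

    # 3. the fw classifier selects packets by mark and runs the action
    $TC filter add dev "$DEV" parent 1: pref 10 protocol ip \
        handle "$MARK" fw flowid 1:1 \
        action police rate 2Mbit burst 200K conform-exceed drop
}

mark_and_police
```

The same idea cannot be used on ingress, because the packet reaches the ingress qdisc before any netfilter rule has had a chance to set the mark.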

Let's look at the simplest example of applying actions. Many readers have surely met it already: limiting bandwidth for certain types of traffic with the so-called policer. The policer works according to the token bucket algorithm (you can read about it on Wikipedia or in Tanenbaum).
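The idea of the token bucket can be illustrated with a toy simulation. This is not how the kernel implements the policer; it is a simplified model under stated assumptions: time advances in 1 ms ticks, exactly one packet arrives per tick, and the function name police and its arguments are hypothetical.

```shell
#!/bin/sh
# Toy token bucket policer.
# Usage: police RATE BURST SIZE...
#   RATE  - tokens (bytes) added per tick
#   BURST - bucket depth in bytes
#   SIZE  - packet sizes in bytes, one packet per tick
police() {
    rate=$1; burst=$2; shift 2
    tokens=$burst                      # the bucket starts full
    for size in "$@"; do
        # refill, but never above the bucket depth
        tokens=$((tokens + rate))
        if [ "$tokens" -gt "$burst" ]; then
            tokens=$burst
        fi
        if [ "$size" -le "$tokens" ]; then
            tokens=$((tokens - size))  # conforming packet: spend tokens
            echo "pass $size"
        else
            echo "drop $size"          # exceeding packet: dropped, not queued
        fi
    done
}

# 2 Mbit/s is 250 bytes per millisecond; 3000-byte bucket;
# five 1500-byte packets arriving back to back
police 250 3000 1500 1500 1500 1500 1500
```

The first two packets are covered by the initial burst, after which the bucket refills too slowly for back-to-back full-size packets and the rest are dropped. This is why the burst parameter matters as much as the rate.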

Suppose we want to limit the rate of incoming TCP traffic from IP address 192.168.10.3 to address 192.168.10.5. This can be done as follows:

    # add the ingress qdisc
    tc qdisc add \
        dev eth0 \
        ingress

    # limit the rate of tcp traffic
    # from 192.168.10.3/32
    # to 192.168.10.5/32
    tc filter add \
        dev eth0 \
        parent ffff: \
        pref 10 \
        protocol ip \
        handle ::1 \
        u32 \
        match ip protocol 6 0xff \
        match ip src 192.168.10.3/32 \
        match ip dst 192.168.10.5/32 \
        action police \
        rate 2Mbit burst 200K conform-exceed drop


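The last two lines are of the greatest interest to us: they attach the police action with a rate of 2 Mbit/s and a burst of 200 KB, and tell it to drop packets that exceed the limit. If the other lines are unclear, read LARTC and the article about the U32 filter.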


Run, for example, iperf on both machines and measure the speed. If everything is done correctly, the speed from 192.168.10.3 to 192.168.10.5 should be around two megabits per second (provided that nothing besides the test data is being transferred between the nodes). The statistics show how much data passed through the filter, how many times it matched, how many packets were passed and dropped, and so on.

    ~$ iperf -s -p 10500
    ------------------------------------------------------------
    Server listening on TCP port 10500
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    [ 4] local 192.168.10.5 port 10500 connected \
         with 192.168.10.3 port 59154
    [ ID] Interval Transfer Bandwidth
    [ 4] 0.0-11.2 sec 2.73 MBytes 2.04 Mbits/sec

    ~$ tc -s -p f ls dev eth0 parent ffff:
    filter protocol ip pref 10 u32
    filter protocol ip pref 10 u32 fh 800: ht divisor 1
    filter protocol ip pref 10 u32 fh 800::1 \
        order 1 key ht 800 bkt 0 terminal flowid ??? \
        (rule hit 2251145 success 4589)
      match IP src 91.193.236.62/32 (success 5843 )
      match IP dst 91.193.236.44/32 (success 4608 )
      match IP protocol 6 (success 4589 )
        action order 1: police 0x1e rate 2000Kbit burst 200Kb mtu 2Kb \
            action drop overhead 0b
        ref 1 bind 1
        Action statistics:
        Sent 6870220 bytes 4589 pkt (dropped 761, overlimits 761 requeues 0)
        backlog 0b 0p requeues 0
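To read such statistics at a glance, the share of dropped packets can be pulled out of the "Sent ..." line with a short awk filter. This is a sketch: drop_ratio is a hypothetical helper, it relies on the field layout shown in the output above, and it assumes that the Sent counter includes every packet the action saw, dropped ones too.

```shell
#!/bin/sh
# Read "tc -s" output on stdin and summarize each "Sent ..." line.
# Field layout assumed: Sent BYTES bytes PKTS pkt (dropped N, ...
drop_ratio() {
    awk '/Sent/ && /dropped/ {
        pkts = $4
        dropped = $7
        gsub(/[^0-9]/, "", dropped)     # strip the trailing comma
        if (pkts > 0)
            printf "%d of %d packets dropped (%.1f%%)\n", dropped, pkts, 100 * dropped / pkts
    }'
}

# Sample line copied from the statistics above
drop_ratio <<'EOF'
Sent 6870220 bytes 4589 pkt (dropped 761, overlimits 761 requeues 0)
EOF
```

On a live system the same function can be fed from the tc statistics command shown above through a pipe.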


Other actions are used in a similar way. For each action you can print a short help on its parameters. For the policer, for example, this is done with the command:

    tc filter add \
        dev eth0 \
        parent ffff: \
        u32 \
        match u32 0 0 \
        action police \
        help
    Usage: ... police rate BPS burst BYTES[/BYTES] [ mtu BYTES[/BYTES] ]
           [ peakrate BPS ] [ avrate BPS ] [ overhead BYTES ]
           [ linklayer TYPE ] [ ACTIONTERM ]
    Old Syntax ACTIONTERM := action <EXCEEDACT>[/NOTEXCEEDACT]
    New Syntax ACTIONTERM := conform-exceed <EXCEEDACT>[/NOTEXCEEDACT]
    Where: *EXCEEDACT := pipe | ok | reclassify | drop | continue
    Where: pipe is only valid for new syntax


To get the help for other actions, simply put their name in place of “police”.

Short list of actions


The following actions are currently included in the kernel:


Chaining actions


Actions can be applied both individually and together, forming chains. This is similar to piping data in the console, where the output of one program is fed to the input of another. Actions work the same way. For example, let's change a field in the packet header. After that we need to recalculate and update the checksum. For this, the pedit and csum actions are chained. For clarity, we also mirror the resulting packets to the ifb0 interface and look at them with tcpdump.
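The ifb0 interface has to exist and be up before packets can be mirrored to it. A minimal sketch of that preparation, assuming the ifb driver is built as a module; the RUN variable and the ifb_setup name are hypothetical, and the commands are only printed by default:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with RUN= (empty, as root) to execute them for real.
RUN=${RUN-echo}

ifb_setup() {
    # load the ifb driver and ask it to create a single device, ifb0
    $RUN modprobe ifb numifbs=1
    # the device must be up, otherwise mirrored packets are not delivered
    $RUN ip link set dev ifb0 up
}

ifb_setup
```

Once ifb0 is up, tcpdump can listen on it like on any other interface.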

    tc filter add \
        dev eth0 \
        parent 1: \
        pref 10 \
        protocol ip \
        handle ::1 \
        u32 \
        match ip protocol 6 0xff \
        match ip src 10.10.20.119/32 \
        match ip dst 10.10.20.254/32 \
        match u16 10500 0xffff at 22 \
        action pedit \
        munge offset 22 u16 set 11500 \
        pipe \
        action csum \
        tcp \
        pipe \
        action mirred \
        egress mirror dev ifb0


The command looks pretty scary. The beginning is familiar: we add a filter that selects the packets we need by source and destination address, protocol and port number (protocol tcp, ip src 10.10.20.119, ip dst 10.10.20.254, tcp dport 10500). But instead of classifying, we change the contents of the packet (the “action pedit” parameter): a single 16-bit word at an offset of 22 bytes from the beginning of the IP packet. If you look at the header formats, this field is the TCP destination port, provided the IP header is the usual 20 bytes long with no options. We overwrite it with 11500 ("munge offset 22 u16 set 11500"). After the field is changed, the TCP checksum no longer matches the packet contents. To recalculate it, the packets are passed on to the csum action with the “pipe” parameter. Csum recalculates the TCP checksum and passes the packets on to the mirred action, again with “pipe”. As a result of the “mirred” action, copies of the sent packets arrive on the ifb0 interface.
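The offset arithmetic can be checked with a few lines of shell. The sketch below also shows how a 16-bit write at offset 22 maps to an aligned 32-bit word with a value and a "keep" mask, which is the form in which tc prints pedit keys in its statistics. Both function names are hypothetical, and the interpretation of the mask (bits set in the mask are preserved from the original packet) is stated here as an assumption about how pedit applies its keys.

```shell
#!/bin/sh
# Offset of the TCP destination port from the start of the IP packet.
# Argument: IHL, the IP header length in 32-bit words (5 when there
# are no IP options). The destination port is 2 bytes into the TCP header.
dport_offset() {
    ihl=${1:-5}
    echo $((ihl * 4 + 2))
}

# Express a 16-bit write at byte offset OFF as an aligned 32-bit key.
# Only 2-byte aligned offsets are handled.
pedit_key() {
    off=$1; val=$2
    word=$((off / 4 * 4))              # start of the enclosing 32-bit word
    if [ $((off % 4)) -eq 0 ]; then
        # upper half of the word is replaced, lower half is kept
        printf 'at %d: val %08x mask %08x\n' "$word" $((val << 16)) $((0x0000ffff))
    else
        # lower half of the word is replaced, upper half is kept
        printf 'at %d: val %08x mask %08x\n' "$word" "$val" $((0xffff0000))
    fi
}

dport_offset 5
pedit_key 22 11500
```

Here 11500 is 0x2cec, and the word at offset 20 holds the TCP source port in its upper half and the destination port in its lower half, so only the lower half changes.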

Check how everything works by analyzing statistics, as well as running tcpdump on the ifb0 interface:

    # statistics before the test
    ~$ tc -s -p f ls dev eth0
    filter parent 1: protocol ip pref 10 u32
    filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1
    filter parent 1: protocol ip pref 10 u32 fh 800::1 order 1 key ht 800 bkt 0 terminal flowid ??? (rule hit 102554 success 0)
      match IP protocol 6 (success 102517 )
      match IP src 10.10.20.119/32 (success 0 )
      match IP dst 10.10.20.254/32 (success 0 )
      match dport 10500 (success 0 )
        action order 1: pedit action pipe keys 1
        index 66 ref 1 bind 1 installed 132 sec used 132 sec
        key #0 at 20: val 00002cec mask ffff0000
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: csum (tp) action pipe
        index 29 ref 1 bind 1 installed 132 sec used 132 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Mirror to device ifb0) pipe
        index 79 ref 1 bind 1 installed 132 sec used 132 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

    # open a tcp connection to 10.10.20.254:10500
    ~$ telnet 10.10.20.254 10500

    # in another terminal, watch what arrives
    # on ifb0
    ~$ tcpdump -nvvi ifb0
    tcpdump: WARNING: ifb0: no IPv4 address assigned
    tcpdump: listening on ifb0, link-type EN10MB (Ethernet), capture size 65535 bytes
    ...
    00:46:11.080234 IP (tos 0x10, ttl 64, id 46378, offset 0, flags [DF], proto TCP (6), length 60)
        10.10.20.119.36342 > 10.10.20.254.11500: Flags [S], cksum 0x2001 (correct), seq 1542179969, win 14600, options [mss 1460,sackOK,TS val 1417050539 ecr 0,nop,wscale 4], length 0
    ...

    # statistics after the test
    ~$ tc -s -p f ls dev eth0
    filter parent 1: protocol ip pref 10 u32
    filter parent 1: protocol ip pref 10 u32 fh 800: ht divisor 1
    filter parent 1: protocol ip pref 10 u32 fh 800::1 order 1 key ht 800 bkt 0 terminal flowid ??? (rule hit 580151 success 12)
      match IP protocol 6 (success 579716 )
      match IP src 10.10.20.119/32 (success 12 )
      match IP dst 10.10.20.254/32 (success 12 )
      match dport 10500 (success 12 )
        action order 1: pedit action pipe keys 1
        index 66 ref 1 bind 1 installed 747 sec used 454 sec
        key #0 at 20: val 00002cec mask ffff0000
        Action statistics:
        Sent 888 bytes 12 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: csum (tdp) action pipe
        index 29 ref 1 bind 1 installed 747 sec used 454 sec
        Action statistics:
        Sent 888 bytes 12 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Mirror to device ifb0) pipe
        index 79 ref 1 bind 1 installed 747 sec used 454 sec
        Action statistics:
        Sent 888 bytes 12 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0


That is basically all I wanted to say about using actions.

Useful links

LARTC - Linux Advanced Routing and Traffic Control.
An example of using ifb and mirred actions.

Source: https://habr.com/ru/post/138562/

