
Routing and policy-routing in Linux with iproute2

This article is about routing network packets in Linux, and specifically about policy-routing (routing based on policies). Policy-routing lets you route packets according to a fairly flexible set of rules, in contrast to classic destination routing, which looks only at the destination address. It is useful when a host has several network interfaces and certain packets must leave through a specific one, where those packets are identified by something other than, or in addition to, the destination address. Typical uses include balancing traffic between several external channels (uplinks), keeping a server reachable through several uplinks, sending packets from different internal addresses through different external interfaces, and even sending packets for different TCP ports through different interfaces.
Network interfaces, routing and traffic shaping in Linux are managed with the iproute2 utility package.

These utilities only configure settings; the actual work is done by the Linux kernel. For policy-routing to work, the kernel must be built with the options IP: advanced router (CONFIG_IP_ADVANCED_ROUTER) and IP: policy routing (CONFIG_IP_MULTIPLE_TABLES) enabled. Both are located under Networking support -> Networking options -> TCP/IP networking.
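One way to check an existing kernel for these options is to search its build configuration. The sketch below assumes the configuration is available as a plain-text file; the path /boot/config-$(uname -r) is common on many distributions but is an assumption here, and has_kernel_option is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if option $1 is set to "y" in config file $2.
has_kernel_option() {
    grep -q "^$1=y" "$2"
}

# Assumed location of the kernel config; some systems expose /proc/config.gz instead.
config_file="/boot/config-$(uname -r)"
if [ -r "$config_file" ]; then
    for opt in CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_MULTIPLE_TABLES; do
        if has_kernel_option "$opt" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: not found"
        fi
    done
fi
```

If the file is not readable, the loop is skipped and nothing is printed.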

ip route


The ip route command configures routing. Run without parameters, it shows the current routes (not all of them, more on this later):
  # ip route
 192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.101
 default via 192.168.12.1 dev eth0 

This is what the routing table looks like for an eth0 interface with IP address 192.168.12.101, subnet mask 255.255.255.0 and default gateway 192.168.12.1.
Traffic to the 192.168.12.0/24 subnet goes out through eth0. proto kernel means the route was added automatically by the kernel when the IP address was assigned to the interface. scope link means the destinations are directly reachable on this link (eth0). src 192.168.12.101 sets the source IP address for packets this host sends along this route.
Traffic to any host outside 192.168.12.0/24 goes to the gateway 192.168.12.1 through eth0 ( default via 192.168.12.1 dev eth0 ). Note that when a packet is sent to the gateway, its destination IP address does not change; only the Ethernet frame carries the gateway's MAC address as the destination MAC (even experienced professionals often get confused here). The gateway, in turn, rewrites the source IP address if NAT is in use, or simply forwards the packet. Here the address is private (192.168.12.101), so the gateway most likely does NAT.
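The choice between the two entries above comes down to a subnet membership test on the destination address. The following is a simplified model of that test for this two-entry table, not a description of how the kernel implements lookups; ip_to_int and in_subnet are hypothetical helper names, and the destination addresses in the loop are arbitrary examples.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# Decide as the two-entry table above would: on-link subnet first, else default.
for dst in 192.168.12.7 8.8.8.8; do
    if in_subnet "$dst" 192.168.12.0 24; then
        echo "$dst: direct via eth0"
    else
        echo "$dst: via gateway 192.168.12.1"
    fi
done
```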
Now let's dig deeper into routing. There are in fact several routing tables, and you can also create your own. The predefined tables are local , main and default . In the local table the kernel keeps entries for local IP addresses (so that traffic to them stays on the host and does not try to reach the external network) and for broadcast addresses. The main table is the primary one; it is used when a command does not name a table (so what we saw above was the main table). The default table is initially empty. Let's take a quick look at the contents of the local table:
  # ip route show table local
 broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
 broadcast 192.168.12.255 dev eth0 proto kernel scope link src 192.168.12.101
 broadcast 192.168.12.0 dev eth0 proto kernel scope link src 192.168.12.101
 local 192.168.12.101 dev eth0 proto kernel scope host src 192.168.12.101
 broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
 local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
 local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1 

broadcast and local are entry types (the entries we looked at earlier are of the default type, unicast). An entry of type broadcast means that matching packets are sent as broadcast packets, according to the interface settings. An entry of type local means that packets are delivered locally. scope host means the entry is valid only within this host.
To view a specific table, use ip route show table TABLE_NAME . To view all tables at once, give all , unspec or 0 as the table name. Every table actually has a numeric identifier; the symbolic names are defined in the /etc/iproute2/rt_tables file and exist only for convenience.
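The name-to-number mapping can be read straight from that file. This is a minimal sketch, assuming the usual "ID NAME" line format with # starting a comment; table_id is a hypothetical helper name, and on some systems the file may live in a different directory.

```shell
# Hypothetical helper: print the numeric ID of table name $1,
# reading an rt_tables-style file $2 (lines of "ID NAME").
table_id() {
    awk -v name="$1" '$1 !~ /^#/ && $2 == name { print $1; exit }' "$2"
}

rt_tables=/etc/iproute2/rt_tables
if [ -r "$rt_tables" ]; then
    for t in local main default; do
        echo "$t = $(table_id "$t" "$rt_tables")"
    done
fi
```

A line added to the same file in the same format gives a custom table a name that ip route and ip rule accept in place of the number.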

ip rule


How does the kernel choose which table to consult for a packet? Logically enough, there are rules for this. In our case:
  # ip rule
 0: from all lookup local
 32766: from all lookup main
 32767: from all lookup default 

The number at the start of each line is the rule's priority, from all is the condition (here, packets from any address), and lookup names the table to consult. If a packet matches several rules, it goes through them in order of increasing priority. Of course, once a packet matches a routing entry in some table, the remaining entries and the remaining rules are not evaluated.
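Possible conditions include from (the packet's source address, seen above), to (the destination address), iif (the interface the packet arrived on), tos (the value of the TOS field) and fwmark (the mark assigned to the packet by the firewall).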

Conditions can be combined, for example from 192.168.1.0/24 to 10.0.0.0/8 . You can also use the prefix not , which means the packet must not match the condition in order to fall under the rule.
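As an illustration, the two forms could be written as follows. The table numbers 120 and 121 and the priorities 1000 and 1001 are arbitrary values chosen for this sketch, and run is a hypothetical wrapper that only prints the commands unless DRY_RUN=0 is set. The priority keyword sets the number shown at the start of each line of ip rule output.

```shell
# Print commands by default; set DRY_RUN=0 and run as root to apply them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

combined_rules() {
    # Match on source and destination at the same time.
    run ip rule add from 192.168.1.0/24 to 10.0.0.0/8 lookup 120 priority 1000
    # Match every packet that does NOT come from 192.168.1.0/24.
    run ip rule add not from 192.168.1.0/24 lookup 121 priority 1001
}

combined_rules
```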
So, now we know what routing tables and routing rules are. Creating your own tables and rules is exactly what policy-routing , or PBR (policy-based routing), is about. Incidentally, SBR (source-based routing), or source-routing, is in Linux just a special case of policy-routing: it is the use of the from condition in a routing rule.

Simple example


Now consider a simple example. We have a gateway that receives packets from IP 192.168.1.20, and packets from this address must be forwarded to the gateway 10.1.0.1. To do this, we do the following:
Create a table with a single route:
  # ip route add default via 10.1.0.1 table 120 

Create a rule that sends the matching packets to that table:
  # ip rule add from 192.168.1.20 table 120 

As you can see, everything is simple.
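To inspect the result and later remove it, the commands below could be used. This is a sketch: undo_simple_example is a hypothetical helper that only prints the commands, so nothing changes until its output is run as root.

```shell
# Print the commands that inspect and then undo the example above.
# Arguments: source address, table number.
undo_simple_example() {
    src=$1 table=$2
    echo "ip rule show"
    echo "ip route show table $table"
    echo "ip rule del from $src table $table"
    echo "ip route flush table $table"
}

undo_simple_example 192.168.1.20 120
```

The first two commands only display state; the last two delete the rule and empty the table.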

Server availability through several uplinks


Now for a more realistic example. There are two uplinks to two providers, and the server must be reachable through both:

The default route goes through one of the providers, no matter which. In this case the web server is reachable only through that provider's network. Requests arriving through the other provider's network will reach the server, but the replies will go out through the default gateway, so the connection will usually fail.
The solution is very simple:
Define the tables:
  # ip route add default via 11.22.33.1 table 101
 # ip route add default via 55.66.77.1 table 102 

Define the rules:
  # ip rule add from 11.22.33.44 table 101
 # ip rule add from 55.66.77.88 table 102 

I think these lines need no further explanation by now. In the same way, the server can be made reachable through more than two uplinks.
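For more than two uplinks the same pair of commands simply repeats, so it can be generated from a list. In this sketch uplink_cmds is a hypothetical helper that prints the commands instead of running them, and the third uplink (198.51.100.10 via 198.51.100.1, table 103) is an invented example, not part of the setup described above.

```shell
# Emit the route and the rule for one uplink.
# Arguments: local address on that uplink, provider gateway, table number.
uplink_cmds() {
    echo "ip route add default via $2 table $3"
    echo "ip rule add from $1 table $3"
}

# One line per uplink: address, gateway, table.
while read -r addr gw table; do
    uplink_cmds "$addr" "$gw" "$table"
done <<EOF
11.22.33.44 11.22.33.1 101
55.66.77.88 55.66.77.1 102
198.51.100.10 198.51.100.1 103
EOF
```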

Balancing traffic between uplinks


This is done with one elegant command:
  # ip route replace default scope global \
   nexthop via 11.22.33.1 dev eth0 weight 1 \
   nexthop via 55.66.77.1 dev eth1 weight 1 

This command replaces the existing default route in the main table. The route is then chosen according to the weight of each gateway ( weight ). For example, with weights 7 and 3, about 70% of connections go through the first gateway and 30% through the second. One thing must be taken into account: the kernel caches routes, and the route to a host through a particular gateway stays in the cache for some time after it was last used. The route to a frequently used host may never expire, because each use refreshes the cache entry, so that host stays on the same gateway. If this is a problem, you can clear the cache manually from time to time with the ip route flush cache command. (This describes kernels that have an IPv4 routing cache; it was removed in Linux 3.6.)
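The arithmetic behind the weights is a plain proportion: each gateway's share is its weight divided by the sum of all weights. The sketch below computes that idealized long-run share only; weight_shares is a hypothetical helper, and the actual distribution on a live system depends on how the kernel picks a next hop for each flow.

```shell
# Print the idealized percentage of connections for each weight given.
weight_shares() {
    total=0
    for w in "$@"; do total=$(( total + w )); done
    for w in "$@"; do echo "weight $w: $(( 100 * w / total ))%"; done
}

weight_shares 7 3
```

With equal weights, as in the command above, each gateway gets an equal share.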

Using packet marking with iptables


Suppose we need the packets to port 80 to go only through 11.22.33.1. To do this, do the following:
  # iptables -t mangle -A OUTPUT -p tcp -m tcp --dport 80 -j MARK --set-mark 0x2

 # ip route add default via 11.22.33.1 dev eth0 table 102

 # ip rule add fwmark 0x2/0x2 lookup 102 

The first command marks all packets going to port 80. The second creates a routing table. The third sends all packets carrying that mark to the table.
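The fwmark condition is written as value/mask: a packet matches when its mark, with the mask applied, equals the value with the same mask applied. The sketch below models that comparison for the rule above; mark_matches is a hypothetical helper and the list of sample marks is arbitrary.

```shell
# Succeed if packet mark $1 matches a rule written as "fwmark $2/$3".
mark_matches() {
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

for mark in 0x0 0x2 0x3 0x4 0x6; do
    if mark_matches "$mark" 0x2 0x2; then
        echo "mark $mark -> table 102"
    else
        echo "mark $mark -> next rule"
    fi
done
```

Because the mask selects a single bit, marks such as 0x3 and 0x6 also match, which lets different bits of the mark carry independent flags.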
Again, everything is simple. Let's also look at the iptables CONNMARK module. It lets you track and mark all packets belonging to a particular connection. For example, you can mark packets by some attribute in the INPUT chain, and then have packets belonging to the same connections marked automatically in the OUTPUT chain. It is used like this:
  # iptables -t mangle -A INPUT -i eth0 -j CONNMARK --set-mark 0x2
 # iptables -t mangle -A INPUT -i eth1 -j CONNMARK --set-mark 0x4
 # iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark 

Connections whose packets arrive on eth0 are marked with 2, and those arriving on eth1 with 4 (lines 1 and 2). The rule on the third line looks up the connection each outgoing packet belongs to and restores on it the mark that was set for the incoming packets.
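For these marks to affect routing, each one needs a matching ip rule. A possible pairing is sketched below, assuming that tables 101 and 102 from the two-uplink example hold the default routes for eth0 and eth1 respectively; that mapping is an assumption of this sketch, and mark_rules is a hypothetical helper that only prints the commands.

```shell
# Emit one "ip rule" command per "mark table" pair read from stdin.
mark_rules() {
    while read -r mark table; do
        echo "ip rule add fwmark $mark/$mark lookup $table"
    done
}

# Marks as set by the CONNMARK rules above; table numbers are assumed.
mark_rules <<EOF
0x2 101
0x4 102
EOF
```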
I hope this material helps you appreciate the full flexibility of routing in Linux. Thanks for your attention :)

Source: https://habr.com/ru/post/108690/

