There is a great article explaining how this is done on Cisco. But we do not want to pay $100,500 just to have "Cisco Systems" stamped on the router's case.
Description of the problem
So, the essence of the problem: there are two NATs through two different providers, and a local network containing a server that must be publicly reachable through both NATs. The providers have different priorities: the first is preferred, the second comes after it.
If a packet enters through the first provider, it is DNATed to our server and processed; the reply packet leaves through the first provider and returns to where the original packet came from. Good.
If a packet enters through the second provider, it is DNATed to our server and processed; the reply packet leaves through the first provider... Why? Because in Linux routing happens first and SNAT afterwards. When the reply is routed, it is assigned the next hop of the default route, i.e. the first provider's gateway. Then connection tracking steps in: conntrack notices that this packet is a reply to another one and replaces the source address with the address the second provider gave us. The packet then leaves through the first provider's interface towards its gateway. As a rule, a provider drops packets whose source address is not from its own subnet. Bad.
How it should be
But is it possible to change the order somehow: first work out which provider is needed, and only then choose the next hop?
Yes. For this, Linux has an interface to its connection tracker: the connmark match and the CONNMARK target. The first is a match condition: a rule handles only those packets whose connection mark (ctmark) has a given value. The second is the tool for managing that mark.
Connection marks differ from ordinary packet marks (MARK) in that they are additionally stored and maintained by the conntrack module. If we assign a connection mark to a packet, the same mark will later be found on every packet belonging to that connection. Connection tracking and mark lookup happen after the raw table (PREROUTING or OUTPUT chain) and before the same chains of the mangle table; the connection mark is stored in conntrack in the POSTROUTING chain, after the nat table has been processed.
From the very beginning, even before the DNAT to the internal server takes place, we can assign a mark to the connection, for example depending on the interface through which the packet arrived at the router. After that we will see all subsequent packets and replies with the same mark, i.e. at any moment we know for any reply which interface it must leave through.
Linux has the RPDB (Routing Policy DataBase), the counterpart of what Cisco calls a route-map. We create several routing tables, and policies decide which of them handles a given packet. The selection criteria can be: the interface through which the packet arrived; the netfilter mark (nfmark); the source address; the destination address; and others.
The netfilter mark (nfmark) is the right criterion for this case. It is not the same thing as the connection mark (ctmark), but what prevents us from setting the nfmark from the ctmark? Nothing: the CONNMARK target has a dedicated option for exactly this (--restore-mark).
Configuration
To begin with, we configure the RPDB. I will not give the tables names: that is beyond the scope of this article, and in this case numbers are clearer anyway.
The policy routing rules are added when the corresponding interface comes up and removed when it goes down.
ip rule add fwmark 0x1/0x3 lookup 201
ip rule add from {ip-address associated with ppp1} lookup 201
ip rule add fwmark 0x2/0x3 lookup 202
ip rule add from {ip-address associated with ppp2} lookup 202
The ip rule add fwmark lines are exactly the rules we need: if the mark is 1, route via table 201; if it is 2, via table 202. The other two rules are ordinary split access (as described in LARTC). If none of the criteria match, routing follows the default table (this is the case for the initial packets of connections initiated from the local network or on the router itself).
To the routing tables we add the following routes (again dynamically, when the interfaces come up):
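Since these rules have to appear and disappear together with the interfaces, it can help to generate them from one place. The following is only a sketch: the function names uplink_index and rule_commands are hypothetical, and the address in the dry run is a documentation placeholder, not a value from this setup.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): map an uplink interface to its
# index, which doubles as the fwmark value and the last digit of the table.
uplink_index() {
    case "$1" in
        ppp1) echo 1 ;;
        ppp2) echo 2 ;;
        *)    return 1 ;;
    esac
}

# Print the two "ip rule" commands for one uplink.
# $1 = add | del, $2 = interface name, $3 = local address on that interface
rule_commands() {
    n=$(uplink_index "$2") || return 1
    echo "ip rule $1 fwmark 0x$n/0x3 lookup 20$n"
    echo "ip rule $1 from $3 lookup 20$n"
}

# Dry run with a placeholder address (assumption): only prints the commands.
rule_commands add ppp1 198.51.100.7
```

pppd calls its ip-up and ip-down scripts with the interface name as the first argument and the local address as the fourth, so a hook could run something like rule_commands add "$1" "$4" | sh on the way up and the same with del on the way down.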
ip route add default dev ppp2 table 202
ip route add default dev ppp1 table 201
ip route add default dev ppp2 metric 2000
ip route add default dev ppp1 metric 1000
(if the uplinks are not ppp, the routes become: default via {nexthopN} table 20N or metric 2000)
This, too, follows LARTC. The metric sets the priority of the provider (the route with the lowest metric is chosen), but marked packets are handled by tables 201 and 202 respectively, each of which contains a single provider.
It remains to start marking packets.
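Before touching netfilter it is worth checking that the policy routing part behaves as intended. A minimal sketch using standard iproute2 commands; the destination address is a documentation placeholder, not something from this setup.

```shell
# Policy rules in priority order; the fwmark and from rules should be listed.
ip rule show

# Each per-provider table should hold a single default route.
ip route show table 201
ip route show table 202

# Ask the kernel which route a marked packet would take.
# 203.0.113.10 is a placeholder destination (assumption); with mark 0x2
# the answer should name ppp2, and without a mark the lower-metric ppp1.
ip route get 203.0.113.10 mark 0x2
ip route get 203.0.113.10
```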
# all packets whose connection already carries a mark (the uplink identifier)
iptables -t mangle -N out-marking
iptables -t mangle -A PREROUTING -m connmark ! --mark 0x0/0x3 -j out-marking
# if the packet entered through an internal interface, copy the connection mark to the packet mark
iptables -t mangle -A out-marking -i eth0 -j CONNMARK --restore-mark --mask 0x3
iptables -t mangle -A out-marking -i eth2 -j CONNMARK --restore-mark --mask 0x3
# all new connections
iptables -t mangle -N in-marking
iptables -t mangle -A PREROUTING -m conntrack --ctstate NEW -j in-marking
# packet entered via ppp1: set mark 1 so that all replies are routed via table 201
iptables -t mangle -A in-marking -i ppp1 -j CONNMARK --set-xmark 0x1/0x3
# packet entered via ppp2: set mark 2 so that all replies are routed via table 202
iptables -t mangle -A in-marking -i ppp2 -j CONNMARK --set-xmark 0x2/0x3
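Once the rules are loaded, one way to confirm that marks are actually being set is to look at the rule counters and at the connection tracking table. This is a sketch; the conntrack command comes from the conntrack-tools package, which is assumed to be installed.

```shell
# Per-rule packet counters in the mangle table (run as root).
iptables -t mangle -L PREROUTING -n -v
iptables -t mangle -L in-marking -n -v
iptables -t mangle -L out-marking -n -v

# Tracked connections whose mark, ANDed with 0x3, equals 2,
# i.e. connections that arrived via ppp2.
conntrack -L --mark 0x2/0x3
```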
Three points deserve attention.
First, all processing takes place in the PREROUTING chain of the mangle table. This is because we want to mark packets (hence mangle) before these marks are used during routing (hence PREROUTING).
Second, the connection mark is set only for the first packet of a connection (--ctstate NEW); the others already carry it. If a new packet gets no connection mark, it needs no special handling at all: it will follow the default table. This is the case for all connections initiated from the local network.
Third, two bits are allocated to the "provider number", i.e. we can handle 3 uplinks this way (0 always stands for "provider undefined"; such packets use the default route). That is why the marks are written with the mask /0x3 everywhere: our commands do not touch any other bits of the marks, and those bits can be used for other purposes. In my setup, for example, the other bits are used for traffic shaping.
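The effect of the /0x3 mask can be checked with plain shell arithmetic. The sketch below mimics the formulas given in iptables-extensions(8) for --set-xmark and --restore-mark; the function names and the sample mark values are hypothetical.

```shell
#!/bin/sh
# CONNMARK --set-xmark value/mask: clear the masked bits, then XOR in the value.
set_xmark() {      # $1 = current ctmark, $2 = value, $3 = mask
    printf '0x%x\n' $(( ($1 & ~$3) ^ $2 ))
}

# CONNMARK --restore-mark --mask m: copy only the masked ctmark bits to nfmark.
restore_mark() {   # $1 = current nfmark, $2 = ctmark, $3 = mask
    printf '0x%x\n' $(( ($1 & ~$3) ^ ($2 & $3) ))
}

# Hypothetical values: the bits in 0x14 are already used for something else.
set_xmark 0x14 0x2 0x3       # prints 0x16: uplink 2 recorded, 0x14 untouched
restore_mark 0x10 0x16 0x3   # prints 0x12: only the two uplink bits are copied
```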
It can also be done differently
We assign a second address to the internal server. In the RPDB we send traffic from this address out through the second provider, and packets that arrive through the second provider are DNATed to this second address.
Simpler? In a sense, yes. But you will have to maintain N addresses on the internal server (one per provider) and N DNAT rules.
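For two providers this alternative might look roughly as follows. It is a sketch only: the server addresses, the port, and the server's interface name are placeholders chosen for illustration, and table 202 is assumed to contain the second provider's default route as set up earlier.

```shell
# All addresses and the port are placeholders (assumptions), not article values.
SRV1=192.168.0.10   # server address published via the first provider
SRV2=192.168.0.11   # extra server address published via the second provider
PORT=80             # published TCP service

# On the server: add the second address (interface name eth0 is assumed).
#   ip addr add "$SRV2/24" dev eth0

# On the router: replies sourced from SRV2 use the second provider's table.
ip rule add from "$SRV2" lookup 202

# One DNAT rule per provider, each pointing at its own server address.
iptables -t nat -A PREROUTING -i ppp1 -p tcp --dport "$PORT" -j DNAT --to-destination "$SRV1"
iptables -t nat -A PREROUTING -i ppp2 -p tcp --dport "$PORT" -j DNAT --to-destination "$SRV2"
```

Here the source address of the reply, not a connection mark, tells the RPDB which provider the reply belongs to.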