Suppose you have a service that accepts incoming connections, and the problem of load balancing and/or fault tolerance arises.
In general, the scheme looks like this:
clients ----> balancer ----> (backend 1, backend 2, ...)
There are plenty of ready-made balancers for specific needs. For example, nginx is an excellent balancer for web applications,
haproxy for TCP connections.
So why iptables? Perhaps because:
- You are not looking for easy ways.
- You are bored and want something new.
- Only iptables, only hardcore.
In practice it all depends on the specific situation, so let me just write down the recipe, and along the way you can decide for yourself where it might be useful.
Recipe
For balancing we will use the statistic and condition matches together with packet marking (MARK/CONNMARK).
Not all of these modules are available by default. In older versions of Linux, look for nth instead of statistic, and the condition module may have to be built by hand from its project site.
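A quick way to check whether the required matches are present on a given system is simply to ask iptables for their help text:
# Each command prints the match's options if the extension is present,
# and fails with an error if it cannot be loaded.
iptables -m statistic --help
iptables -m condition --help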
Now suppose your service is an SMTP server. Let its external IP address be 10.0.0.1.
The internal IP addresses of the backends will be 192.168.0.1 and 192.168.0.2.
Create the following rule set in the mangle table.
*mangle
:MAILMARK - [0:0]
-A MAILMARK -j CONNMARK --restore-mark
-A MAILMARK -m mark --mark 0x0 -m statistic --mode nth --every 2 -m condition ! --condition mail1up -j MARK --set-mark 1
-A MAILMARK -m mark --mark 0x0 -m condition ! --condition mail2up -j MARK --set-mark 2
-A MAILMARK -m mark --mark 0x0 -m condition ! --condition mail1up -j MARK --set-mark 1
-A MAILMARK -j CONNMARK --save-mark
-A PREROUTING -d 10.0.0.1 -p tcp --dport 25 -j MAILMARK
COMMIT
and in the nat table, the following:
*nat
-A PREROUTING -d 10.0.0.1 -p tcp --dport 25 -i bond0.204 -m mark --mark 1 -j DNAT --to-destination 192.168.0.1
-A PREROUTING -d 10.0.0.1 -p tcp --dport 25 -i bond0.204 -m mark --mark 2 -j DNAT --to-destination 192.168.0.2
COMMIT
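Both blocks can be saved to a file and loaded with iptables-restore (the file path below is just an example):
iptables-restore < /etc/iptables/balance.rules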
That's all; it remains to explain how it works.
First, every incoming packet for port 25 on the external address 10.0.0.1 passes through the mangle table and hits the marking rule
-A PREROUTING -d 10.0.0.1 -p tcp --dport 25 -j MAILMARK
A separate MAILMARK chain was created for the marking, and it works as follows.
If the connection has already been marked, the packet gets the same mark as the connection:
-A MAILMARK -j CONNMARK --restore-mark
This way all packets of a single connection end up on the same backend of the service.
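To verify that connections actually receive a mark, the conntrack tool from conntrack-tools (if installed) can list entries by mark:
conntrack -L --mark 1
conntrack -L --mark 2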
If the packet is not yet marked, it is marked by the following rule.
-A MAILMARK -m mark --mark 0x0 -m statistic --mode nth --every 2 -m condition ! --condition mail1up -j MARK --set-mark 1
-m mark --mark 0x0
matches only packets that have not been marked yet
-m statistic --mode nth --every 2
matches every second packet, i.e. we balance the load very simply, 50/50, but this is not the only possibility (the statistic match also has a random mode; see the sketch after this breakdown)
-m condition ! --condition mail1up
matches only while the backend is alive
-j MARK --set-mark 1
marks the packet with the value 1
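For instance, a probabilistic 50/50 split instead of the deterministic nth counter would look like this (a sketch using the same chain and condition names):
-A MAILMARK -m mark --mark 0x0 -m statistic --mode random --probability 0.5 -m condition ! --condition mail1up -j MARK --set-mark 1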
If the packet remains unmarked, it gets mark 2, provided that the second backend is alive:
-A MAILMARK -m mark --mark 0x0 -m condition ! --condition mail2up -j MARK --set-mark 2
If the packet is not marked even now, the second backend is apparently dead, and mark 1 was not set earlier either because the 50/50 statistic skipped this packet or because the first backend was down as well. In case the first backend is in fact alive, the packet gets mark 1:
-A MAILMARK -m mark --mark 0x0 -m condition ! --condition mail1up -j MARK --set-mark 1
The last rule saves the packet's mark onto the current connection:
-A MAILMARK -j CONNMARK --save-mark
Once the packet is marked, it will be DNATed to the selected backend by the rules added to the nat table.
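A simple way to see how traffic is actually being split is to look at the per-rule packet counters:
iptables -t mangle -L MAILMARK -n -v
iptables -t nat -L PREROUTING -n -v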
One more point needs covering: how will iptables know that a backend is alive?
It won't. You have to take care of that yourself by writing 0 or 1 to the files /proc/net/nf_condition/mail1up and /proc/net/nf_condition/mail2up, depending on whether the backend is alive or dead.
By default, when the rules are loaded, these files contain 0, and the rules given above are written on the assumption that 0 means the backend is alive.
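This also gives you a manual switch: a backend can be taken out of rotation by hand, for example for maintenance:
echo 1 > /proc/net/nf_condition/mail1up    # take backend 1 out of rotation
echo 0 > /proc/net/nf_condition/mail1up    # and bring it back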
For example, to check the SMTP server I use a bash script along the following lines.
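A minimal sketch of such a check, assuming nc and timeout are available; the function name is arbitrary, while the addresses and condition names are the ones used above:
#!/bin/bash
# Poll both SMTP backends and update the corresponding condition files.
# In this setup 0 means "alive" and 1 means "dead".

check_smtp() {
    local host=$1 cond=$2
    # A backend counts as alive if it sends an SMTP greeting (220)
    # on port 25 within 5 seconds.
    if echo QUIT | timeout 5 nc "$host" 25 2>/dev/null | grep -q '^220'; then
        echo 0 > "/proc/net/nf_condition/$cond"
    else
        echo 1 > "/proc/net/nf_condition/$cond"
    fi
}

check_smtp 192.168.0.1 mail1up
check_smtp 192.168.0.2 mail2up
Run it from cron every minute or so, so that a dead backend drops out of rotation quickly.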
Conclusion
On the one hand, there are probably no compelling reasons to do this; on the other hand, a few come to mind:
- iptables runs at the kernel level and is likely to be faster than any userspace balancer.
- iptables is more dependable. It is not a given, for example, that haproxy will start successfully when a failover happens between balancers, because it needs far more preconditions to be met: the IP address 10.0.0.1 must be taken over first, and only then can haproxy bind to port 25; or someone may have broken its configuration file in the meantime, and it will not start at all.