Below is a translation of an article about a problem with UDP in network applications. The translator took the liberty of changing the examples: the source text uses different network addresses and Ruby code, while the translation uses a simple Perl script. The essence of the problem and of the solution is unchanged.
In addition, my comments have been added in some places (in brackets, in italics).
The attention-grabbing picture is taken from the wonderful book "learnyousomeerlang.com".
Lightweight protocols, heavy work
Sometimes it starts to seem that connectionless protocols are not worth all the trouble they cause.
As an example, let us look at what happens to the reply when the UDP datagram carrying the initial request is sent to an additional IP address on an interface (an alias, or secondary IP).
There is an eth1 interface:
$ ip a add 192.168.1.235/24 dev eth1 && ip a ls dev eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:30:84:9e:95:60 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.47/24 brd 192.168.1.255 scope global eth1
    inet 192.168.1.235/24 scope global secondary eth1
    inet6 fe80::230:84ff:fe9e:9560/64 scope link
       valid_lft forever preferred_lft forever
What does code for receiving a packet over UDP usually look like? An echo server might look very much like the one under the cut:
This is a fairly simple Perl script that shows who a UDP packet came from, prints its contents, and sends the packet back to the sender. It could hardly be simpler. Now let's start our server:
$ ./echo_server.pl
Waiting for data...
Let's see what it is listening on:
$ netstat -unpl | grep perl
udp        0      0 0.0.0.0:5000        0.0.0.0:*        9509/perl
After that, we connect from a remote machine to our server using its primary IP:
-bash-3.2$ nc -u 192.168.1.47 5000
test1
echo: test1
test2
echo: test2
Here is how it looks in tcpdump on our machine (or at least how it should look):
-bash-3.2$ tcpdump -i eth1 -nn port 5000
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:41:00.517186 IP 192.168.3.11.44199 > 192.168.1.47.5000: UDP, length 6
17:41:00.517351 IP 192.168.1.47.5000 > 192.168.3.11.44199: UDP, length 12
17:41:02.307634 IP 192.168.3.11.44199 > 192.168.1.47.5000: UDP, length 6
17:41:02.307773 IP 192.168.1.47.5000 > 192.168.3.11.44199: UDP, length 12
Just fantastic: I send a packet and get a packet back. In netcat we get back whatever we type (with a funny effect if you press the arrow keys).
And now the same to the secondary address on the same interface:
-bash-3.2$ nc -u 192.168.1.235 5000
test1
test2
Look how crazy it appears in tcpdump this time:
-bash-3.2$ tcpdump -i eth1 -nn port 5000
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:48:32.467167 IP 192.168.3.11.34509 > 192.168.1.235.5000: UDP, length 6
17:48:32.467292 IP 192.168.1.47.5000 > 192.168.3.11.34509: UDP, length 12
17:48:33.667182 IP 192.168.3.11.34509 > 192.168.1.235.5000: UDP, length 6
17:48:33.667332 IP 192.168.1.47.5000 > 192.168.3.11.34509: UDP, length 12
The requests go to 192.168.1.235, but the replies leave from 192.168.1.47. And of course, no self-respecting network stack is going to accept packets from a completely unfamiliar address, even if the ports are correct. So the client never receives the replies and concludes that the server is simply dropping its requests.
At first glance, what is happening looks like complete nonsense. In fact, it is a common flaw of sessionless protocols such as UDP. You see, our socket listens on any address (an empty LocalAddr parameter at socket creation is passed to the system as the address "0.0.0.0", meaning "any available", which makes the socket listen on all available addresses. And no, I don't know why either; it is not particularly intuitive behaviour). When our application receives a packet with $socket->recv(), we get no information about the specific address the packet was sent to. We only know that the operating system decided the packet was for us (there's encapsulation for you). All we know is where the packet came from. And because the kernel keeps no connection information for the socket (the kernel is being logical: you asked for connectionless, you get connectionless), when the time comes to send a packet back, all we can do is say where to send it. (In Perl this is done implicitly: the address and port of the datagram's sender are associated with the $socket object, so you do not need to specify them in the send call.)
But the real brain-twister begins when a source address has to be put into the response datagram. Once again: the kernel stores no information about the sender or the recipient, because we are working without connections. And since we listen on "any" address, the operating system believes it has a blank cheque to send the packet from whichever address it likes. On Linux, it seems, the primary address of the interface the packet leaves through is chosen.
(Actually, the address is determined according to RFC 1122, section 3.3.4.2 "Multihoming Requirements", using the routing table. Translator's note.) So in the common case of one address per interface, everything works. But as soon as we get to less common setups, the nuances start to appear.
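To see which source address the kernel would pick for a given destination, one option is to connect() an unbound UDP socket and ask for its local name. A minimal Python sketch follows; the function name `source_address_for` and the default port 5000 are illustrative assumptions.

```python
import socket


def source_address_for(destination: str, port: int = 5000) -> str:
    """Return the local address the kernel would use to reach `destination`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends nothing; it only records the
        # peer and makes the kernel choose a local address by its routing rules.
        sock.connect((destination, port))
        return sock.getsockname()[0]
    finally:
        sock.close()
```

On the server from this article, `source_address_for("192.168.3.11")` would presumably return 192.168.1.47, the address seen in the tcpdump replies, although that depends on the machine's routing table.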
The solution is to create sockets that listen on specific addresses and to send the replies from those same sockets: the kernel will then know which address you want to send from, and everything will be fine. Looks simple enough, huh? And of course every sane network application already does this, huh? So obviously the UDP implementation in Ruby is just crap
(Ruby is what the original article used. Translator's note ;). That is what I thought at first, and I do not blame you if you thought the same. But before we cross the Rubicon and go to war with the authors of Ruby's UDPSocket, let's run a small experiment with other widely used applications. Take SNMPd, for example. The daemon from the net-snmpd package in Ubuntu suffers from the same problem as our test application above. So this hardly looks like a brand-new rake that someone has only just stepped on and that is now being covered with a pile of patches.
So in general, everyone suffers from the same "disease", where "everyone" means "some UDP servers". There is a certain amount of software that does not have this problem with interface aliases. Bind comes to mind immediately, and NTPd works fine as long as it is started after all the interfaces have been configured. What is the difference? These services are somewhat "smarter" and bind to every address in the system separately. Taking Bind as an example:
$ netstat -lun |grep :53
udp        0      0 192.168.1.47:53         0.0.0.0:*
udp        0      0 192.168.1.47:53         0.0.0.0:*
udp        0      0 127.0.0.1:53            0.0.0.0:*
This is very cool and it solves the problem. The exception is when you add an extra alias after the daemon has started: Bind will not pick up the new address, and you will have to restart the server. It also complicates the code somewhat, since the program has to juggle a bunch of sockets (for example, using select() instead of simply blocking on a receive). Nobody likes extra complexity, but that part is manageable. The real problem is the rule "do not add addresses after the daemon has started". Having to check whether IP addresses have been added to the system, and to restart the service after every addition, becomes a genuine nuisance.
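The approach of one socket per address plus select() could be sketched in Python roughly as follows. The helper names `bind_each` and `serve_ready` are invented for this illustration, and the list of addresses is assumed to be supplied by the caller; this is not how Bind itself is written.

```python
import select
import socket


def bind_each(addresses, port):
    """Bind one UDP socket per local address instead of one wildcard socket."""
    socks = []
    for addr in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind((addr, port))
        socks.append(sock)
    return socks


def serve_ready(socks, timeout=1.0):
    """Echo one datagram on every readable socket; return how many were handled."""
    readable, _, _ = select.select(socks, [], [], timeout)
    for sock in readable:
        data, sender = sock.recvfrom(4096)
        # Replying through the socket that received the datagram makes the
        # reply leave from the address that socket is bound to.
        sock.sendto(b"echo: " + data, sender)
    return len(readable)
```

For the setup in this article, the call would be something like `bind_each(["192.168.1.47", "192.168.1.235"], 5000)`, followed by `serve_ready(socks)` in a loop.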
However, there is a workaround of sorts for this problem, and this is where ntpd comes in. It listens on the following sockets:
$ netstat -nlup | grep 123
udp        0      0 192.168.1.235:123       0.0.0.0:*
udp        0      0 192.168.1.47:123        0.0.0.0:*
udp        0      0 127.0.0.3:123           0.0.0.0:*
udp        0      0 127.0.0.1:123           0.0.0.0:*
udp        0      0 0.0.0.0:123             0.0.0.0:*
udp6       0      0 fe80::230:84ff:fe9e:123 :::*
udp6       0      0 ::1:123
NTPd listens on each address individually and, in addition, on the "any" address. I do not know exactly why this is needed. If you simply listen on each address separately, everything is fine, as with Bind. But if you add another address to the interface after ntpd has started, the same problem appears as with the UDP echo server. So I do not think that listening on "any" brings any benefit. It does, however, make ntpd behave somewhat differently from Bind: when you send a packet to an address added after Bind started, Bind simply ignores you (it has no socket that would hear your request), whereas ntpd tries to send a response and suffers from the wrong source address in its replies.
(But you can change the primary addresses on interfaces and create new interfaces. Translator's note.)
At the moment, the best solution seems to be to follow the path of Bind and ntpd and listen on every address individually, plus the "trick" from ntpd: listen on 0.0.0.0 as well. Then, if a packet arrives on the 0.0.0.0 socket, I need to rescan the addresses available in the system and bind to the new ones. This should solve the problem.
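The rescan step of this plan might look roughly like the Python sketch below. Everything here is an assumption for illustration: `rebind_missing` is an invented name, `list_addresses` is a hypothetical callable that returns the machine's current IPv4 addresses (the standard library has no portable way to enumerate them), and SO_REUSEADDR is set on the assumption that this is what lets a per-address socket share a port with the 0.0.0.0 socket on Linux.

```python
import socket


def rebind_missing(bound, list_addresses, port):
    """Bind a UDP socket for every local address that does not have one yet.

    bound          -- dict mapping an address string to its bound socket
    list_addresses -- hypothetical callable returning current local addresses
    Returns the list of addresses that were newly bound.
    """
    added = []
    for addr in list_addresses():
        if addr in bound:
            continue  # this address already has its own socket
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Assumption: needed so this socket can coexist with a wildcard
        # socket on the same port (the wildcard socket must set it too).
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((addr, port))
        bound[addr] = sock
        added.append(addr)
    return added
```

The idea would be to call `rebind_missing` whenever a datagram shows up on the 0.0.0.0 socket, since that means it was sent to an address that has no dedicated socket yet.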
All that remains is to make it work (and to solve the pile of problems that will surely come up along the way). Wish me luck. The cries of pain and torment you hear, wherever you are, are certainly mine.
UPD: an interesting explanation from Quasar_ru appeared in the comments. Still, the UDP implementation in scripting languages is debatable: in pure C you can write a client application that is able to receive a response from the server from a different address. The benefit of such an implementation is questionable, but it is possible.