Previously, when we needed to spread load across several servers, we simply added several A records with the same name to the DNS zone, and everything worked. Client requests were distributed roughly evenly. This method of balancing was especially useful for serving static content.
Recently a client came to us with a problem:
During peak hours, Flash files started loading badly: downloads took several minutes.
The investigation revealed that one of the static-content servers was loaded unevenly: it was pushing many times more traffic to the network than all the others. From time to time the load shifted from one server to another.
The DNS zone contained something like:
cdn.example.com IN A 192.168.10.1
cdn.example.com IN A 192.168.12.1
cdn.example.com IN A 192.168.15.1
cdn.example.com IN A 192.168.16.1
cdn.example.com IN A 192.168.11.1
cdn.example.com IN A 192.168.19.1
Previously, clients received these addresses from DNS servers in round-robin order, but the search for the cause led to a completely unexpected result.
Requests to the DNS server at 8.8.8.8 invariably returned the same address first.
No round-robin at all. The returned address might change when the TTL expired, or it might not.
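One way to quantify "the same address every time" is to record the answers a resolver gives and count how often each address comes first, since many clients simply connect to the first address in the list. The sketch below is only an illustration: the function name first_address_share and the sample answers are hypothetical, not measurements from the incident.

```python
from collections import Counter


def first_address_share(responses):
    """Return {address: share of responses in which it was listed first}."""
    firsts = Counter(r[0] for r in responses if r)
    total = sum(firsts.values())
    return {addr: n / total for addr, n in firsts.items()}


# Hypothetical answers from a resolver that rotates the record set.
rotating = [
    ["192.168.10.1", "192.168.12.1", "192.168.15.1"],
    ["192.168.12.1", "192.168.15.1", "192.168.10.1"],
    ["192.168.15.1", "192.168.10.1", "192.168.12.1"],
]

# Hypothetical answers from a resolver that always returns the same order.
fixed = [["192.168.10.1", "192.168.12.1", "192.168.15.1"]] * 3

print(first_address_share(rotating))
print(first_address_share(fixed))
```

With a rotating resolver the shares come out roughly equal; with a fixed order one address takes the whole share, which matches the overloaded server seen on the traffic graphs.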
Technical support at the "corporation of good" responded in the style of "we know better than you what you need."
They sent links claiming that DNS round-robin was broken by getaddrinfo for the sake of IPv6 support:
homepage.ntlworld.com/jonathan.deboynepollard/FGA/dns-round-robin-is-useless.html
www.tenereillo.com/GSLBPageOfShame.htm
daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs
RRDNS has never worked well, and it will only get worse. And when IPv6 arrives, it will be completely broken...
That is the opinion of the corporation of good, and no load graphs could change it.
Further investigation showed that the problem lay not only with Google but also with Hetzner, where the zone is hosted.
Problem number one:
Each Google server passes on the answer exactly as it received it from the DNS servers that serve the zone, without changing anything. If each Google server receives its own ordering, successive requests make it look as if Google returns the addresses in random order, although in reality each answer simply comes from a randomly selected server.
Problem number two:
Hetzner, which hosts the problematic DNS zone, began returning the list of addresses for the host in a fixed order.
As a result, every Google server ended up with the same sequence of addresses.
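The interaction of the two problems can be reproduced with a toy simulation. It is a sketch under simplifying assumptions: the pool size, the three-address list, and the names fill_caches and first_addresses_seen are all made up for illustration, and real resolvers are far more complex.

```python
import random

ADDRESSES = ["192.168.10.1", "192.168.12.1", "192.168.15.1"]


def fill_caches(n_caches, rotate):
    """Each cache stores the answer exactly as the authoritative server sent it.

    With rotate=True the authoritative server shifts the list for every query,
    so different caches hold different orderings. With rotate=False every
    cache holds the same ordering.
    """
    caches = []
    for i in range(n_caches):
        shift = i % len(ADDRESSES) if rotate else 0
        caches.append(ADDRESSES[shift:] + ADDRESSES[:shift])
    return caches


def first_addresses_seen(caches, n_queries, rng):
    """Clients hit a random cache and use the first address of its stored answer."""
    return {rng.choice(caches)[0] for _ in range(n_queries)}


rng = random.Random(0)
print(first_addresses_seen(fill_caches(6, rotate=True), 200, rng))
print(first_addresses_seen(fill_caches(6, rotate=False), 200, rng))
```

While the authoritative server rotates, clients see all the addresses even though no cache shuffles anything. Once it stops rotating, every client gets the same first address until the cached answers are refreshed.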
The reply from Hetzner technical support was no more encouraging:
This function is not enabled on our DNS resolvers. If you want it, use your own servers.
Of course, you can turn to a paid CDN provider, or ask the programmers to generate content links that point to specific servers (picking the server in the link at random for each request in order to spread the load).
But admins are usually expected to solve the problem quickly, so that everything works right now.
A better solution can be found afterwards.
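The link-generation option could look roughly like the following. This is a minimal sketch, not the client's code: the host names, the weights, and the function name content_url are assumptions made for the example.

```python
import random

# Hypothetical per-server host names with relative weights (server capacity).
SERVERS = {
    "cdn1.example.com": 15,
    "cdn2.example.com": 15,
    "cdn3.example.com": 15,
}


def content_url(path, rng=random):
    """Build a link to a randomly chosen server, weighted by capacity."""
    host = rng.choices(list(SERVERS), weights=list(SERVERS.values()), k=1)[0]
    return f"http://{host}/{path.lstrip('/')}"


print(content_url("/img/logo.png"))
```

The application calls content_url every time it renders a link, so the load is spread without relying on DNS at all. The price is a code change in every place that produces such links.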
We return a redirect to a specific server using nginx:
We use the split_clients directive, set the percentages according to the capacity of our servers, and deploy this config on each of them.
Naturally, in the line server_name cdn1.example.com; we specify the unique name of each server.
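http {
    # Hash the client address (plus a fixed salt) into weighted buckets.
    split_clients "${remote_addr}AAA" $variant {
        15% 1;
        15% 2;
        15% 3;
        15% 4;
        15% 5;
        15% 6;
        *   7;
    }

    # Entry point: redirect each client to the server chosen for it.
    server {
        listen 80;
        server_name cdn.example.com;
        # $request_uri already begins with "/", so no extra slash is added.
        return 302 http://cdn$variant.example.com$request_uri;
    }

    # Content server; put this host's own name here.
    server {
        listen 80;
        server_name cdn1.example.com;
        location / {
            root /srv/www/cdn.example.com/htdocs;
        }
    }
}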
As a result, nginx redirects each user to a server chosen by hashing the client's IP address.
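The idea behind this kind of hash-based splitting can be sketched in a few lines. Note the assumptions: nginx documents that split_clients hashes the string with MurmurHash2, while the sketch below substitutes MD5 purely for illustration, so it will not reproduce the buckets nginx assigns. The names BUCKETS and pick_variant are hypothetical.

```python
import hashlib

# Same percentages as in the config; None plays the role of "*" (the remainder).
BUCKETS = [(15, "1"), (15, "2"), (15, "3"), (15, "4"), (15, "5"), (15, "6"), (None, "7")]


def pick_variant(remote_addr, salt="AAA"):
    """Map a client address to a variant; the same address always maps the same way."""
    digest = hashlib.md5((remote_addr + salt).encode()).digest()
    # Turn the first 4 bytes of the hash into a point in the range [0, 100).
    point = int.from_bytes(digest[:4], "big") / 2**32 * 100
    upper = 0
    for percent, variant in BUCKETS:
        if percent is None:  # the "*" bucket takes whatever is left
            return variant
        upper += percent
        if point < upper:
            return variant


print(pick_variant("203.0.113.7"))
```

Because the choice depends only on the client address and the salt, a given client keeps landing on the same server, and over many clients the shares approach the configured percentages.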
Here is the load distribution.
P.S. We moved the zone from Hetzner to Yandex.