UPD . Due to the fact that many commentators do not read completely, I will write here a brief summary of the problem: for .RU, Glue Records is not cleared when the domain delegation changes. At a minimum, for domains managed via Ru-Center and Reg.ru. The article itself is a “life story” about what problems such a “feature” of the .RU zone can cause, as they were diagnosed and resolved.
“One friend of mine” (c) told a story about his adventures with DNS.
Prehistory
Imagine that 2016 is ending, there are several hundred domains on your DNS servers, 5 powerful DNS servers were launched a long time ago in different data centers, you want to finally get rid of the old DNS servers, which recently have just been hiring money for rent and take up space in the rack. However, immediately after trying to turn them off, support phones turn red hot, so old servers have to be quickly turned on and thoughtfully understood.
')
Chapter 1
Once again, we check the DNS zones, make sure that there are no old addresses anywhere and a very long time. We check from different regions with requests for each of our NSs. Everywhere everything goes right. Take tcpdump and look at requests for port 53 on older DNS. It turns out that there are a lot of requests for them. And this despite the fact that the IP addresses of these servers for many months nowhere appear!
We look at tcpdump on new DNS, and there are noticeably less requests than on old ones. Just miracles!
After analyzing the requests to the old DNS, we take top10 client addresses and find out that this is BIND9 (responds to dig + short version.bind txt chaos @ server) of large domestic providers (Rostelecom, TTK, etc.).
Chapter 2
We kindly ask several colleagues to cut client domains from the home Internet through the provider's DNS server (
dig any clientsite.ru. @ns1.provider.ru
) and get a response with incorrect IP addresses and a huge TTL 345600 (4 days !!!) in ADDITIONAL SECTION. Moreover, if you pass through the name to which the domain is delegated, the provider's DNS server begins to respond with the correct IP and correct TTL, but the “happiness” ends together with the expiration of this TTL. The situation is reproduced in 6 cases out of 10, where the provider uses BIND9 as a recursor. We assemble a test site with the latest version of BIND9 on debian and ubuntu, the situation is reproduced, restarted, etc. do not help.
Chapter 3
Suddenly the idea comes to request the root servers of the .RU zone (currently it is a.dns.ripn.net., B.dns.ripn.net., Etc., the list can be obtained with the command
dig ns ru.
). And we get the same answer with wrong IP addresses, TTL 345600 in ADDITIONAL SECTION. The problem managed to localize, now you need to understand how it originated.
Chapter 4
We recollect the theory about DNS, we read just in case
an article in Wikipedia about the gluing of zones (Glue Records) . Since Glue Records has not been used for a long time (5-6 years), we arrive at the wrong conclusion: one of the clients managed to specify the old IP addresses of the DNS when delegating a domain. I had to write scripts that requested whois for analysis on the list of client domains.
Unfortunately, because of this incorrect assumption, time was lost, but other errors were found. For example, clients love domain mail from Yandex and manage to delegate a domain to our NS and NS to Yandex at the same time.
Chapter 5
The next morning, after analyzing once again collected whois data, we find nothing. It remains only to contact RU-Center with diagnostics and our assumptions. Despite the long-term cooperation with this company, we practically did not have any calls to technical support, mostly questions arose regarding the accounting part. Therefore we send the letter to support @ and we wait for the answer. After a couple of hours of waiting for an answer to email, our employees can no longer tolerate and are trying to make phone calls. Having listened to a sufficient amount of music, we fall on a living person who finds our letter and reports the ticket number. Hooray! However, we do not receive a response on the application. A few more calls at intervals of 1-2 hours do not bring the result closer. Fortunately, at the end of the working day we get something like this:
The existence of glue records means that previously, subordinate DNS servers with these IP addresses were specified for the domain. To change these records, you must delegate the domain to the same child DNS servers with the indication of new IP addresses.
Very surprised. The last time glue records were many years ago. Is it possible? We are raising the mail archives, we find letters there for 2011, when the domain has been delegated to other ns and again we are surprised. It turns out that for so many years it worked incorrectly and no one noticed anything!
Chapter 6
In the meantime, the last hours of renting old glands are coming, and we understand that with TTL in 4 days in provider caches, we will need to pay another month of rent, and this is not exactly what was planned.
Please support Ru-Center to remove glue records. We don't need them at all, why should we bind to new IP addresses ?! Moreover, in our DNS zone for each NS, several IP addresses + IPv6 are indicated. And we get an answer that puzzles us. Something like this:
the application has been submitted for execution to the system administration department, the result will be reported
Those. we conclude that the registrar doesn’t have a ready mechanism for this operation (we thought it was a button in the interface of the support employee), which means that either no one encountered a similar problem (which is unlikely), or simply changed the IP addresses into glue records and lived farther. But this is wrong, because it is logical that if the IP addresses for NS are deleted from the registrar, then glue records should also be deleted. We assume that these are old glitches, the beginnings of this decade have long been resolved, and probably we just had no luck.
Chapter 7
By noon of the next day we are cautiously interested in email, is there an answer to the request? After we pass several times to the quest with listening to music on the phone, we dial into living people, but other than "your application has been submitted to the system administration department" we cannot achieve anything. It takes almost a day, we have no choice but to follow the advice with the indication of new glue records, which we are doing. After a few hours, the changes take effect (as you remember, the .RU is not updated very often) and we delete these glue records (we don’t need them and even get in the way). We are waiting for the next .RU update, however, the root .RU servers continue to reply with these entries, but with new IP addresses. Is there a glitch left? !!!
Chapter 8
Since there is no answer on the application, the next day we again complete quests with telephone music. I would especially like the level when they asked to be connected with the head of support and the girl warned that “we will have to wait some time” and put on us music that stopped after 30 minutes and the call was broken. Fortunately, towards the end of the working day, the answer finally came that the glue record was deleted. Cheers, comrades!
Chapter 9
Recalling the TTL in 4 days in provider caches, we look at tcpdump. Indeed, the requests to the old DNS servers are getting smaller, i.e. everything goes as planned. We can only wait. Just in case, we take a couple of unnecessary domains and try to reproduce the situation. And she repeats!
For the story I will leave here a recipe how to reproduce the situation:
We take the first domain ingavto.ru, create in it A records for ns10.ingavto.ru and ns11.ingavto.ru, pointing to the IP addresses of two DNS servers and delegate to them this domain with IP addresses from the registrar (glue records). We are waiting for the .RU zone to be updated.
We take the second domain body-m-auto.ru, delegate it to ns10.ingavto.ru and ns11.ingavto.ru, we are waiting for the update.
We delegate the first domain to
any other NSs , in the zone we change the address in A records for ns10.ingavto.ru and ns11.ingavto.ru, we are also waiting for the update.
As a result, we have incorrect glue records and glitches in the root zone .RU, which are described above.
Request the IP addresses of the NS servers from Google:
$ dig +short A ns10.ingavto.ru. @8.8.8.8 136.243.55.209 $ dig +short A ns11.ingavto.ru. @8.8.8.8 136.243.55.194
The answer is the same as what is written in the DNS zone. An example of a request to the root server, it is clear that completely different IP addresses are given:
$ dig any body-m-auto.ru @a.dns.ripn.net. ; <<>> DiG 9.10.3-P4-Ubuntu <<>> any body-m-auto.ru @a.dns.ripn.net. ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 269 ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 3 ;; WARNING: recursion requested but not available ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;body-m-auto.ru. IN ANY ;; AUTHORITY SECTION: BODY-M-AUTO.RU. 345600 IN NS ns11.ingavto.ru. BODY-M-AUTO.RU. 345600 IN NS ns10.ingavto.ru. ;; ADDITIONAL SECTION: ns10.INGAVTO.RU. 345600 IN A 89.249.20.188 ns11.INGAVTO.RU. 345600 IN A 89.249.24.177 ;; Query time: 42 msec ;; SERVER: 2001:678:17:0:193:232:128:6
Epilogue
Unfortunately, it was not possible to find out, this problem
is connected only with the Ru-Center or is relevant for other registrars . If you have a couple of domains for experiments purchased through another registrar, please test and write in a comment.
It is very likely that such a problem may exist
not only for the .RU domain, but also for the .RF domain , but we also do not have such domains for the test now.
It also suggests an obvious conclusion that at present you should not use glue record for the .RU domain and in some cases it makes sense to contact registrars with a request for stripping such records.
PS It would be nice if someone from the experts working in the Ru-Center or another registrar would comment on the situation.