Within a single data center, fault tolerance is easy to set up: there are plenty of tools and techniques for it.
But what if you need fault tolerance across several data centers?
Below I describe what is, in my opinion, an elegant and very cheap solution, though of course not one without drawbacks.
The idea is that each data center runs its own NS server for the zone, and that server returns only the IP addresses of its own data center.
Now in pictures; IMHO it is much clearer that way.
So, here is what happens when a browser tries to open a web page (simplified version):
If an NS server does not respond, the DNS client queries the next NS server:
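The failover described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the NS names, the example.com zone, and the IP addresses are hypothetical (documentation ranges), and the client is assumed to try the NS servers in a fixed order, whereas a real resolver chooses the order itself.

```python
from typing import Dict, List, Optional, Set

# Hypothetical setup: one NS server per data center; each answers only
# with the front-end IPs of its own data center.
NS_ANSWERS: Dict[str, List[str]] = {
    "ns1.example.com": ["192.0.2.10"],                      # data center 1
    "ns2.example.com": ["198.51.100.10", "198.51.100.11"],  # data center 2
    "ns3.example.com": ["203.0.113.10"],                    # data center 3
}


def resolve(ns_order: List[str], down: Set[str]) -> Optional[List[str]]:
    """Ask NS servers in the given order; return the first answer received."""
    for ns in ns_order:
        if ns in down:
            # No response (timeout): move on to the next NS server.
            continue
        return NS_ANSWERS[ns]
    return None  # every NS server is unreachable


order = ["ns1.example.com", "ns2.example.com", "ns3.example.com"]
all_up = resolve(order, down=set())
dc1_down = resolve(order, down={"ns1.example.com"})
print(all_up)
print(dc1_down)
```

When the first data center is down, its NS server is down with it, so the client ends up with addresses from the second data center only.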
Zone settings for each data center:
Here you can see that some data centers may have more than one front (front-end server).
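As a rough sketch of what the per-data-center zones might contain, the snippet below renders one zone file per data center. Everything concrete in it is an assumption for illustration: the example.com zone, the `www` record, the SOA values, the inventory layout, and the documentation IP addresses. The NS records are the same everywhere; only the A records of the site differ.

```python
from typing import Dict

# Hypothetical inventory: the NS address and front-end IPs of each
# data center (RFC 5737 documentation addresses).
DATA_CENTERS: Dict[str, dict] = {
    "dc1": {"ns": "192.0.2.1", "fronts": ["192.0.2.10"]},
    "dc2": {"ns": "198.51.100.1", "fronts": ["198.51.100.10", "198.51.100.11"]},
}


def render_zone(dc: str, serial: int = 1) -> str:
    """Return the zone file text for the NS server located in `dc`."""
    lines = [
        "$TTL 60 ; 1 minute",
        "@ IN SOA ns1.example.com. hostmaster.example.com. "
        f"( {serial} 3600 600 86400 60 )",
    ]
    # NS records and their addresses are identical in every data center.
    for i, name in enumerate(sorted(DATA_CENTERS), start=1):
        lines.append(f"@ IN NS ns{i}.example.com.")
        lines.append(f"ns{i} IN A {DATA_CENTERS[name]['ns']}")
    # The A records for the site differ: only this data center's fronts.
    for ip in DATA_CENTERS[dc]["fronts"]:
        lines.append(f"www IN A {ip}")
    return "\n".join(lines) + "\n"


zones = {dc: render_zone(dc) for dc in DATA_CENTERS}
print(zones["dc2"])
```

Generating the files from one inventory keeps the shared records in sync, which is the part a tool like Puppet would normally take care of.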
That is the general idea, and you can build a lot of interesting things on top of it.
Advantages:
- If a data center goes down, clients move to the working sites within a minute.
- If you need to do maintenance, stop named, wait a minute, and you can get to work.
Disadvantages:
- A very small share of clients will still try to reach the "off" data center, for example those behind resolvers that cache records longer than the TTL.
- You have to maintain a separate zone file for each data center, but that is easy to automate with, for example, Puppet.
- The load is not distributed perfectly evenly, but it is tolerable.
P.S. Be sure to set this in the zone file:
$TTL 60 ; 1 minute