Let's take a classic example of distributed hosting: building a blog service based on mu-WordPress.
The task is to build a geo-distributed system that is as fault-tolerant as possible on a limited budget. Accordingly, all the hardware is rented in various data centers.
And here it should be said that not all data centers are equally useful. A high-quality one rents out a server for $800, while in a low-quality one roughly the same server can be had for $100. It is exactly these differences that have to be taken into account when building a geocluster.
Now about the small hacks. By default, mu-WordPress handles uploaded content in a very unfortunate way: through PHP. Accordingly, uploading was handed off to a separate service, and uploaded content was inserted as a direct link to the static files.
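The original code isn't shown in the post, but a minimal sketch of the "direct link to statics" part, assuming WordPress's upload_dir filter and an invented static host name, could look like this:

```php
<?php
// Hypothetical mu-plugin sketch: make every generated upload URL point at a
// separate static host instead of being served back through PHP.
// "static.example.com" is an invented host name for illustration.
add_filter('upload_dir', function ($dirs) {
    $dirs['baseurl'] = 'http://static.example.com/files';
    $dirs['url']     = $dirs['baseurl'] . $dirs['subdir'];
    return $dirs;
});
```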
The second hack was a modification of cache control. In addition to the directives that cache static design elements, a post was marked as non-cacheable for as long as it was open for discussion (14 days by default); after that it was served with a header allowing caching. RSS feeds were also cached in a clever way.
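As an illustration only (the function name and header values are mine, not from the original setup), the discussion-window logic boils down to something like this:

```php
<?php
// Illustrative sketch: forbid caching while a post is still open for discussion
// (14 days by default), then let Squid and browsers cache it.
function send_cache_headers(int $post_timestamp, int $discussion_days = 14): void
{
    $age = time() - $post_timestamp;

    if ($age < $discussion_days * 86400) {
        // Comments are still coming in, the page keeps changing.
        header('Cache-Control: no-cache, must-revalidate');
    } else {
        // Discussion is over, the page is effectively static.
        header('Cache-Control: public, max-age=86400');
    }
}
```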
The final hack was the database synchronization scheme: every INSERT / DELETE / UPDATE was also executed on the "neighbor". The result was a sort of soft RAID built on MySQL + PHP.
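Roughly, the idea can be sketched like this (hosts, credentials and the wrapper name are invented for the example):

```php
<?php
// Invented sketch of the "soft RAID": run every query locally and replay
// data-modifying queries on the neighbor server.
mysqli_report(MYSQLI_REPORT_OFF); // keep old-style error handling for the sketch

$local    = new mysqli('localhost',      'blog', 'secret', 'wpmu');
$neighbor = new mysqli('neighbor.local', 'blog', 'secret', 'wpmu');

function mirrored_query(mysqli $local, mysqli $neighbor, string $sql)
{
    $result = $local->query($sql); // the authoritative local copy

    if (preg_match('/^\s*(INSERT|UPDATE|DELETE)\b/i', $sql)) {
        // If the neighbor is unreachable, log it and keep serving locally.
        if ($neighbor->connect_error || @$neighbor->query($sql) === false) {
            error_log('neighbor sync failed: ' . $sql);
        }
    }
    return $result;
}

mirrored_query($local, $neighbor,
    "UPDATE wp_posts SET post_status = 'publish' WHERE ID = 1");
```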

First, about the DNS. Since mu-WordPress used a subdomain for each blog, the most reasonable solution was to use the slave DNS service of two independent registrars (the gray clouds). It is inexpensive and entirely reliable.
The high-quality green data center, where two servers were rented, hosted the primary DNS, the signup service and the upload form for user content.
Two data centers of average quality (the blue ones) served as the main "backbone" of the geocluster. Each server had its own neighbor, and the pair kept the database contents and files synchronized between themselves.
This made it possible to cut costs significantly at this stage. After a while, however, a serious problem appeared, and its name was search bots.
These bots created 200-300 simultaneous connections to each server, which of course led to nothing good: timeouts and 50x errors began. We could, of course, have reduced the request rate via robots.txt with the Crawl-delay directive, but... who wants their blog indexed slowly?
This is where the cheap yellow data centers helped, along with monitoring and DNS set up on the two servers in the green data center. Here is how it all worked:
Two yellow data centers ran the primary (parent) Squid proxies. The remaining three used them as parents, which reduced the load on the blue "backbone".
Monitoring on the servers in the green data center watched the availability of the yellow and blue segments and modified the DNS records in the event of a failure.
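The post doesn't include the monitoring script, but under the assumption that the DNS modification meant rewriting a zone fragment and reloading BIND, a simplified sketch might be:

```php
<?php
// Assumed setup: probe every front-end node, write an A-record fragment with
// only the healthy ones, and ask BIND to reload the zone.
// Host names, IPs and file paths are invented; the real script is not shown in the post.
$frontends = [
    'yellow-1' => '203.0.113.10',
    'yellow-2' => '203.0.113.20',
    'blue-1'   => '198.51.100.10',
    'blue-2'   => '198.51.100.20',
];

$healthy = [];
foreach ($frontends as $name => $ip) {
    // A node counts as alive if it accepts a TCP connection on port 80 within 5 s.
    $sock = @fsockopen($ip, 80, $errno, $errstr, 5);
    if ($sock !== false) {
        fclose($sock);
        $healthy[] = $ip;
    }
}

if ($healthy !== []) {
    // Publish wildcard records (one subdomain per blog) only for live nodes.
    // A real zone would also need its SOA serial bumped so the slaves re-transfer it.
    $records = '';
    foreach ($healthy as $ip) {
        $records .= "*.blogs.example.net. 300 IN A {$ip}\n";
    }
    file_put_contents('/etc/bind/db.blogs.fragment', $records);
    exec('rndc reload blogs.example.net');
}
```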
Now let's talk about fault tolerance. What did we lose when the green, blue or yellow segment went down?
- Green - registration of new blogs and file uploads stopped working
- Blue - if one went down, nothing; if all went down, the admin panel stopped working and the latest posts not yet in the cache were not displayed
- Yellow - if one went down, nothing; if all went down, just an increased load on the blue servers
Thus, it was possible to achieve maximum availability, plus a copy of each blog survived in the proxy cache even after its physical deletion. This has actually happened: one user accidentally deleted his blog. Because he contacted support in time, his posts were pulled out of the cache, parsed and imported into a newly created blog.
P.S. If you want to discuss server optimization for mu-WordPress, write to me in the comments - I've had plenty of hands-on experience with it ;)