High website availability: site file geo-replication with lsyncd

High availability of a website is a joint effort of the hosting provider and the site developer. The primary goal of high availability is to minimize planned and unplanned downtime.

High availability is more than just placing your project in a reliable cloud. A truly highly available site should run in several cloud regions, and its users should not notice any change even if one of the regions becomes unavailable. The site developer must ensure that the site stays up and running even in an emergency. High availability systems are duplicated: if something fails on the provider's side, the site remains available. If one of the site's replicas fails, the site should also remain available. If the developer needs to perform maintenance on a server or reboot it, users should not notice.

In this series of articles we will look at how to organize high availability for the various subsystems of your site. Many of these tasks can be solved in different ways. The author does not claim that the solution presented here is the best one, but it works well and has been tested in practice. There is plenty of room for experiments aimed at increasing availability.

Today we will look at synchronizing a static site between cloud regions: changes to files on one server should appear on the other. We will also consider the simplest way to redirect the users of your site to an alternative server, which is applicable in this case: several DNS A records.

Lsyncd

Lsyncd (Live Syncing Daemon) is an application for near-real-time mirroring of data between servers, intended for use in high-availability clusters. Lsyncd is especially well suited to systems with low synchronization traffic. The application collects information about data changes through the Linux inotify kernel subsystem over a period specified in the configuration and then starts mirroring the changes (via rsync by default, but there are other options). By default, lsyncd runs as a daemon in the background and logs its actions using syslog. For testing and debugging, you can run the application without daemonizing it to see what is happening in the terminal.

Lsyncd does not use a separate file system or block device and does not greatly affect the performance of the local file system.

Using the rsync + ssh option allows file and directory moves to be performed directly on the target server instead of re-transmitting the moved data over the network.

Installing and configuring geo-replication of web server files

Access to different regions of InfoboxCloud

Order 2 subscriptions to InfoboxCloud, in Moscow and in Amsterdam, to create a geo-distributed solution.
For the subscriptions to be tied to a single user account, proceed as follows:
1. Go to http://infoboxcloud.ru and order cloud infrastructure in any region (for example, in Amsterdam). Then go to the control panel and order a cloud in another region (for example, in Moscow).

After ordering, log out of the control panel and log in again. Now you can select the region you are working in from the upper right corner of the control panel.

Create 2 servers: one in Moscow, the other in Amsterdam.

As the operating system, select CentOS 7. This article uses it, but you can use another Linux distribution if necessary; the settings may differ. You can choose either type of virtualization. The difference for this scenario is that if you do not check the "allow OS kernel management" box, you can use memory auto-scaling for the servers, which lets you use resources more efficiently. If you do check it, you can tune the inotify kernel subsystem, which is useful under high loads (configuration example) but does not make sense for a regular small site. When creating each server, be sure to add one public IP address so that the server is reachable from the external network.
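To judge whether inotify tuning matters for your site, a rough check is to compare the number of directories under the web root with the kernel's per-user watch limit, since inotify watches are placed per directory. The following is only a sketch: the function names count_dirs and watches_fit are made up for illustration, and the web root /var/www/html is the one used later in this article.

```shell
#!/usr/bin/env bash
# Count the directories under a tree (one inotify watch is needed per directory).
count_dirs() {
  find "$1" -type d | wc -l | tr -d ' '
}

# Succeed if the needed number of watches is below the given limit.
watches_fit() {
  local needed=$1 limit=$2
  [ "$needed" -lt "$limit" ]
}

# Usage on a server (hypothetical helper names, real kernel parameter):
#   needed=$(count_dirs /var/www/html)
#   limit=$(cat /proc/sys/fs/inotify/max_user_watches)
#   watches_fit "$needed" "$limit" || echo "consider raising fs.inotify.max_user_watches"
```

For a small static site the directory count is normally far below the limit, which is why the tuning is optional here.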

After the servers are created, the access credentials will be sent to your email.

DNS setup

For the main domain, whose site should be highly available, create two DNS A records: one pointing to the server in Moscow and one to the server in Amsterdam. In our case, the site will be failover.trukhin.com.

Also create service subdomains, each with an A record pointing to a single server. For example, failovermsk.trukhin.com points to the server in Moscow, and failoverams.trukhin.com points to the server in Amsterdam. Separate subdomains for each server are needed so that, if a server goes down, you can deploy another replica from the backup server and point the subdomain to it.
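Before going further, it may be worth confirming that the main domain really resolves to two addresses. A minimal sketch, assuming the dig utility is installed (on CentOS 7 it comes with the bind-utils package); the helper name count_ipv4 is hypothetical:

```shell
#!/usr/bin/env bash
# Read resolver output on stdin and print the number of distinct IPv4 addresses in it.
# Lines that are not plain IPv4 addresses (for example CNAME targets) are ignored.
count_ipv4() {
  grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' | sort -u | wc -l | tr -d ' '
}

# Usage (expects 2 for the main domain and 1 for each service subdomain):
#   dig +short A failover.trukhin.com | count_ipv4
#   dig +short A failovermsk.trukhin.com | count_ipv4
```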

Server setup

The steps below should be done on both servers.
Connect via SSH to both servers. Install Apache on each server, start it, and enable it to start at boot:

yum -y update && yum install -y httpd && systemctl start httpd.service && systemctl enable httpd.service

Create an index.html file in the /var/www/html directory of each server and make sure that the page opens correctly in the browser from each of the servers.

 <!DOCTYPE html>
 <html>
 <head>
   <meta charset="utf-8">
   <title> Hi </title>
 </head>
 <body>
   Hello, World!
 </body>
 </html>

For lsyncd to work, each server must be able to log in to the other using an SSH key.
Generate an SSH key (you can just press Enter at each prompt):

 ssh-keygen

From the server in Moscow, add the key to the server in Amsterdam:

 ssh-copy-id root@failoverams.trukhin.com

From the server in Amsterdam, add the key to the server in Moscow:

 ssh-copy-id root@failovermsk.trukhin.com

Now connect from the server in Moscow (root@failovermsk.trukhin.com) to the server in Amsterdam (root@failoverams.trukhin.com) and vice versa. No password should be requested. When asked to confirm the host key on the first connection, answer yes.
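Because lsyncd runs ssh without a terminal, it can be useful to confirm that the login works with no prompts at all. The sketch below uses the standard OpenSSH options BatchMode and ConnectTimeout; the function name can_login_without_password is made up for illustration. Run it only after the first manual connection, because in batch mode ssh fails rather than asking you to confirm an unknown host key.

```shell
#!/usr/bin/env bash
# Succeed only if key-based login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or confirmation.
can_login_without_password() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# Usage, from the Moscow server:
#   can_login_without_password root@failoverams.trukhin.com && echo "key login works"
# and from the Amsterdam server:
#   can_login_without_password root@failovermsk.trukhin.com && echo "key login works"
```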

Install and configure lsyncd

The steps below should be done on both servers.

To install lsyncd on CentOS 7, add the EPEL repository with the command:

 rpm -ivh http://mirror.yandex.ru/epel/7/x86_64/e/epel-release-7-5.noarch.rpm

Now install lsyncd:

 yum install lsyncd

Create the directories for the lsyncd logs and for temporary files:

 mkdir -p /var/log/lsyncd && mkdir -p /var/www/temp

Create the lsyncd configuration file /etc/lsyncd.conf:

 settings {
     logfile    = "/var/log/lsyncd/lsyncd.log",
     statusFile = "/var/log/lsyncd/lsyncd.status",
     nodaemon   = false
 }
 sync {
     default.rsyncssh,
     source    = "/var/www/html",
     host      = "failovermsk.trukhin.com",
     targetdir = "/var/www/html",
     rsync = {
         archive      = true,
         compress     = true,
         temp_dir     = "/var/www/temp",
         update       = true,
         links        = true,
         times        = true,
         protect_args = true
     },
     delay = 3,
     ssh = {
         port = 22
     }
 }

Replace the host value failovermsk.trukhin.com with the subdomain that points only to the server in the other region. The source parameter is the folder on the current server; targetdir is the folder on the remote server. The delay parameter is the period, in seconds, over which lsyncd collects changes before synchronizing them. This value is chosen experimentally; the default value is 10. All lsyncd parameters are described in the official documentation.

To debug, set the parameter nodaemon = true and save the changes. Create a file in the synchronized folder on one of the servers:

 touch /var/www/html/test

Run lsyncd manually to verify that everything is synchronizing correctly.

 lsyncd /etc/lsyncd.conf

If everything went well, you will see the created test file in the /var/www/html folder on the server in the other region.

Now set nodaemon = false again in /etc/lsyncd.conf. Start the lsyncd service and enable it to start at boot:

 systemctl start lsyncd.service && systemctl enable lsyncd.service

Make sure the data is replicated after reboot.
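One way to check this more thoroughly than by looking at a single file is to compare checksums of the whole web root on both servers. This is a sketch rather than part of the original setup: the helper name manifest is hypothetical, and it assumes GNU find, sort, xargs and sha256sum, which are present on CentOS 7.

```shell
#!/usr/bin/env bash
# Print "checksum  ./relative/path" for every regular file under a directory,
# ordered by path, so that two manifests can be compared with diff.
manifest() {
  (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

# Usage, from the Moscow server (assumes root's remote shell is bash):
#   diff <(manifest /var/www/html) \
#        <(ssh root@failoverams.trukhin.com "$(declare -f manifest); manifest /var/www/html")
# No output from diff means both replicas hold identical files.
```

Keep in mind that a change made within the last few seconds may not have been transferred yet because of the delay parameter.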

2-way replication

For replication to work in the opposite direction, apply the same settings on the server in the other region, but specify the address of the first server in the lsyncd configuration file. Check that the data is replicated in the opposite direction. The lsyncd configuration above already sets a temporary directory, temp_dir, which is necessary for two-way synchronization.
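Since the two configuration files differ only in the host value, they can be generated from one template. The sketch below reproduces the configuration from this article inside a shell function; the function name render_lsyncd_conf is hypothetical, and the only variable part is the peer host name.

```shell
#!/usr/bin/env bash
# Print an lsyncd configuration that pushes /var/www/html to the given peer host.
render_lsyncd_conf() {
  local peer=$1
  cat <<EOF
settings {
    logfile    = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status",
    nodaemon   = false
}
sync {
    default.rsyncssh,
    source    = "/var/www/html",
    host      = "${peer}",
    targetdir = "/var/www/html",
    rsync = {
        archive      = true,
        compress     = true,
        temp_dir     = "/var/www/temp",
        update       = true,
        links        = true,
        times        = true,
        protect_args = true
    },
    delay = 3,
    ssh = { port = 22 }
}
EOF
}

# Usage:
#   on the Moscow server:    render_lsyncd_conf failoverams.trukhin.com > /etc/lsyncd.conf
#   on the Amsterdam server: render_lsyncd_conf failovermsk.trukhin.com > /etc/lsyncd.conf
```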

Two-way replication is not always necessary. For example, master-master replication is not recommended for MySQL, so if the first server fails while such a database is in use, you will need to set up replication from the second, still working server to a third one. This is easy to do if a backup server template for the cloud is prepared in advance, which will be discussed in subsequent articles. Today we work only with files, and for a static site two-way replication is quite applicable.

Check the availability of the site

Let's open our website:

Both servers are available.

Next, turn off the server in Moscow.

Our site is available:

Turn the server in Moscow back on and turn off the one in Amsterdam:

Our site is available:

Why does this work?

Modern browsers, when a domain has several A records, first try to connect to one IP address and, if it is unavailable, try another. Thus, if at least one server is available, the site will work.
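The same behavior can be imitated from the command line to see which address would currently answer. This is only an illustration of the idea, not how any particular browser is implemented: the function names first_reachable and probe_http are made up, while the curl options used (--resolve, --connect-timeout, --fail) are standard.

```shell
#!/usr/bin/env bash
# Print the first address for which the probe command succeeds; fail if none does.
first_reachable() {
  local probe=$1
  shift
  local ip
  for ip in "$@"; do
    if "$probe" "$ip"; then
      printf '%s\n' "$ip"
      return 0
    fi
  done
  return 1
}

# Probe one address over HTTP while keeping the site's host name in the request.
probe_http() {
  curl --silent --fail --output /dev/null --connect-timeout 3 \
       --resolve "failover.trukhin.com:80:$1" "http://failover.trukhin.com/"
}

# Usage (assumes dig is installed):
#   first_reachable probe_http $(dig +short A failover.trukhin.com)
```

With one server switched off, the command should print the address of the other one after a short timeout.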

Large sites have long used this approach in a more complex form, with many backup servers behind a single name. For example, Google has 11:

Conclusion

In this article, we looked at how to configure geo-replication for a static site without a database. In subsequent articles, we will look at how to replicate a database and provide high availability for more complex sites.

If you find an error in the article, the author will gladly correct it. Please report it in a private message or by email. If you cannot leave comments on Habr, write them in the InfoboxCloud Community.

Good luck!

Source: https://habr.com/ru/post/252751/
