
Hi Habr! In the previous article, I explained how to build your own cloud hosting in 5 minutes using Ansible, Docker and Docker Swarm. In this part, I will talk about how services running in the cloud find each other, how load is balanced between them, and how their fault tolerance is ensured.
This is an introductory article: here we will focus on reviewing the tools that will solve the problem of service discovery in our cloud.
In the next part we will put them into practice, so I decided to give you time to get acquainted with them first.
Problem
Let's analyze the most typical problem and its common solution: we have a web application, and we need to ensure load balancing and fault tolerance for it.
We can run multiple copies of our web application and have Supervisor monitor them.
Supervisor will restart a copy of the web application if it crashes with an error, and will also log such events. Installing Nginx will solve the load balancing problem.
The Nginx configuration will look something like this:
upstream app {
    server 192.168.1.2:8080 max_fails=3 fail_timeout=5s;
    server 192.168.1.2:8081 max_fails=3 fail_timeout=5s;
    server 192.168.1.2:8082 max_fails=3 fail_timeout=5s;
}

server {
    location / {
        proxy_pass http://app;
        health_check;
    }
}
This configuration works as follows: if the number of unsuccessful attempts to reach one of the web application copies reaches 3 within 5 seconds, that copy is marked as unavailable for 5 seconds (if it crashed with an error, Supervisor restarts it). The load is thus distributed evenly between the working copies of the application.
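The max_fails / fail_timeout behavior above can be sketched in a few lines of Python. This is a toy model of Nginx's passive health checking, not its actual implementation; the `Backend` class and its method names are made up for this illustration:

```python
import time

# Toy model of Nginx passive health checking: a backend is marked down once
# it accumulates max_fails errors within the fail_timeout window, and is
# then skipped for fail_timeout seconds before being retried.
class Backend:
    def __init__(self, addr, max_fails=3, fail_timeout=5.0):
        self.addr = addr
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = []          # timestamps of recent failures
        self.down_until = 0.0    # time until which the backend is skipped

    def report_failure(self, now=None):
        now = time.monotonic() if now is None else now
        # keep only failures inside the sliding window
        self.fails = [t for t in self.fails if now - t < self.fail_timeout]
        self.fails.append(now)
        if len(self.fails) >= self.max_fails:
            self.down_until = now + self.fail_timeout
            self.fails = []

    def is_up(self, now=None):
        now = time.monotonic() if now is None else now
        return now >= self.down_until

b = Backend("192.168.1.2:8080")
for _ in range(3):
    b.report_failure(now=0.0)
print(b.is_up(now=1.0))   # False: marked down for fail_timeout seconds
print(b.is_up(now=6.0))   # True: the timeout has expired, backend is retried
```

A real load balancer would consult `is_up()` on every request and spread traffic across the remaining healthy backends.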
Disadvantages
In fact, this is a good configuration, and if you have few applications and the load is more or less even, it is better to use it.
But we are building a cloud, and we do not know in advance which applications will be launched on it. Load may vary between different sites and web applications, so it would be nice to be able to change the number of running copies of an application depending on the situation. In other words, we cannot configure Nginx / Apache / etc. in advance for such a setup.
It would be great if Nginx and our other services adapted to the dynamic nature of our cloud. Solving this particular problem is what the rest of this article is about.
Requirements
We need a place where our services can register themselves and receive information about each other. Docker Swarm, which we started using in the previous article, works out of the box with etcd, Consul and Zookeeper.
We need our services to be registered in and removed from the above systems automatically (we are not going to teach each application to do this). For this purpose we will use Registrator (covered in more detail below), which works out of the box with Consul, etcd and SkyDNS 2 (Zookeeper support is planned).
Our services should be able to find each other using DNS queries. Consul and SkyDNS 2 (which works together with etcd) can solve this problem.
We also need health monitoring for our services. Consul (which we will use) provides it out of the box, and Registrator supports it (it passes Consul the information about how each service should be monitored).
Last but not least, we need a service to automatically reconfigure our components. If we run 10 copies of one web application and 20 copies of another, something must notice this and react immediately (by changing the Nginx configuration, for example). This role will be played by Consul Template (covered in more detail below).
Note: as you can see, there are different solutions to our problem. Before writing this article, I had been running the configuration described here for a little over a month and did not encounter any problems.
Consul

Of the options above (Consul, Zookeeper, etcd), Consul is the most self-contained project, able to solve our service discovery problem out of the box.
Although Consul, Zookeeper and etcd are listed here in the same row, I would not compare them with each other: all three projects implement a distributed key/value store, and that is where their similarities end.
Consul provides us with a DNS server, which Zookeeper and etcd lack (for etcd it can be added with SkyDNS 2). Moreover, Consul gives us health monitoring (which neither etcd nor Zookeeper can boast of), which is also necessary for full-fledged service discovery.
With Consul we also get a Web UI (a demo of which you can see right now) and high-quality official documentation.
Note: even if you plan to use the same configuration I describe, and Zookeeper and SkyDNS 2 are not in your plans, I would still recommend getting familiar with these projects.
Registrator
Registrator receives information from Docker about container starts and stops (through a socket connection, using the Docker API) and adds the corresponding services to Consul or removes them from it.
Registrator automatically derives information about a service from its published ports and from the Docker container's environment variables. In other words, it works with any containers you have and requires additional configuration only if you need to override the automatically obtained parameters.
And since all of our services run exclusively in Docker containers (including Registrator itself), Consul will always have information about every running service in our cloud.
This is all great, of course, but what is even better is that Registrator can tell Consul how to check the health of our services. This is done using the same environment variables.
Note: Consul can check the health of services if the Consul Service Catalog (which we use) is used to store information about them.
If the Consul key/value store is used instead (it is also supported by Registrator, and Docker Swarm uses it, for example, to store information about Docker nodes), there is no such function.
Let's look at an example:
$ docker run -d --name nginx.0 -p 4443:443 -p 8000:80 \
    -e "SERVICE_443_NAME=https" \
    -e "SERVICE_443_CHECK_SCRIPT=curl --silent --fail https://our-https-site.com" \
    -e "SERVICE_443_CHECK_INTERVAL=5s" \
    -e "SERVICE_80_NAME=http" \
    -e "SERVICE_80_CHECK_HTTP=/health/endpoint/path" \
    -e "SERVICE_80_CHECK_INTERVAL=15s" \
    -e "SERVICE_80_CHECK_TIMEOUT=3s" \
    -e "SERVICE_TAGS=www" nginx
After such a launch, the list of our services in Consul will look like this:
{
  "services": [
    {
      "id": "hostname:nginx.0:443",
      "name": "https",
      "tags": ["www"],
      "address": "192.168.1.102",
      "port": 4443,
      "checks": [
        {
          "script": "curl --silent --fail https://our-https-site.com",
          "interval": "5s"
        }
      ]
    },
    {
      "id": "hostname:nginx.0:80",
      "name": "http",
      "tags": ["www"],
      "address": "192.168.1.102",
      "port": 8000,
      "checks": [
        {
          "http": "/health/endpoint/path",
          "interval": "15s",
          "timeout": "3s"
        }
      ]
    },
    ...
  ]
}
As you can see, based on the published ports, Registrator concluded that two services (http and https) need to be registered. Moreover, Consul now has all the information it needs to check the health of these services.
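To illustrate what the catalog above contains, here is a small Python sketch that parses a trimmed copy of that JSON (the trailing "..." entry is dropped) and lists each registered service with its address and checks:

```python
import json

# A trimmed, hard-coded copy of the service list shown above --
# not a live Consul response.
catalog = json.loads("""
{
  "services": [
    {"id": "hostname:nginx.0:443", "name": "https", "tags": ["www"],
     "address": "192.168.1.102", "port": 4443,
     "checks": [{"script": "curl --silent --fail https://our-https-site.com",
                 "interval": "5s"}]},
    {"id": "hostname:nginx.0:80", "name": "http", "tags": ["www"],
     "address": "192.168.1.102", "port": 8000,
     "checks": [{"http": "/health/endpoint/path",
                 "interval": "15s", "timeout": "3s"}]}
  ]
}
""")

# Print one line per registered service: name, endpoint, number of checks.
for svc in catalog["services"]:
    print(f'{svc["name"]} -> {svc["address"]}:{svc["port"]} '
          f'({len(svc["checks"])} check(s))')
# https -> 192.168.1.102:4443 (1 check(s))
# http -> 192.168.1.102:8000 (1 check(s))
```

Any consumer of Consul's catalog (Consul Template, a custom script, etc.) works with exactly this kind of structure.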
In the first case, the command "curl --silent --fail https://our-https-site.com" will be executed every 5 seconds, and the result of the check will depend on the exit code of this command.
In the second case, Consul will poll the URL we provided every 15 seconds. If the server's response code is 2xx, the service is "healthy"; if it is 429 Too Many Requests, the service is in a "warning" state; any other code means the service is considered dead.
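This mapping of response codes to health states can be written down as a tiny Python function (a sketch of the rule described above, not Consul's actual code):

```python
# Map an HTTP check's response code to a Consul-style health state:
# 2xx -> passing, 429 -> warning, anything else -> critical.
def check_state(status_code: int) -> str:
    if 200 <= status_code <= 299:
        return "passing"
    if status_code == 429:
        return "warning"
    return "critical"

print(check_state(200))  # passing
print(check_state(429))  # warning
print(check_state(500))  # critical
```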
More examples and more detailed information can be found in the official documentation.
Consul Template

We have decided where information about all the services in our cloud will be stored, as well as how it gets there and is kept up to date. But we have not yet figured out how we will get that information back out and, subsequently, pass it to our services. This is exactly what Consul Template does.
To do this, you take the configuration file of the application you want to configure and turn it into a template, following the rules of Consul Template's templating language (based on Go templates).
Let's look at a simple example with an Nginx configuration file:
upstream app {
    least_conn;
    {{range service "tag1.cool-app"}}server {{.Address}}:{{.Port}} max_fails=3 fail_timeout=60s;
    {{else}}server 127.0.0.1:65535; # force a 502{{end}}
}
After we tell Consul Template where this template is located, where to put the rendered result, and what command to execute when the result changes (it knows how to do that too; in this case, reload Nginx), the magic begins. Consul Template will fetch the addresses and port numbers of all copies of the "cool-app" application that are tagged "tag1" and are in a "healthy" state, and add them to the configuration file. If there are no such copies, then, as you already guessed, whatever follows {{else}} is used instead.
Each time a cool-app service with the tag1 tag is added or removed, the configuration file will be rewritten and Nginx will be reloaded. All this happens automatically and requires no intervention: we just run the required number of copies of our application and don't worry about anything.
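As a toy illustration of what such a render step produces, here is a Python sketch that builds the upstream block from a list of healthy (address, port) pairs, falling back to the {{else}} branch when the list is empty. The helper name render_upstream is made up for this example; real Consul Template watches Consul and re-renders the file on every change:

```python
# Toy stand-in for Consul Template rendering the upstream block above:
# one "server" line per healthy instance, or the {{else}} fallback if none.
def render_upstream(instances):
    lines = ["upstream app {", "    least_conn;"]
    if instances:
        for addr, port in instances:
            lines.append(f"    server {addr}:{port} max_fails=3 fail_timeout=60s;")
    else:
        # no healthy instances: point at a closed port so Nginx returns a 502
        lines.append("    server 127.0.0.1:65535;")
    lines.append("}")
    return "\n".join(lines)

print(render_upstream([("192.168.1.2", 8080), ("192.168.1.3", 8080)]))
```

After producing a file like this, Consul Template would run the reload command we configured (e.g. `nginx -s reload`) so Nginx picks up the new backend list.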
More examples can be found in the official documentation.
Conclusion
Today there are plenty of tools for solving the service discovery problem, but not many that solve it out of the box and immediately provide everything we need.
In the next part, I publish a set of Ansible scripts that will configure all of the above tools for us, and we will get down to practice.
That's all. Thank you all for your attention. Stable clouds and good luck to you!
Follow me on Twitter, where I talk about working at a startup, my mistakes and right decisions, about Python and everything related to web development.
P.S. I'm looking for developers for my company; details are in my profile.