
Consul: Service Discovery is easy, or say goodbye to config files

This is an overview of Consul ( http://consul.io ), a system for service discovery and distributed key-value storage. Besides Consul itself, we will look at Consul-Template, a tool for managing service configurations that automatically reflects changes in the topology. The article should interest DevOps engineers, system architects, team leads and anyone else curious about microservice architectures.

Naturally, I cannot cover every aspect of how Consul works and is used, but the article describes enough for an inquisitive reader to get interested and continue studying on their own.

Consul: “What kind of bird is it and what is it eaten with?”

A lyrical digression, but on topic.
In today's world of huge data volumes, where distributed processing systems are no longer unattainable fiction but everyday reality, the questions of their proper design and implementation become a very important point in the further development of such systems. Anyone who has ever taken part in designing automatically scalable, distributed architectures knows that this process is laborious and requires a fairly serious stack of knowledge about the systems out of which such architectural solutions are built. Given the rapid development of cloud computing and the emergence of IaaS platforms, deploying scalable systems has become fairly simple. However, the interaction of the components of such systems (integrating new components, removing unused parts, etc.) is always a headache for architects, DevOps engineers and programmers. For these purposes you can invent your own wheel (configuration file templates, self-registration support in applications, etc.), you can use local or distributed key-value stores (Redis, ZooKeeper, etcd, etc.), or you can use service discovery systems.
Often the term Service Discovery (I will use the abbreviation SD below) refers to network discovery systems (the SDP protocol, for example), but lately SD is also applied at the software level of architectures, for mutual discovery of related system components. This is especially true of the microservice approach to building software systems. MSA (Micro Services Architecture), of which Netflix is one of the pioneers and popularizers, is increasingly becoming the standard for developing distributed, auto-scalable, highly loaded systems. And Consul is already widely used to provide SD in such systems; for example, Cisco uses it in its MSA engine, the Cisco MI.

Essentially, Consul is a good combination of K/V storage and SD functionality. And now, in more detail.

Consul: How is it better?

A reasonable question: “Why do we need Consul if we already have ZooKeeper and it does an excellent job with SD?” The answer lies on the surface: ZooKeeper and similar systems (etcd, doozerd, Redis, etc.) do not provide SD functionality as such - their task is simply to store data in one format or another and to guarantee its availability and consistency at any moment in time (given correct configuration and usage, of course). Naturally, such a model is quite enough to build SD on top of, but the usability (setup, maintenance, etc.) often leaves much to be desired.

Take ZooKeeper, for example: it means constant fuss with its cluster, from the initial setup (an automated installation of a zk cluster with Ansible or SaltStack can cause a lot of trouble even for an advanced specialist) to passing the software that uses ZooKeeper a reference to the cluster of the form zk://10.10.1.2:2181,10.10.1.3:2181,10.10.1.5:2181/app (you must know in advance where the cluster lives and all of its nodes). Moreover, if for some reason the ZooKeeper cluster “moves” to other addresses (very relevant in cloud environments and MSA architectures), all applications and services using this cluster will have to be restarted.
With Consul everything is simpler: the folks at HashiCorp have done “everything for the people”. Consul is distributed as a single binary (no need to track dependencies or use package managers), and any software that uses Consul always talks to it on localhost (no need to store a reference to the cluster or to a master node of the service) - Consul takes care of everything. Using Gossip as the communication protocol makes Consul fast and fault-tolerant, with no dedicated master required for normal operation. Formally, a master does exist (even a quorum of masters), but it is needed mostly to survive a complete stop of all cluster nodes (the masters make sure the operational data is periodically persisted to disk, thereby guaranteeing data durability). As a result, for an application (microservice) using Consul, all work with SD boils down to talking to localhost:8500 - wherever the application moves, there will always be a Consul agent there. Moreover, you do not need any client libraries to work with Consul (as is the case with ZooKeeper): everything is done through a simple and clear HTTP REST API (plain data, no more than 20 different API endpoints), or through the DNS service with SRV records (yes, one of Consul's features is providing a DNS interface to registered services).
More details can be read here.

Consul: How to install it and get started?

I'll say right away that we will not dwell on installation and configuration in detail - for those reading this article it should be a fairly simple exercise. The only issue worth mentioning is that the installation documentation is not easy to find on the site, so here are the links: initial installation (as homework: write start/stop scripts for your favourite init.d/upstart/systemd - cross out whichever does not apply), agent and cluster initialization.

A couple of comments on the choice of cluster topology. It is worth noting that Consul has no dedicated master that alone receives and distributes service configurations and data between nodes - absolutely any agent can be used to register services and data. Generally speaking, a master (more precisely, a quorum of masters) does exist, and its main role is to ensure data persistence across cluster restarts.

Consul: Registering a service and making requests

So, having a ready cluster (or a single node for tests), let's start registering services. To begin with, let's set up a hypothetical scenario against which we will continue exploring how Consul works: suppose we have a classic web application consisting of several frontend services, several backend services and a data store - let it be MongoDB. Let me say right away that the test infrastructure, and questions like “why isn't MongoDB clustered?”, “why HAProxy and not Nginx?”, etc., are left to the inquisitive reader as homework.
When working with Consul we will distinguish two types of services: active (using the HTTP REST API for self-registration and for implementing availability checks) and passive (requiring pre-prepared configuration files for Consul). The former are services developed in-house (the company's product and its components), the latter are third-party applications that do not necessarily support working with Consul, or do not support it at all (for example, MongoDB).

So, let's register the MongoDB service - we create the file /etc/consul.d/mongodb.json:

{ "service": { "name": "mongo-db", "tags": ["mongo"], "address": "123.23.34.56", "port": 27017, "checks": [ { "name": "Checking MongoDB" "script": "/usr/bin/check_mongo.py --host 123.23.34.56 --port 27017", "interval": "5s" } ] } } 

The most important thing here is:
1. address / port - this is the data Consul clients will receive in response to a request for information about the mongo-db service, so the published address must be reachable.
2. The “checks” section - a list of checks that let Consul determine whether the service is alive. A check can be any script (returning 0 if the service operates normally, 1 for a warning state, and any other value if the service is unavailable), or an HTTP check (some URL is requested and the service status is derived from the response: HTTP/2XX - the service is alive, HTTP/4XX, HTTP/5XX - the service is unavailable).
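
For illustration, a check script like the check_mongo.py referenced above could look roughly like this (a minimal sketch, assuming the pymongo driver is available; the actual script is not shown in the article):

    #!/usr/bin/env python
    # Consul script check semantics: exit 0 = passing, 1 = warning, anything else = critical.
    import argparse
    import sys

    import pymongo

    parser = argparse.ArgumentParser()
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=27017)
    args = parser.parse_args()

    try:
        client = pymongo.MongoClient(args.host, args.port, serverSelectionTimeoutMS=2000)
        client.admin.command("ping")  # connectivity test; a real check could also query a collection
    except pymongo.errors.PyMongoError:
        sys.exit(2)  # service unavailable -> critical
    sys.exit(0)      # service is alive -> passing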

More details on the site: description of services, description of checks.

A subsequent agent restart (with /etc/consul.d/ specified as the configuration directory) will pick up this file and register MongoDB as a service available for SD. The script specified in the checks section connects to MongoDB on the given host (testing that the service is reachable) and, for example, queries a collection to verify data availability.
Subsequently, you can check the registration using curl:

 ~/WORK/consul-tests #curl -XGET http://localhost:8500/v1/catalog/service/mongo-db
 [{"Node":"mpb.local","Address":"192.168.59.3","ServiceID":"mongo-db","ServiceName":"mongo-db","ServiceTags":["mongo"],"ServiceAddress":"123.23.34.56","ServicePort":27017}]

Or using the DNS server built into Consul:

 ~/WORK/consul-tests #dig @127.0.0.1 -p 8600 mongo-db.service.consul SRV

 ; <<>> DiG 9.8.3-P1 <<>> @127.0.0.1 -p 8600 mongo-db.service.consul SRV
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50711
 ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
 ;; WARNING: recursion requested but not available

 ;; QUESTION SECTION:
 ;mongo-db.service.consul.    IN    SRV

 ;; ANSWER SECTION:
 mongo-db.service.consul. 0 IN SRV 1 1 27017 mbp.local.node.dc1.consul.

 ;; ADDITIONAL SECTION:
 NEST.local.node.dc1.consul. 0 IN A 123.23.34.56

 ;; Query time: 1 msec
 ;; SERVER: 127.0.0.1#8600(127.0.0.1)
 ;; WHEN: Thu Sep 17 17:47:22 2015
 ;; MSG SIZE rcvd: 152

Which method of getting data from Consul to use depends on the architecture of the requesting component: for scripts it is more convenient to use the DNS interface, while for components written in high-level languages REST requests or specialized libraries are preferable.

All services that support self-registration should use the client libraries for the relevant languages: Python, Java, Go, Ruby, PHP. Besides registering the services, do not forget to properly develop the scripts that check the availability of each particular service, so that you do not end up with a system full of registered but non-working services.
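
In fact, self-registration does not even require a library - it boils down to one HTTP call to the local agent. A minimal sketch (the service name, address, port and health endpoint here are invented for the example):

    import json
    import urllib.request

    # A hypothetical backend instance registers itself with the local Consul agent.
    # The agent always listens on localhost:8500, wherever the service is running.
    registration = {
        "Name": "backend",
        "ID": "backend-1",
        "Tags": ["http"],
        "Address": "10.0.0.15",
        "Port": 8080,
        "Check": {
            "HTTP": "http://10.0.0.15:8080/health",  # assumed health endpoint
            "Interval": "10s"
        }
    }

    req = urllib.request.Request(
        "http://localhost:8500/v1/agent/service/register",
        data=json.dumps(registration).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        print("registered:", resp.status)  # 200 means the agent accepted the registration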

Consul: Goodbye configuration files.

Now we finally get to the main point - this part is dedicated to those who have read this far. So, at a certain point in time we have an environment in which the services (mongodb and backend, for example) are registered; what benefits can we get from it?
In traditional distributed systems (without built-in SD), roughly the following procedure is used to add a new component (for example, another backend when the load grows):
1. An instance of the backend service is created (often with the help of orchestration systems such as SaltStack/Puppet/Ansible/hand-made scripts/etc.)
2. The orchestration system generates, from templates, new configuration files for the services that use the backend (load balancers, frontends, etc.)
3. The same orchestration system generates a config file for the new backend service, putting into it the contact information for mongodb and other dependent components.
4. All dependent services re-read their configuration (or restart) and re-establish connections with each other.
5. The system waits for convergence and becomes operational.

Such an approach is very expensive: config files have to be generated and distributed, services restarted, and so on. On top of that, an orchestration system (a component external to the running system) is involved in the process, and its availability also has to be monitored.

SD allows this process to be simplified considerably (exactly how, the inquisitive reader has probably already guessed), but it requires changing the behavior of the services that make up the system. And this is not only SD support (service registration and service discovery), but also fault tolerance (the ability of a service to safely survive changes in the topology of the services it depends on), active use of the KV store for exchanging configuration information, and so on.
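For example, exchanging configuration data through Consul's KV store also boils down to plain HTTP calls to the local agent. A minimal sketch (the key name and value are invented for the example):

    import base64
    import json
    import urllib.request

    # Publish a configuration value (e.g. the connection string the backends should use).
    put = urllib.request.Request(
        "http://localhost:8500/v1/kv/app/config/mongo_uri",
        data=b"mongodb://123.23.34.56:27017",
        method="PUT",
    )
    urllib.request.urlopen(put)

    # Any other component reads it back from its own local agent.
    with urllib.request.urlopen("http://localhost:8500/v1/kv/app/config/mongo_uri") as resp:
        entry = json.loads(resp.read())[0]
        print(base64.b64decode(entry["Value"]).decode())  # values come back base64-encoded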
The only external component that has to be used in such configurations is Consul-Template, a tool for connecting various systems that are not SD-aware (for example, HAProxy) to Consul. Its task is to track the registration/deregistration of services and to update the configuration of the dependent services; that is, when a new backend is registered, the HAProxy config is automatically rebuilt so that the new instance is included in load balancing. Read more about it here.
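To give an idea of how this looks, a fragment of an HAProxy template for Consul-Template might look roughly like this (a sketch under my own assumptions: the service name “backend” and the file paths are invented for the example):

    # haproxy.ctmpl (fragment): one "server" line is rendered for every registered backend instance
    backend app_backends
        balance roundrobin{{ range service "backend" }}
        server {{ .Node }} {{ .Address }}:{{ .Port }} check{{ end }}

Consul-Template is then started with something like consul-template -template "haproxy.ctmpl:/etc/haproxy/haproxy.cfg:service haproxy reload": on every registration or deregistration of a backend it re-renders the config and reloads HAProxy.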
In fact, using SD based on Consul, Consul-Template and Consul-KV can, in principle, help you get rid of configuration files entirely and leave everything in Consul's hands.

In conclusion.

Overall, since Consul is still under active development, some problems are possible (what I noticed were issues with the cluster falling apart when all the Consul nodes are restarted), but the basic SD functionality works fine and gives no cause for complaint. Let me also remind you that Consul supports multiple data centers to provide geographic distribution.

Source: https://habr.com/ru/post/266139/
