
Consul.io Part 2

In the first part, we examined in detail the problems and tasks that a distributed application architecture poses, identified the tools we can use to solve them, and noted the importance of discovery at the initial stage of a project. We also chose Consul as the main application on whose basis we will consider the implementation of a discovery service.



In this final part, we will look at how Consul works with the DNS protocol, analyze the main requests to the HTTP API, see what types of health checks we can use and, of course, find out what the K/V storage is for. And most importantly, we will get better acquainted with some of these features in practice.

DNS interface


Consul can answer queries over the DNS protocol, and you can use any DNS client to talk to it. The DNS interface is available on the local agent on port 8600. Besides querying Consul directly, you can register it as the system resolver and use it transparently for name resolution: it proxies all external requests to an upstream "full" DNS server and resolves requests to the private .consul zone itself.
To implement primitive DNS balancing, Consul randomly shuffles the IP addresses in the response when the catalog contains several services with the same name but different IP addresses.
In addition to plain domain-name resolution within the cluster, you can perform lookups, both for a service (service lookup) and for a cluster node (node lookup).
The format of the domain name in a DNS query within a Consul cluster is rigidly defined and cannot be changed.

Cluster node


This is a normal DNS query that returns a cluster node's IP address by its name (the node name is set with the -node parameter when the agent starts). The host name format for such a DNS query is:
[node].node[.datacenter].[domain]

Thus, the domain name for a node named, for example, nodeservice will look like this:
nodeservice.node.consul.
As we can see, the data center name is omitted here, but it can also be included explicitly:
nodeservice.node.dc1.consul.
Multiple nodes with the same name within the same DC are not allowed.
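This naming scheme is easy to reproduce. Below is a small sketch (not part of Consul itself; the function name is made up) that builds node lookup names according to the format above:

```python
def node_dns_name(node, datacenter=None, domain="consul"):
    """Build the lookup name Consul expects for a cluster node:
    [node].node[.datacenter].[domain]"""
    parts = [node, "node"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts) + "."

print(node_dns_name("nodeservice"))         # nodeservice.node.consul.
print(node_dns_name("nodeservice", "dc1"))  # nodeservice.node.dc1.consul.
```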

Service


A service lookup by name is executed across all nodes of the cluster. Unlike host-name resolution, a service lookup provides more options: in addition to the service's IP address (an A record), you can query for an SRV record and find out the ports on which the service is running.
This is what a typical lookup for all nodes running a service named rls looks like:
 root@511cdc9dd19b:~# dig @127.0.0.1 -p 8600 rls.service.consul.

 ; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> @127.0.0.1 -p 8600 rls.service.consul.
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26143
 ;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
 ;; WARNING: recursion requested but not available

 ;; QUESTION SECTION:
 ;rls.service.consul.        IN  A

 ;; ANSWER SECTION:
 rls.service.consul.    0   IN  A   172.17.0.2
 rls.service.consul.    0   IN  A   172.17.0.3

 ;; Query time: 4 msec
 ;; SERVER: 127.0.0.1#8600(127.0.0.1)
 ;; WHEN: Thu Feb 18 07:23:00 UTC 2016
 ;; MSG SIZE  rcvd: 104

From this answer you can see that two nodes in the cluster are running a service named rls, and that the Consul DNS interface returned the IP addresses of both. If we repeat the request several times, we will see the records periodically change places; that is, no address permanently holds the first position. This is the simple DNS balancing we talked about above.
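The effect of this shuffling can be sketched with a short simulation (the addresses are taken from the example above; in reality Consul does the shuffling server-side on every query):

```python
import random

# A records that Consul's catalog holds for one service name.
records = ["172.17.0.2", "172.17.0.3"]

def dns_answer(records):
    """Simulate Consul's DNS answer: the same addresses, random order."""
    answer = list(records)
    random.shuffle(answer)
    return answer

# A naive client that always picks the first address therefore ends up
# spreading its connections across all instances over many queries.
first_seen = {dns_answer(records)[0] for _ in range(1000)}
print(sorted(first_seen))
```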
If we request the SRV record, the answer will be:

 root@511cdc9dd19b:/# dig @127.0.0.1 -p 8600 rls.service.consul. SRV

 ; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> @127.0.0.1 -p 8600 rls.service.consul. SRV
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8371
 ;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
 ;; WARNING: recursion requested but not available

 ;; QUESTION SECTION:
 ;rls.service.consul.        IN  SRV

 ;; ANSWER SECTION:
 rls.service.consul.    0   IN  SRV 1 1 80 agent-two.node.dc1.consul.
 rls.service.consul.    0   IN  SRV 1 1 80 agent-one.node.dc1.consul.

 ;; ADDITIONAL SECTION:
 agent-two.node.dc1.consul. 0   IN  A   172.17.0.3
 agent-one.node.dc1.consul. 0   IN  A   172.17.0.2

 ;; Query time: 5 msec
 ;; SERVER: 127.0.0.1#8600(127.0.0.1)
 ;; WHEN: Thu Feb 18 07:39:22 UTC 2016
 ;; MSG SIZE  rcvd: 244

The ANSWER SECTION lists the domain names of the nodes (note: nodes, not services!) in Consul's format, along with the ports on which the requested service runs. The IP addresses of the nodes (and hence of the services) are listed in the ADDITIONAL SECTION of the response.

The format of the service name for the DNS query looks like this:
[tag.][service].service[.datacenter].[domain]

Thus, a service named nginx and having a tag called web can be represented by a domain:
web.nginx.service.consul
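The service lookup format can be sketched the same way as the node format (again, the helper function is made up for illustration):

```python
def service_dns_name(service, tag=None, datacenter=None, domain="consul"):
    """Build the lookup name for a service:
    [tag.][service].service[.datacenter].[domain]"""
    parts = ([tag] if tag else []) + [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)

print(service_dns_name("nginx", tag="web"))       # web.nginx.service.consul
print(service_dns_name("rls", datacenter="dc1"))  # rls.service.dc1.consul
```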

SRV queries for services in accordance with RFC-2782

In addition to the “usual” domain-name construction, we can build the name under the stricter RFC-2782 rules to perform an SRV-record request. The name format looks like this:
_service._tag.service[.datacenter].[domain]
The service name and the tag are prefixed with an underscore (_). (In the original RFC, the tag's slot is occupied by a protocol name; this is done to prevent collisions.)
With the RFC-2782 name format, a service named nginx with the tag web looks like this:
_nginx._web.service.consul
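The RFC-2782 variant only changes how the name is assembled (the helper below is a sketch, mirroring the dig example that follows):

```python
def rfc2782_dns_name(service, tag, datacenter=None, domain="consul"):
    """RFC-2782 style lookup name: _service._tag.service[.datacenter].[domain]
    Consul puts the tag where the RFC puts the protocol."""
    parts = ["_" + service, "_" + tag, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)

print(rfc2782_dns_name("rls", "rails"))  # _rls._rails.service.consul
```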

The answer will be exactly the same as in the case of the “simple” query:
 root@511cdc9dd19b:/# dig @127.0.0.1 -p 8600 _rls._rails.service.consul. SRV

 ; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> @127.0.0.1 -p 8600 _rls._rails.service.consul. SRV
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26932
 ;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
 ;; WARNING: recursion requested but not available

 ;; QUESTION SECTION:
 ;_rls._rails.service.consul.    IN  SRV

 ;; ANSWER SECTION:
 _rls._rails.service.consul. 0  IN  SRV 1 1 80 agent-one.node.dc1.consul.
 _rls._rails.service.consul. 0  IN  SRV 1 1 80 agent-two.node.dc1.consul.

 ;; ADDITIONAL SECTION:
 agent-one.node.dc1.consul. 0   IN  A   172.17.0.2
 agent-two.node.dc1.consul. 0   IN  A   172.17.0.3

 ;; Query time: 6 msec
 ;; SERVER: 127.0.0.1#8600(127.0.0.1)
 ;; WHEN: Thu Feb 18 07:52:59 UTC 2016
 ;; MSG SIZE  rcvd: 268

By default, all domain names within Consul have TTL = 0, that is, they are not cached at all. Keep that in mind.

HTTP API


The HTTP REST API is the core Consul cluster management tool and provides a very wide range of capabilities. The API implements ten endpoints, each of which gives access to the configuration of a specific functional aspect of Consul. A detailed description of all endpoints is in the Consul documentation; here we briefly describe each of them to get an idea of what the API can do:

acl
As the name implies, acl manages access control for Consul services. We can regulate access for reading and modifying data about services, nodes and user events, as well as control access to the k/v storage.

agent
Manages the local Consul agent. All operations on this endpoint affect local agent data. You can get information about the agent's current state and its role in the cluster, as well as manage local services. Changes made to local services are synchronized with all nodes in the cluster.

catalog
Manages the global Consul registry. It focuses on working with nodes and services. With this endpoint you can register and deregister services, and for working with services this section is preferable to going through the agent: working through the catalog is simpler, clearer, and contributes to anti-entropy.

coordinate
Consul uses network tomography to calculate network coordinates. These coordinates are used to build efficient routes within the cluster and for many useful functions, such as finding the nearest node running a given service or switching to the nearest data center in case of a failure. The API functions in this section are only for reading the current state of the network coordinates.

event
Handling custom events. Custom events are used to perform any actions within the cluster: for example, to automatically deploy, restart services, run certain scripts or other actions within the orchestration process.

health
Checks the current status of nodes and services. This endpoint is read-only: it returns the current state of nodes and services, as well as the lists of checks being performed.

kv
This endpoint is used to manage data in the distributed key/value storage provided by Consul. It has a single method:
/v1/kv/[key]
The difference in processing lies in the HTTP method: GET returns the value for the key, PUT saves a new value or overwrites the old one, and DELETE deletes the record.
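A sketch of how the three verbs map onto the single kv method (the agent address and key are assumptions, and the requests are only constructed here, not sent):

```python
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumed address of a local Consul agent

def kv_request(method, key, value=None):
    """Build (but do not send) a request to /v1/kv/[key];
    only the HTTP verb distinguishes the three operations."""
    data = value.encode() if value is not None else None
    return urllib.request.Request(CONSUL + "/v1/kv/" + key, data=data, method=method)

write = kv_request("PUT", "config/rls/pool_size", "25")
read = kv_request("GET", "config/rls/pool_size")
drop = kv_request("DELETE", "config/rls/pool_size")
print(write.get_method(), write.full_url)
```

Note that the response to a GET is a JSON array, and the value itself comes back base64-encoded in its Value field.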

query
Manages prepared queries. Such queries allow complex manipulations of the Consul configuration, and they can be saved and executed later. Each stored query is assigned a unique ID, with which it can be executed at any time without re-preparation.

session
The session mechanism in Consul is used to build distributed locks. Sessions are the connecting layer between the nodes, the health checks they perform, and the k/v storage. Each session has a name, which can be stored in the k/v storage; the name is used to implement locks that serialize concurrent operations on nodes and services. The session mechanism is described in the Consul documentation.
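The two API calls involved in taking a lock can be sketched as follows (the agent address, key and session ID are made up, and the requests are only constructed, not sent):

```python
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumed address of a local Consul agent

def create_session_request(name):
    """PUT /v1/session/create registers a session; the response
    is a JSON object containing the new session's ID."""
    body = ('{"Name": "%s"}' % name).encode()
    return urllib.request.Request(CONSUL + "/v1/session/create", data=body, method="PUT")

def acquire_lock_request(key, session_id):
    """PUT /v1/kv/<key>?acquire=<session> succeeds for exactly one
    contender: the response body is true for the winner and false
    for everyone else."""
    url = CONSUL + "/v1/kv/" + key + "?acquire=" + session_id
    return urllib.request.Request(url, data=b"lock holder info", method="PUT")

req = acquire_lock_request("locks/deploy", "hypothetical-session-id")
print(req.get_method(), req.full_url)
```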

status
This endpoint is used to obtain cluster status information. Here you can find out the current leader and get information about all the cluster members.

Health checks


Earlier we talked about load balancing with DNS; now let us look at the mechanism for checking the state of nodes and services. A health check is a periodically performed operation whose result determines the state of the system being checked. In effect, this is automatic monitoring that keeps the cluster healthy: it removes inoperative nodes and services and returns them to rotation when they recover. Consul supports several types of checks: a script run at a given interval, an HTTP check, a TCP check, and a TTL check that the service itself must periodically refresh.
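For example, an HTTP check can be declared in the agent's configuration like this (the id, name and URL below are made up for illustration; the agent will poll the given endpoint at the stated interval and mark the check critical on failure):

```json
{
  "check": {
    "id": "rls-http",
    "name": "HTTP health check for rls",
    "http": "http://localhost:80/health",
    "interval": "10s",
    "timeout": "1s"
  }
}
```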

K / V storage


The storage provided by Consul is a distributed key-value database and can be used to store any data available to any member of the cluster (subject to the ACL rules, of course). Services can store data in it that other cluster members need: values of configuration options, results of computations or, as noted above, the k/v storage can back distributed locks via the session mechanism. Using the k/v storage makes the cluster more efficient and reduces the share of manual intervention: services can adjust their state based on information in the storage, whose consistency the cluster guarantees. Please note: do not save data related to the business logic of your services in this storage. It is meant for storing and distributing meta-information about the state of cluster members, not the data they process.

Conclusion


It is difficult to overestimate the role of a discovery service when building a distributed architecture for a large project, and Consul fits this role well. The product keeps developing, and a lot of useful functionality has been implemented that is necessary for the smooth maintenance of a system with many components. In addition, Consul is written in Go and distributed as a single executable file, which makes updating and supporting it very convenient.

Source: https://habr.com/ru/post/278101/

