Some time ago we faced the task of designing and deploying a video streaming system. The core of it was the mass start/stop of instances in which video is reassembled and streamed to multiple media CDN providers (YouTube, Livestream, Ustream, etc.), as well as to our own RTMP and TS endpoints. Each instance required client-specific settings before launch. It was also clear that the system had to operate in a large number of regions (at a minimum, everywhere Amazon is present; at a maximum, anywhere a server can be rented). The key requirement was launching an instance within 1-2 seconds at most, so that it would be transparent to the user.
At the beginning of 2015 there was already talk that Docker would soon release a native clustering system, and on February 26, 2015 it did. It immediately became clear that this was a silver bullet for us: Swarm fit our project perfectly. The project was launched in May 2015.
Eleven months later we launched a second, more complex project, in which we had to provide static entry points (IP:port) for third-party equipment, while extending the whole logic of operating and launching instances dynamically in the desired region.
During design and subsequent development we used Docker Swarm in a rather non-standard way, and out of that came the idea of how to dynamically manage and balance the infrastructure. Along this path the Gobetween project also appeared, which was meant to help us manage our infrastructure flexibly in the future and to be easily reproducible.
In this article I would like to share the idea and the fundamental implementation of building a distributed system.
I will deliberately omit the security aspect: everyone decides for themselves how and what is best to build. Some lock everything down with passwords, certificates and the like; others move the management plane into a network that does not intersect with publicly accessible networks (it lives inside a VPC, interconnected by tunnels).
Schematic diagram of the Docker cluster on standalone Swarm.
Let's consider the system fundamentally: a large number of regions and a single managing core.
What we consider here is a system with one manager and a large number of Docker hosts that announce themselves to the Consul cluster. Why so? Because in essence the manage node has no value in itself: it holds no data I would be afraid to lose. It does not even have any settings; it is a task scheduler for the list of Docker nodes, which it takes from the Consul cluster. You could run as many such instances as you wish, but there is little point; I will explain why below.
The biggest advantage of this system is that everything is transparent, manageable and scalable. So let's start by configuring the Consul cluster.
As a result, we want to get the following cluster structure:

3 nodes in a Raft quorum with automatic leader election:

```
server1.consul.example.com  10.0.0.11  bootstrap consul server, consul agent
server2.consul.example.com  10.0.0.12  consul server, consul agent
server3.consul.example.com  10.0.0.13  consul server, consul agent
```
First, let's set up Consul on each server(N).consul.example.com:
Download Consul from https://consul.io.

No containers; Consul will run under Upstart.
```
$wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
$unzip *.zip
$mv consul /usr/sbin/consul
```
Check that Consul works:
```
$consul --version
Consul v0.6.4
Consul Protocol: 3 (Understands back to: 1)
```
After installing it on all three servers, let's prepare the cluster for the first boot:
Generating a cluster token:
```
$consul keygen
ozgffIYeX6owI0215KWR5Q==
```
Create a user that Consul will run as:
$adduser consul
Create the necessary directories on all 3 servers:
$mkdir -p /etc/consul.d/{bootstrap,server,client}
Create a directory for storing Consul data:
```
$mkdir /var/consul
$chown consul:consul /var/consul
```
On the node that we will use to bootstrap the cluster for the first time (server1.consul.example.com), create the bootstrap config:
$vim /etc/consul.d/bootstrap/config.json
{ "bootstrap": true, "server": true, "datacenter": "production", "data_dir": "/var/consul", "encrypt": "", "log_level": "INFO", "encrypt":ozgffIYeX6owI0215KWR5Q==, "enable_syslog": true }
On ALL three servers you need to create a server config /etc/consul.d/server/config.json:
Server1:
{ "bootstrap": false, "server": true, "datacenter": "production", "data_dir": "/var/consul", "encrypt": "ozgffIYeX6owI0215KWR5Q==", "log_level": "INFO", "enable_syslog": true, "start_join": ["10.0.0.12", "10.0.0.13"] }
Server2:
{ "bootstrap": false, "server": true, "datacenter": "production", "data_dir": "/var/consul", "encrypt": "ozgffIYeX6owI0215KWR5Q==", "log_level": "INFO", "enable_syslog": true, "start_join": ["10.0.0.11", "10.0.0.13"] }
Server3:
{ "bootstrap": false, "server": true, "datacenter": "production", "data_dir": "/var/consul", "encrypt": "ozgffIYeX6owI0215KWR5Q==", "log_level": "INFO", "enable_syslog": true, "start_join": ["10.0.0.11", "10.0.0.12"] }
Now create an Upstart script for Consul on all servers, /etc/init/consul.conf:
description "Consul server process" start on (local-filesystems and net-device-up IFACE=eth0) stop on runlevel [!12345] respawn setuid consul setgid consul exec consul agent -config-dir /etc/consul.d/server
Initialize our cluster (initial start) on server1.consul.example.com:
#consul agent -config-dir /etc/consul.d/bootstrap -bind="IP_ADDRESS"
The service will start and capture the terminal. In bootstrap mode the server appoints itself master and initializes all the necessary data structures.
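As a quick sanity check (not part of the original flow, just a convenience), you can ask the local agent who the current leader is; while the node runs in bootstrap mode it should name itself:

```bash
# the status endpoint returns the address of the current Raft leader
$ curl -Ss http://127.0.0.1:8500/v1/status/leader
"10.0.0.11:8300"
```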
On the other two servers (server2.consul.example.com, server3.consul.example.com), Consul is simply started in server mode:
#start consul
These servers connect to the bootstrap server. At this point we have a cluster of three servers, two of which run in normal server mode and one in initialization mode, meaning it makes decisions about data distribution on its own, without asking the others.
Now we can stop the bootstrap instance and restart Consul in standard server mode:
On the bootstrap server, press:
CTRL-C
Consul will stop, and we then restart it in standard server mode:
server1.consul.example.com:
#start consul
Check that everything went well with this command:
#consul members -rpc-addr=10.0.0.11:8400
The output should be:
```
Node                        Address         Status  Type    Build  Protocol  DC
server1.consul.example.com  10.0.0.11:8301  alive   server  0.6.4  2         production
server2.consul.example.com  10.0.0.12:8301  alive   server  0.6.4  2         production
server3.consul.example.com  10.0.0.13:8301  alive   server  0.6.4  2         production
```
So, we have a working and operational Consul cluster.
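Before moving on, a small optional smoke test (my addition, not required by the setup): write a key through one server and read it back through another to confirm replication works:

```bash
# on server1: write any value into the KV store
$ curl -Ss -X PUT -d 'ok' http://127.0.0.1:8500/v1/kv/smoke-test
true
# on server2: read the raw value back
$ curl -Ss 'http://127.0.0.1:8500/v1/kv/smoke-test?raw'
ok
```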
In this setup we will use a load balancer, and all requests to the Consul cluster will go through it. In the case of Amazon, using Consul agents across different VPCs, not to mention regions, has many unresolved problems (an agent announces its internal IP instead of the external one specified at start, which breaks the second stage of joining a node/server to the cluster), and raising a Consul cluster in each region and configuring synchronization between them is, from my point of view, not rational at this stage.
By analogy, download and install Consul on the server where our balancer will live.
Then create an agent config:
$vim /etc/consul.d/agent/config.json
{ "server": false, "datacenter": "production", "data_dir": "/var/consul", "ui_dir": "/home/consul/dist", "encrypt": "ozgffIYeX6owI0215KWR5Q==", "log_level": "INFO", "enable_syslog": true, "start_join": ["10.0.0.11", "10.0.0.12", "10.0.0.13"] }
Create an Upstart script for the agent, /etc/init/consul.conf:
description "Consul server process" start on (local-filesystems and net-device-up IFACE=eth0) stop on runlevel [!12345] respawn setuid consul setgid consul exec consul agent -config-dir /etc/consul.d/agent
Start the agent:
$start consul
And check what we have:
$consul members -rpc-addr=10.0.0.11:8400
After launch, the output should look something like this:
```
Node                        Address         Status  Type    Build  Protocol  DC
server1.consul.example.com  10.0.0.11:8301  alive   server  0.6.4  2         production
server2.consul.example.com  10.0.0.12:8301  alive   server  0.6.4  2         production
server3.consul.example.com  10.0.0.13:8301  alive   server  0.6.4  2         production
lb.consul.example.com       10.0.0.1:8301   alive   client  0.6.4  2         production
```
In previous articles I described part of the functionality of our LB.
In this case the easiest way is to use EXEC discovery together with the locally installed Consul agent; this lets us look at the Consul cluster from the inside (after all, we may well add or remove some of the nodes in the future, and then we will not need to reconfigure anything).
Download the latest release (the version may change over time, so keep an eye on the releases):
```
$wget https://github.com/yyyar/gobetween/releases/download/0.3.0/gobetween_0.3.0_linux_amd64.tar.gz
$tar -xvf gobetween_0.3.0_linux_amd64.tar.gz
$cp gobetween_0.3.0_linux_amd64/gobetween /usr/sbin
```
So, let's create a script that determines the list of available Consul backend servers: it will return the servers that pass their checks, are registered in Consul, and carry the consul service.
Create a directory for configs and discovery scripts:
$mkdir /etc/gobetween/
Create a discovery script:
$vim /etc/gobetween/consul_node_discovery.sh
with the following content:
```bash
#!/bin/bash
curl -Ss http://0.0.0.0:8500/v1/catalog/service/consul | jq '.[] | .Address' | sed 's/"//g' | sed 's/$/:8500/'
```
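Note that the script relies on jq and must be executable before Gobetween can call it. Assuming an Ubuntu host, something like:

```bash
$ apt-get install -y jq
$ chmod +x /etc/gobetween/consul_node_discovery.sh
```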
If you run the script by hand, the output should look something like this:
```
10.0.0.11:8500
10.0.0.12:8500
10.0.0.13:8500
```
With this way of retrieving the list of working servers we will not use the balancer's own health checks; we leave that to the Consul cluster itself.
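If you do want discovery to reflect check state directly, a hedged alternative (my sketch, not the variant we ran) is to query the /v1/health endpoint, which can return only instances whose checks pass:

```bash
#!/bin/bash
# return only the consul instances that currently pass their health checks
curl -Ss 'http://127.0.0.1:8500/v1/health/service/consul?passing' \
    | jq -r '.[].Node.Address' | sed 's/$/:8500/'
```

Now let's configure Gobetween itself: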
$vim /etc/gobetween/gobetween.toml
[logging] level = "info" output = "stdout" [api] enabled = true bind = ":8888" [api.basic_auth] login = "admin" password = "admin" [defaults] max_connections = 0 client_idle_timeout = "0" backend_idle_timeout = "0" backend_connection_timeout = "0" [servers.consul] bind = "10.0.0.1:8500" protocol = "tcp" balance = "iphash" max_connections = 0 client_idle_timeout = "10m" backend_idle_timeout = "10m" backend_connection_timeout = "5s" [servers.consul.access] default = "deny" rules = [ "allow 99.99.99.1/24", # region1 docker nodes ip`s pool "allow 199.199.199.1/24", # region 2 docker nodes pool "allow 200.200.200.200/32", #mage node "allow 99.99.98.1/32 " #region-1 load balancer ] [servers.consul.discovery] failpolicy = "keeplast" interval = "10s" timeout = "5s" kind = "exec" exec_command = ["/etc/gobetween/consul_node_discovery.sh"]
Now let's create the upstart script /etc/init/gobetween.conf:
$vim /etc/init/gobetween.conf
```
# gobetween service
description "gobetween"

env DAEMON=/usr/sbin/gobetween
env NAME=gobetween
env CONFIG_PATH=/etc/gobetween/gobetween.toml
export GOMAXPROCS=`nproc`

start on runlevel [2345]
stop on runlevel [!2345]

kill signal INT

respawn
respawn limit 10 5
umask 022

expect stop

script
    exec $DAEMON -c $CONFIG_PATH 2>&1
end script

post-stop script
    pid=`pidof gobetween`
    kill -9 $pid
end script
```
And now let's start the balancer:
$start gobetween
Now we have a cluster; you can go to http://lb_ip:8888/servers/consul and check that the list of Consul servers was discovered successfully.
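The same check can be scripted against the REST API; the credentials come from the [api.basic_auth] section of the config:

```bash
# returns the JSON description of the "consul" server entry,
# including the backends picked up by EXEC discovery
$ curl -sS --user admin:admin http://50.50.50.50:8888/servers/consul
```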
Our nodes will live in 2 subnets:
```
99.99.99.1/24 - region 1
199.199.199.1/24 - region 2
```
Plus an external Elastic IP in Amazon: 50.50.50.50.
I see no point in repeating the steps of installing Docker on a server; they are covered in the official documentation. I will focus only on the specifics. Note also that this guide works for the Ubuntu 12 and 14 branches; Ubuntu 16 requires the same Docker daemon settings, but they are applied a little differently.
Let's start with the configuration of the docker daemon and its launch.
Edit the Docker daemon startup configuration:

$vim /etc/default/docker
Add the line below and comment out the remaining lines:
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --label region=region1 "
Next, restart Docker:
$service docker restart
After that, make sure the Docker daemon started with the required settings:
$ps ax |grep docker
You should see something like this:
10174 ? Ssl 264:59 /usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --label region=region-1
Now install Docker Swarm on the server where Docker is already running. I do not run Docker Swarm in containers; I like it when everything is transparent, when I can initialize it exactly the way I want and control it with a simple Upstart script.
The easiest way to get the Swarm binary is to pull its image onto a local machine, then unpack the image and extract the binary in any way you can. I prefer to build it myself. How best to do it I leave to the reader.
Suppose we already have the Docker Swarm binary and have copied it to our configured Docker server. Now we just need to set up Swarm by writing an Upstart script:
$vim /etc/init/swarm.conf

```
description "Docker Swarm join process"

start on (local-filesystems and net-device-up IFACE=eth0)
stop on runlevel [!12345]

respawn

setuid root
setgid root

exec bash -c 'swarm join --ttl 20s --heartbeat 4s --advertise=SERVER_IP:2375 consul://50.50.50.50:8500/swarmcluster/'
```
SERVER_IP you can easily fill in yourself; in my case it is substituted from the Elastic IP during server creation in Amazon with the help of Ansible. This is exactly the IP through which Swarm manage will connect to this Docker host.
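For illustration only (an assumption about how such substitution could be done, not our actual playbook), the placeholder could also be filled in on the node itself from the EC2 instance metadata service:

```bash
# hypothetical one-liner: substitute this instance's public IPv4
# for the SERVER_IP placeholder in the upstart script
$ sed -i "s/SERVER_IP/$(curl -sS http://169.254.169.254/latest/meta-data/public-ipv4)/" /etc/init/swarm.conf
```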
Now start our swarm join:
start swarm
You can check it with this request:
$curl -Ss "http://50.50.50.50:8500/v1/kv/swarmcluster/docker/swarm/nodes/?keys&separator=/&dc=production"
and you should get a response like this:
$ ["swarmcluster/docker/swarm/nodes/SERVER_IP:2375"]
Now scale the cluster to any number of regions in the same way, changing the labels in the Docker daemon settings for each region. You can also attach several labels to each server; it is convenient to divide servers by region and also by, say, CPU performance, memory size or disk type, as shown in the sketch below.
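For example, a region-2 node with a few extra labels might be configured like this (the label names beyond region are purely illustrative):

```bash
# /etc/default/docker on a hypothetical region-2 node
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --label region=region-2 --label storage=ssd --label cpu=fast"
```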
So, now let's proceed directly to installing our scheduling manager. In fact, installing it is not much different from installing swarm join.
Again, repeat the steps of copying the binary to the server, and then create the Upstart script:
$vim /etc/init/swarm.conf
description "Consul server process" start on (local-filesystems and net-device-up IFACE=eth0) stop on runlevel [!12345] respawn setuid root setgid root exec bash -c 'swarm manage -H tcp://$0.0.0.0:2377 -strategy "binpack" consul://50.50.50.50:8500/swarmcluster/'
The manage process listens on all interfaces; SWARM_MANAGE_IP, the address clients use to reach it, is in our case 200.200.200.200. Let's look at -strategy: this option determines how containers are distributed across the nodes that match all the selection parameters. With the binpack strategy the first node is filled with containers completely and only then the second. If you have hundreds of container starts/stops per hour, this avoids fragmentation and allows you to remove unneeded nodes from the cluster.
There are 3 types of container distribution strategies:
spread - distribution to the least loaded node
binpack - the densest possible packing of containers

random - the name says it all :) used only for debugging.
Now, finally, run our swarm manage:
$service swarm start
and check what we have:
$docker -H 0.0.0.0:2377 info
and get something like this:
```
Containers: 1
Images: 3
Server Version: swarm/1.2.4
Role: primary
Strategy: binpack
Filters: health, port, dependency, affinity, constraint
Nodes: 3
 host1: 99.99.99.1:2375
  └ Status: Healthy
  └ Containers: 0
  └ Reserved CPUs: 0 / 8
  └ Reserved Memory: 0 B / 16.46 GiB
  └ Labels: executiondriver=, kernelversion=3.13.0-86-generic, operatingsystem=Ubuntu 14.04.4 LTS, region=region-1, storagedriver=devicemapper
  └ Error: (none)
  └ UpdatedAt: 2016-08-21T14:40:03Z
 host2: 99.99.99.2:2375
  └ Status: Healthy
  └ Containers: 0
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 3.86 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.13.0-74-generic, operatingsystem=Ubuntu 14.04.3 LTS, region=region-1, storagedriver=devicemapper
  └ Error: (none)
  └ UpdatedAt: 2016-08-21T14:40:42Z
 host3: 199.199.199.1:2375
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 512 MiB / 3.86 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.13.0-74-generic, operatingsystem=Ubuntu 14.04.3 LTS, region=region-2, storagedriver=devicemapper
  └ Error: (none)
  └ UpdatedAt: 2016-08-21T14:40:55Z
Kernel Version: 3.13.0-44-generic
Operating System: linux
CPUs: 12
Total Memory: 24.18 GiB
Name: lb.ourcoolcluster.com
```
In fact, you can already run a container:
$docker -H tcp://0.0.0.0:2377 run -d -P -e constraint:region==region-1 hello-world
More information about filters and scheduling policies can be found in the Docker Swarm documentation.
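To make the filters above concrete, here are a couple of illustrative examples (the storage label and the web1 container are hypothetical):

```bash
# constraint filter: only nodes labeled storage=ssd qualify
$ docker -H tcp://0.0.0.0:2377 run -d -e constraint:storage==ssd nginx

# affinity filter: schedule next to a running container named web1
$ docker -H tcp://0.0.0.0:2377 run -d -e affinity:container==web1 nginx
```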
So, we have a cluster in which we can start a container depending on the region and other data. What next? Let's try to organize an entry point for each region.
Now we will try to build service discovery using exclusively the internal mechanisms of standalone Swarm. We will use labels when launching containers. You can do this by hand, or write your own engine that launches the required container and at the same time talks to the balancer's REST API. In this scheme, the LB is configured once, when a new service first starts in a region where it did not exist before. After that you can safely start replicas of the service as the load grows, and stop containers when it drops; the balancer will discover the list of nodes providing the service.
Also, if necessary, you can easily keep a replica of Consul in each region, as well as a replica of swarm manage. But we will consider the simplest scheme.
The general scheme of how everything will work:
We will skip the Gobetween installation and only show the config it is launched with:
[logging] level = "info" output = "stdout" [api] enabled = true bind = ":8888" [api.basic_auth] login = "admin" password = "admin" [defaults] max_connections = 0 client_idle_timeout = "0" backend_idle_timeout = "0" backend_connection_timeout = "0"
This is quite enough to start the balancer; we will do all subsequent configuration through the REST API (usually dedicated services handle this). Also, to simplify testing, create a file /tmp/test on each Docker server and put information unique to that server in it, for example "host1" and "host2".
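A minimal way to do that on each node:

```bash
# on the first docker node
$ echo "host1" > /tmp/test
# on the second docker node
$ echo "host2" > /tmp/test
```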
For example, run the containers:
```
$docker run -l service=region-1.nginx -d -p 22001:80 -e constraint:region==region-1 -v /tmp/test:/usr/share/nginx/html:ro nginx
$docker run -l service=region-1.nginx -d -p 22001:80 -e constraint:region==region-1 -v /tmp/test:/usr/share/nginx/html:ro nginx
```
If region-1 has 2 or more nodes, the containers will start (Docker Swarm checks port availability for the mapping). In the case of a single Docker node in the region, you can run the 2 containers like this:
```
$docker run -l service=region-1.nginx -d -p 22002:80 -e constraint:region==region-1 -v /tmp/test:/usr/share/nginx/html:ro nginx
$docker run -l service=region-1.nginx -d -p 22001:80 -e constraint:region==region-1 -v /tmp/test:/usr/share/nginx/html:ro nginx
```
The containers are running in the right region and working. Now it's time to set up our balancer:
```
$curl --user admin:admin -XPOST "http://50.50.50.50:8888/servers/r1nginx" --data '
{
    "bind": "LB_IP:LB_PORT",
    "protocol": "tcp",
    "balance": "weight",
    "max_connections": "0",
    "client_idle_timeout": "10m",
    "backend_idle_timeout": "10m",
    "backend_connection_timeout": "1m",
    "healthcheck": {
        "kind": "ping",
        "interval": "2s",
        "timeout": "1s"
    },
    "discovery": {
        "kind": "docker",
        "docker_endpoint": "http://50.50.50.50:2377",
        "docker_container_private_port": "80",
        "docker_container_label": "service=region-1.nginx"
    }
}'
```
Where:
LB_IP - the IP address, on the server running the balancer, that faces the clients.

LB_PORT - the TCP port on LB_IP that the clients will connect to.
Now you can check what we got:
```
$curl -sS http://LB_IP:LB_PORT
host1
$curl -sS http://LB_IP:LB_PORT
host2
```
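When a service leaves a region entirely, the corresponding entry can be removed through the same REST API (a sketch, assuming the DELETE verb of the gobetween API):

```bash
$ curl --user admin:admin -XDELETE "http://50.50.50.50:8888/servers/r1nginx"
```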
So, we have considered one of the simplest, yet quite functional, ways of building a geo-distributed cluster on standalone Docker Swarm. The installation is fairly simple and transparent, as is the troubleshooting. Thinking this article through, I was aware that I could not cover many aspects of building and operating a system of this type; rather, I wanted the reader to look at the problem from a different angle, one of necessary sufficiency and asceticism: it is difficult to make things simple, and simple to make them complicated.
Security and HA issues I leave to the readers' discretion; perhaps, if there is interest, I will try to cover them in subsequent articles.
Source: https://habr.com/ru/post/308182/