
Cluster service on Erlang: from idea to deb-package

Task


We need to write a real service in Erlang that works in a cluster. On top of that, we want to make life as easy as possible for the people who will operate it.
Requirements:

For the sake of simplicity, the service will be a counter that hands each client an integer which grows with every request (unique until the counter is restarted).

Technology


Choose all the most fashionable and modern:



Architecture


Cowboy will listen on some port. Each request is processed by our handler, which calls the counter, then responds to the client and writes a log entry.
The counter will be registered in global so that it can be easily reached from any node in the cluster.
On startup the counter tries to register itself. If that fails (a counter is already registered on another node), it waits for the opportunity to register.

Application skeleton


We need to make an OTP application by the book, but with a minimum of effort.
Create an erdico directory for the project, run git init in it, download the erlang.mk file from the repository of the project of the same name and create a simple Makefile:
 PROJECT = erdico
 ERLC_OPTS = "+{parse_transform, lager_transform}"
 DEPS = cowboy lager
 dep_cowboy = pkg://cowboy 0.10.0
 dep_lager = https://github.com/basho/lager.git 2.0.3
 include erlang.mk

Mac OS / BSD users: you will need wget. On Linux it seems to be available everywhere out of the box these days.
Note that cowboy is pulled in as a known package. erlang.mk does have a package index, although it is a small one.

In the src/erdico.app.src file we describe our application (all parameters are required, otherwise erlang.mk or relx will break):
 {application, erdico, [
     {description, "Hello, Upstart distributed Erlang service"},
     {id, "ErDiCo"},
     {vsn, "0.1"},
     {applications, [kernel, stdlib, lager, cowboy]}, % run-time dependencies
     {modules, []},         % erlang.mk inserts all application modules here; not added automatically, required by relx
     {mod, {erdico, []}},   % application callback module
     {registered, [erdico]} % required by relx
 ]}.


Create the src/erdico.erl file, but for now put nothing in it apart from the -module(erdico). directive.
In this state, make should already download the dependencies and build everything it finds.

Application launch, cowboy and the simplest request handler (launcher, handler)


For simplicity, I put all the control code into a single erdico module. Purists can split it into four modules; everyone else will extract only those pieces whose logic becomes noticeably non-trivial and therefore deserves a module of its own.

HTTP server

This is roughly the minimal configuration. What else can be set there is described in the documentation.
 start_cowboy() ->
     DefPath = {'_', erdico_handler, []},      % Catch-all path
     Host = {'_', [DefPath]},                  % No virtualhosts
     Dispatch = cowboy_router:compile([Host]),
     Env = [{env, [{dispatch, Dispatch}]}],
     cowboy:start_http(?MODULE, 10, [{port, 2080}], Env).


Request handler

For now everything here is primitive:
 -module(erdico_handler).
 -behavior(cowboy_http_handler).
 -export([init/3, handle/2, terminate/3]).

 init(_Type, Req, _Options) ->
     {ok, Req, nostate}.

 handle(Req, nostate) ->
     {ok, Replied} = cowboy_req:reply(200, [], <<"hello\n">>, Req),
     {ok, Replied, nostate}.

 terminate(_Reason, _Req, nostate) ->
     ok.


Build, run, check

To build, just run make.
To start, you need to specify the directory with the dependencies and the directory with our application's binaries.
 stolen@node1:~/erdico$ ERL_LIBS=deps erl -pa ebin -s erdico

Erlang console
 Erlang/OTP 17 [erts-6.1] [source-d2a4c20] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]

 Eshell V6.1  (abort with ^G)
 1> 15:01:14.486 [info] Application lager started on node nonode@nohost
 15:01:14.493 [info] Application ranch started on node nonode@nohost
 15:01:14.506 [info] Application crypto started on node nonode@nohost
 15:01:14.506 [info] Application cowlib started on node nonode@nohost
 15:01:14.513 [info] Application cowboy started on node nonode@nohost
 15:01:14.530 [info] Application erdico started on node nonode@nohost

 1>

You can see that even lager has somehow started working (besides the console, it also wrote to disk).

 stolen@node2:~$ curl node1:2080
 hello


Counter


Well, the application starts and runs. It is time to give its existence some meaning.
I will not go into the implementation details; just read the patch.
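For readers who do not want to open the patch, here is a minimal sketch of the idea, not the actual code from the repository. It assumes a gen_server that registers itself in global and, if the name is already taken, monitors the current owner and retries when that owner goes away. The module name erdico_counter_sketch and its global name are hypothetical; only the inc/0 and inc/1 calls and their {ok, Value} replies are taken from the demonstration in this post.

```erlang
-module(erdico_counter_sketch).
-behaviour(gen_server).

-export([start_link/0, inc/0, inc/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Hypothetical global name used only by this sketch.
-define(NAME, erdico_counter_sketch).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

inc() ->
    inc(1).

%% Add N to the counter on whichever node currently owns the global name.
inc(N) when is_integer(N), N > 0 ->
    gen_server:call({global, ?NAME}, {inc, N}).

init([]) ->
    _ = global:sync(),      % let global catch up with the nodes we know about
    ok = try_register(),
    {ok, 0}.

handle_call({inc, N}, _From, Value) ->
    NewValue = Value + N,
    {reply, {ok, NewValue}, NewValue}.

handle_cast(_Msg, Value) ->
    {noreply, Value}.

%% The current owner died (or vanished before we could monitor it): try again.
handle_info({'DOWN', _Ref, process, _Pid, _Reason}, Value) ->
    ok = try_register(),
    {noreply, Value};
handle_info(retry, Value) ->
    ok = try_register(),
    {noreply, Value};
handle_info(_Other, Value) ->
    {noreply, Value}.

terminate(_Reason, _Value) ->
    ok.

code_change(_OldVsn, Value, _Extra) ->
    {ok, Value}.

%% Become the owner if the name is free; otherwise watch the current owner.
try_register() ->
    case global:register_name(?NAME, self()) of
        yes -> ok;
        no  -> wait_for(global:whereis_name(?NAME))
    end.

wait_for(undefined) ->
    self() ! retry,
    ok;
wait_for(Owner) when is_pid(Owner) ->
    _ = erlang:monitor(process, Owner),
    ok.
```

A process that fails to register stays alive under its supervisor and simply waits, so every node always has a candidate ready to take over the name.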

Demonstration

For now we will run both Erlang nodes on the same host node1: e1@node1 and e2@node1. To make that possible, the port the server listens on is configured from the command line.
On the first node we wind the counter up to 20, on the second one up to 1. Then we join the nodes into a cluster and see that the counter on the second node gets killed, after which calls made on the second node go to the counter on the first node.
e1@node1
 stolen@node1:~/erdico$ ERL_LIBS=deps erl -pa ebin -s erdico -setcookie erdico -sname e1 -erdico port 2081
 Erlang/OTP 17 [erts-6.1] [source-d2a4c20] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]
 ...............
 (e1@node1)2> erdico_counter:inc(10).
 {ok,20}
 (e1@node1)3> 16:11:30.422 [info] global: Name conflict terminating {erdico_counter,<10869.102.0>}
 (e1@node1)3> erdico_counter:inc().
 {ok,22}
e2@node1
 stolen@node1:~/erdico$ ERL_LIBS=deps erl -pa ebin -s erdico -setcookie erdico -sname e2 -erdico port 2082
 Erlang/OTP 17 [erts-6.1] [source-d2a4c20] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]
 ..............
 (e2@node1)1> erdico_counter:inc().
 {ok,1}
 (e2@node1)2> net_adm:ping(e1@node1).
 pong
 (e2@node1)3> 16:11:30.423 [error] Supervisor erdico child counter started with erdico_counter:start_link() at <0.102.0>
 (e2@node1)3> erdico_counter:inc().
 {ok,21}


Cowboy and counter


Well, this part is simple.
Works!
 stolen@node2:~$ curl node1:2081
 value = 1
 stolen@node2:~$ curl node1:2082
 value = 2
 stolen@node2:~$ curl node1:2081
 value = 3
 stolen@node2:~$ curl node1:2082
 value = 4
 stolen@node2:~$ curl node1:2082
 value = 5
 stolen@node2:~$ curl node1:2081
 value = 6


This is where the simple part of the post ends.

access.log


Lager is about the only living logging framework for Erlang. Unfortunately, it lacks concise documentation with real-life examples. I hope this post will become such an example, at least for RuNet.
In addition, the Internet is not very generous with examples of writing an access.log for cowboy. I hope this post fixes that as well.

lager tracing

In the lager configuration, events are routed to files by severity. That does not suit us: to write HTTP server logs we need to send an event explicitly to a specific log. Lager has a special mechanism for this called tracing, and that is what we will use.
At this stage we already need a config file.
Here we redirect the crash log, create a log for more or less significant events, and declare access.log, which is written only via tracing, when the event metadata contains {tag, access}. The format is fairly self-explanatory: strings are inserted literally, and atoms are replaced with the metadata values for the corresponding keys (how to use this is shown later).
For all the configured logs, rotation happens at midnight and 5 old files are kept. Rotation by log size is disabled.
erdico.config
Whole file
 [
  {lager, [
    {crash_log, "logs/crash.log"},
    {crash_log_size, 0},
    {crash_log_date, "$D0"},
    {crash_log_count, 5},
    {error_logger_hwm, 20},
    {async_threshold, 30},
    {async_threshold_window, 10},
    {handlers, [
      {lager_file_backend, [
        {file, "logs/events.log"}, {level, notice},
        {size, 0}, {date, "$D0"}, {count, 5},
        {formatter, lager_default_formatter},
        {formatter_config, [date, " ", time, " [", severity, "] ", pid, " ", message, "\n"]}]},
      {lager_file_backend, [
        {file, "logs/access.log"}, {level, none},
        {size, 0}, {date, "$D0"}, {count, 5},
        {formatter, lager_default_formatter},
        {formatter_config, [date, " ", time, " [", severity, "] ", pid, " ",
                            peer, " \"", method, " ", url, "\" ", status, "\n"]}]}
    ]},
    {traces, [
      {{lager_file_backend, "logs/access.log"}, [{tag, access}], info}
    ]}
  ]}
 ].
Run, check
 stolen@node1:~/erdico$ ERL_LIBS=deps erl -pa ebin -config erdico.config -s erdico -setcookie erdico -sname e1 -erdico port 2081
 Erlang/OTP 17 [erts-6.1] [source-d2a4c20] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]

 Eshell V6.1  (abort with ^G)
 (e1@node1)1> lager:log(notice, [{pid, self()}], "hello ~s ~w", [world, 2.7]).
 ok
 (e1@node1)3> lager:log(info, [{pid, self()}, {tag, access}, {peer, "fake"}, {status, 418}], "", []).
 ok

Result:
 stolen@node1:~/erdico$ cat logs/events.log
 2014-06-28 17:22:43.994 [notice] <0.39.0> hello world 2.7
 stolen@node1:~/erdico$ cat logs/access.log
 2014-06-28 17:25:57.286 [info] <0.39.0> fake "Undefined Undefined" 418


cowboy onresponse hook

I want to offload as much work as possible onto code that already exists. So instead of adding logging at every place that calls cowboy_req:reply/4, we plug logging into cowboy itself. As it turns out, there is even a dedicated place for this: the onresponse hook. Documentation is your friend.
The straightforward solution looks like this and writes
good logs
 stolen@node1:~/erdico$ cat logs/access.log
 2014-06-28 17:54:44.429 [info] <0.103.0> 10.0.2.4 "GET http://node1:2081/" 200
 2014-06-28 17:54:46.085 [info] <0.104.0> 10.0.2.4 "GET http://node1:2081/" 200


non-blocking hook

Those who have read the onresponse hook documentation have already guessed that in the solution above the response is sent strictly after the log is written.
This means that a stalled logger (a slow disk, for example) increases the response time.
It also means that if we decide to log the request processing time, it will not include the time spent on logging and may differ noticeably from what the client observes.
So we look at the documentation once more and rework the hook so that logging happens strictly after the response has been sent to the client.
More correct hook
 access_log_hook(Status, Headers, Body, Req) ->
     {[{PeerAddr, _}, Method, Url], Req2} =
         lists:mapfoldl(fun get_req_prop/2, Req, [peer, method, url]),
     {ok, ReqReplied} = cowboy_req:reply(Status, Headers, Body, Req2),
     PeerStr = inet_parse:ntoa(PeerAddr),
     lager:info([{tag, access}, {peer, PeerStr}, {method, Method}, {url, Url}, {status, Status}], ""),
     ReqReplied.

 get_req_prop(Prop, Req) ->
     cowboy_req:Prop(Req).


switchable log

When you want to measure RPS, you need a way not to write a log line for every request.
So let there be no hook at all if the configuration explicitly says that the log is not needed.
After this patch, adding the "-erdico log_access false" parameter to the launch command disables the log.
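The patch itself is not reproduced here, so the following is only a sketch of how such a switch might look. It assumes the hook is passed to cowboy through the onresponse protocol option next to env, and that the flag is read from the erdico application environment. The module name erdico_opts_sketch and the function proto_opts/3 are hypothetical.

```erlang
-module(erdico_opts_sketch).
-export([proto_opts/3]).

%% Build the protocol options that are passed as the last argument
%% of cowboy:start_http/4.
%%   Dispatch  - result of cowboy_router:compile/1
%%   Hook      - the onresponse fun, e.g. fun access_log_hook/4
%%   LogAccess - false disables the access log, anything else enables it
proto_opts(Dispatch, _Hook, false) ->
    [{env, [{dispatch, Dispatch}]}];
proto_opts(Dispatch, Hook, _LogAccess) ->
    [{onresponse, Hook}, {env, [{dispatch, Dispatch}]}].
```

In start_cowboy/0 the flag could then be obtained with application:get_env(erdico, log_access, true), because "-erdico log_access false" on the command line sets that application parameter to the atom false.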

Releases and relx


Releases are probably one of the biggest pains of developing in Erlang. relx was made to spare the user this pain. (Spoiler: not quite.)

Just build release

Once this file is filled in, running make builds the release in the _rel directory:
relx.config
 {release, {erdico, "0.1"}, [erdico]}.
 {extended_start_script, true}.

It did not start for me without the extended start script, and we will need it later anyway.
Launch release
 stolen@node1:~/erdico$ _rel/erdico/bin/erdico console
 Exec: /home/stolen/erdico/_rel/erdico/erts-6.1/bin/erlexec -boot /home/stolen/erdico/_rel/erdico/releases/0.1/erdico -env ERL_LIBS /home/stolen/erdico/_rel/erdico/releases/0.1/lib -config /home/stolen/erdico/_rel/erdico/releases/0.1/sys.config -args_file /home/stolen/erdico/_rel/erdico/releases/0.1/vm.args -- console
 Root: /home/stolen/erdico/_rel/erdico
 /home/stolen/erdico/_rel/erdico
 Erlang/OTP 17 [erts-6.1] [source-d2a4c20] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]

 18:39:18.318 [info] Application lager started on node 'erdico@127.0.0.1'
 18:39:18.321 [info] Application cowboy started on node 'erdico@127.0.0.1'
 18:39:18.343 [info] Application erdico started on node 'erdico@127.0.0.1'
 Eshell V6.1  (abort with ^G)
 (erdico@127.0.0.1)1>

As you can easily see, the lager settings did not make it into the release. The tool also assigned a node name that is not very suitable for working in a cluster. We will solve these and other problems below.

Including suitable settings in the release

So, we want the release to start with the correct node name and to connect to its sibling nodes in the cluster at startup. We also want these and other settings to live in a file with a clear syntax that does not fall apart because of a missing comma.
To begin with, we hardcode everything.
Note the kernel options sync_nodes_optional and sync_nodes_timeout: together they make the node connect to the listed siblings at startup and wait up to 1 second for them to respond. During that second the global:sync() call in the counter blocks, which avoids needless deaths at startup.
You can obviously put other options into vm.args as well. But if neither -name nor -sname is specified, the release does not start.
Now the whole release can be copied to the second node, and after launch the cluster magically assembles itself: the curl check passes. Note that Erlang is not installed on the second node, so the release is self-contained.
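For reference, a hardcoded sys.config along these lines might look like the sketch below. The node names are hypothetical placeholders, and the lager section from erdico.config would sit next to the kernel section. The only values taken from the text are the two kernel options and the 1 second timeout, which is given in milliseconds.

```erlang
%% sys.config (sketch; node names are assumptions, not the real ones)
[
 {kernel, [
   %% nodes to connect to at startup; missing ones are tolerated
   {sync_nodes_optional, ['erdico@node1', 'erdico@node2']},
   %% how long to wait for them, in milliseconds
   {sync_nodes_timeout, 1000}
 ]}
].
```

To get such a file into the release, relx.config can point at it with the sys_config option, and at the arguments file with vm_args, for example {sys_config, "sys.config"}. and {vm_args, "vm.args"}. (the file locations here are assumed).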

Variable expansion by the release script

One of the great features relx gives us is variable expansion. To see how it happens, look for the string RELX_REPLACE_OS_VARS in the launch script _rel/erdico/bin/erdico. It is so simple that it is not even flexible.
Parameterized config
Parameterize the list of sibling nodes:
 {sync_nodes_optional, [${CLUSTERNODES}]}

Run it like this:
 RELX_REPLACE_OS_VARS=1 CLUSTERNODES=erdico@node2 _rel/erdico/bin/erdico console


One problem: without variable expansion the release no longer starts.

Hack: running without variable expansion

To make the release start both with and without expansion, I came up with the following hack. Since expansion will be driven by the upstart script anyway, and that script will read the human-friendly config at the same time, we hide all the variables inside a comment and add a variable that terminates the comment. Here is a patch that allows running the release either as is or with the neighboring nodes specified:
 RELX_REPLACE_OS_VARS=1 CLUSTERNODES="erdico@node2, erdico@node1" NL=$'\n' _rel/erdico/bin/erdico console
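The trick is easier to see on a fragment. The sketch below is my reconstruction of the idea rather than the literal patch; the placement of the line and the timeout value are assumptions. Without expansion the whole line is an Erlang comment and the file parses as is. With expansion, ${NL} becomes a line break, the comment ends there, and the option that follows it becomes live.

```erlang
%% sys.config fragment (sketch of the comment hack)
[
 {kernel, [
   %% ${NL} {sync_nodes_optional, [${CLUSTERNODES}]},
   {sync_nodes_timeout, 1000}
 ]}
].
```

Keeping the trailing comma inside the hidden line means the list stays well-formed in both cases.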


Combo hack: overriding the node name

Let's make it possible to start the release by hand without conflicting with the production instance. For that, the node name has to be parameterized too. While we are at it, the parameter will also let us set the full name (with the FQDN).
On the one hand, vm.args cannot be left without a node name. On the other, the previous hack lets us add a line to the config but not remove one. And third, if you give Erlang several names, which one it picks is not very predictable.
It turned out that everything written in vm.args after the -extra directive goes into a separate section of the parameters and is not read by the kernel. We will take advantage of this.
A parameterized start now looks like this:
 RELX_REPLACE_OS_VARS=1 CLUSTERNODES="'erdico@node2.example.net', 'erdico@node1.example.net'" FQDN=`hostname -f` NL=$'\n' _rel/erdico/bin/erdico console


Build a deb package


Debian gives the developer a lot of pain. The pain begins with a heap of files in the debian directory and continues with the inability to specify the project root, an alternative location for the debian directory, or the path where the built packages should be put.
It is known that built packages land one directory level above the directory with the project sources. It follows that all this filth must be buried deep.
Also, the scripting options in an upstart config are very limited, so I had to wrap the start script in another script, conf_erdico.sh, which prepares a valid environment.
It turned out that lager cannot write logs located behind a symlink (because of how filelib:ensure_dir/1 behaves). So I had to stick hacks into the config to replace the paths to the logs.
In fact, since the external script had to be written anyway, all the substitutions in the configs could have been done with sed. Let it stay as it is, as a proof of concept.
Tricks used in packaging
(whole commit)
  • a build directory pkg/erdico is created; it holds the debian directory with all its guts and the additional files
  • the top-level Makefile gained a deb target that refers to a Makefile in the package directory
  • for the all (build) target, the Makefile in the package directory calls the top-level make to build the current release
  • to keep upstart happy, the startup script is given the foreground parameter; with a traditional init you can use the start, stop and ping parameters
  • since the startup script, when editing configs, puts the generated files strictly next to the originals, I had to make symlinks from /var/lib/erdico/
  • the variable-expansion hacks in the lager configuration rely on the way proplists work
  • using the shell, the host list (FQDN) in /etc/erdico.conf is expanded into a list of nodes (with single quotes so that they are atoms)
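The package does the last step in shell. Purely as an illustration of what that expansion produces, here is the same transformation written in Erlang; the module and function names are hypothetical, and the node name prefix is passed in as an argument.

```erlang
-module(nodelist_sketch).
-export([cluster_nodes/2]).

%% Turn a whitespace-separated host list into a comma-separated list
%% of single-quoted node names, ready to be substituted for
%% ${CLUSTERNODES} inside [ ... ] in sys.config.
cluster_nodes(Name, Hosts) ->
    Quoted = ["'" ++ Name ++ "@" ++ Host ++ "'"
              || Host <- string:tokens(Hosts, " \t\n")],
    string:join(Quoted, ",").
```

The single quotes matter: a name such as erdico@node1.example.net contains dots, so it is only a valid atom when quoted.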



Build, install, configure, run!


First (build) machine
 stolen@node1:~/erdico$ make deb
 stolen@node1:~/erdico$ sudo dpkg -i pkg/erdico_0.1_amd64.deb
 stolen@node1:~/erdico$ scp pkg/erdico_0.1_amd64.deb node2:
 stolen@node1:~/erdico$ sudo vim /etc/erdico.conf # CLUSTERHOSTS="node1.example.net node2.example.net"
 stolen@node1:~/erdico$ sudo service erdico start
Second machine
 stolen@node2:~$ sudo dpkg -i erdico_0.1_amd64.deb
 stolen@node2:~$ sudo vim /etc/erdico.conf # CLUSTERHOSTS="node1.example.net node2.example.net"
 stolen@node2:~$ sudo service erdico start


Works!


After rebooting both machines
 stolen@node1:~$ curl node1:2080
 value = 1
 stolen@node1:~$ curl node2:2080
 value = 2
 stolen@node1:~$ curl node1:2080
 value = 3
 stolen@node1:~$ curl node2:2080
 value = 4
 stolen@node1:~$ tail -5 /var/log/erdico/access.log
 2014-06-29 00:43:03.044 [info] <0.380.0> 10.0.2.4 "GET http://node1:2080/" 200
 2014-06-29 00:54:34.563 [info] <0.424.0> 10.0.2.4 "GET http://node1:2080/" 200
 2014-06-29 00:54:36.932 [info] <0.425.0> 10.0.2.4 "GET http://node1:2080/" 200
 2014-06-29 00:56:10.709 [info] <0.383.0> 10.0.2.15 "GET http://node1:2080/" 200
 2014-06-29 00:56:14.490 [info] <0.384.0> 10.0.2.15 "GET http://node1:2080/" 200


Promised REST


Here, it has been added.
Demo
 stolen@node1:~$ curl node1:2080
 value = 1
 stolen@node1:~$ curl node2:2080
 value = 2
 stolen@node1:~$ curl node1:2080/inc/400
 value = 402
 stolen@node1:~$ curl node2:2080
 value = 403
 stolen@node1:~$ curl node1:2080
 value = 404


Moral


Life is pain.
Lager is good, but its config lacks flexibility (for example, a way to specify the root directory and the default file log options once per config).
Cowboy is good, but you need to understand how it works so that performance does not sag.
Debian is good, but package building for it was made by mutants, for mutants.
Upstart is good, but it allows too little in the service configuration, so the logic has to go into an additional script.
Erlang is good, until you need to hand an application written in it over to people who do not know it.
Dependency managers for Erlang exist and work, but they have not solved dependency hell.
Building releases in Erlang still hurts, though less and less. Relx is still waiting for commits without which it remains inconvenient to use. In addition, it can go crazy if there is a symlink cycle or a built release somewhere in the dependencies.

What else can be done in this application


First, you can replicate the counter. But if a notification about every request is sent to all the nodes of the cluster, that creates a bottleneck.
Second, you can add a process that constantly pings the neighbors listed in the settings. Without it, Erlang copes poorly with network outages.
Third, add a status endpoint that shows on which cluster nodes the application is running and on which of them the master currently lives.
Fourth, return the host where the master currently lives in a response header. A smart client can then go straight there next time and avoid pushing traffic between nodes.
Fifth, cut all the hacks out of the configs and do all the substitutions with sed and friends.
Sixth, you can move the onresponse hook for the cowboy-lager combination into a separate project and teach it to translate the format atoms into request property values automatically. In addition, you can collect all sorts of metrics, such as processing time and the traffic spent on serving a request.
Seventh, look into log4erl.
Eighth, look into epm. As a stretch goal, make the dependency managers play well together.
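The second item in this list is small enough to sketch right away. This is not code from the project: the module name pinger_sketch, the plain spawn_link loop and the interval parameter are all assumptions, and a production version would more likely be a gen_server under the application supervisor. It only relies on net_adm:ping/1, which returns pong or pang.

```erlang
-module(pinger_sketch).
-export([start_link/2, ping_all/1]).

%% Start a linked process that pings Nodes every IntervalMs milliseconds.
start_link(Nodes, IntervalMs) when is_list(Nodes), is_integer(IntervalMs) ->
    {ok, spawn_link(fun() -> loop(Nodes, IntervalMs) end)}.

loop(Nodes, IntervalMs) ->
    _ = ping_all(Nodes),
    timer:sleep(IntervalMs),
    loop(Nodes, IntervalMs).

%% Ping every node except ourselves; returns [{Node, pong | pang}].
ping_all(Nodes) ->
    [{Node, net_adm:ping(Node)} || Node <- Nodes, Node =/= node()].
```

A successful ping re-establishes the connection, after which global can sort out who owns the counter name.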

Source: https://habr.com/ru/post/227943/
