
Using Rebar and GProc

Using rebar



This tutorial may contain outdated information, since Rebar is very actively developed without preserving compatibility with previous versions.

When developing in Erlang, you often need to collect dependencies from different sources, track their versions, and build OTP releases for distributing projects. These chores are routine and unpleasant. To make development less painful, Basho created a very convenient tool - Rebar. In this article I will try to show its benefits on a real-world example that uses a third-party dependency and builds configurable OTP releases.

Download Rebar from hg.basho.com/rebar/downloads/rebar . It is a single file containing several beam modules. Place it anywhere on your executable search path (PATH), for example ~/bin/ or /usr/local/bin/ .
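For example, assuming you saved the downloaded file as rebar in the current directory (the exact version flag may vary between rebar releases):
 $ mv rebar ~/bin/ && chmod +x ~/bin/rebar
 $ rebar --version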
Let's start the project.
First, create a directory in which our project will be located ( gpt ) and go into it:
 $ mkdir gpt && cd gpt

In it, we will create a subdirectory where we will save the source files of our application directly:
 $ mkdir -p apps/gpt && cd apps/gpt


Making an application skeleton:
 $ rebar create-app appid=gpt


The appid parameter specifies the name of our application and, accordingly, the prefix of the source files.
 $ ls -1 src
 gpt_app.erl
 gpt.app.src
 gpt_sup.erl


Add the application description and the dependency on gproc to the .app template ( src/gpt.app.src ):
   {description, "GProc tutorial"},
   ...
   {applications,
    [
     kernel,
     stdlib,
     gproc  % <--- the application depends on gproc
    ]},
   ...
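For reference, a complete src/gpt.app.src might then look roughly like this (a sketch; the file generated by rebar create-app already contains most of these fields):
 {application, gpt,
  [
   {description, "GProc tutorial"},
   {vsn, "1"},
   {registered, []},
   {applications,
    [
     kernel,
     stdlib,
     gproc
    ]},
   {mod, {gpt_app, []}},
   {env, []}
  ]}.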


Go back to the top level directory where our project is stored, create a rel subdirectory in it and go there:
 $ cd ../../
 $ mkdir rel && cd rel


The rel directory will contain the files needed to create a release: everything required to run the project, including all its runtime dependencies.
Using rebar, create a stub for the node by passing its name in the nodeid parameter:
 $ rebar create-node nodeid=gptnode


Edit the file reltool.config :
   ...
   {lib_dirs, ["../deps", "../apps"]},  % <--- reltool will look for dependencies and our application in these directories
   {rel, "gptnode", "1",
    [
     kernel,
     stdlib,
     sasl,
     gproc,  % <--- the gproc application
     gpt     % <--- our application
    ]},
   ...


Then you can edit the files/vm.args file by changing, say, the node name:
 -name gptnode@127.0.0.1


to
 -sname gptnode@localhost


Back in the top-level directory:
 $ cd ../


and create a file rebar.config with the following content:
 %% Directory where dependencies will be placed.
 {deps_dir, ["deps"]}.

 %% Subdirectories that rebar should look at
 {sub_dirs, ["rel", "apps/gpt"]}.

 %% Compiler Options
 {erl_opts, [debug_info, fail_on_warning]}.

 %% List of Dependencies
 %% In the gproc directory the master branch of the corresponding git repository will be cloned.
 {deps,
  [
   {gproc, ". *", {git, "http://github.com/esl/gproc.git", "master"}}
  ]}.
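If you want a reproducible build, the dependency can be pinned to a tag or a branch other than master; the tag name below is only a placeholder, check the gproc repository for real tags:
 {deps,
  [
   %% {tag, ...} and {branch, ...} are both accepted in rebar dependency specs
   {gproc, ".*", {git, "http://github.com/esl/gproc.git", {tag, "some-tag"}}}
  ]}.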


Now we are ready to create a release. Run several rebar commands (command output omitted):
 $ rebar get-deps
 $ rebar compile
 $ rebar generate


The get-deps command downloads the dependencies; in our case, that is the gproc application. The compile command, obviously, compiles all the source files, and generate creates a release.
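The commands can also be chained in a single invocation, which classic rebar accepts:
 $ rebar get-deps compile generate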
The rel/gptnode directory can be safely moved to other hosts (subject to binary compatibility, of course, since the release includes the Erlang virtual machine). After creating the release, run what we got:
 (cd rel/gptnode && sh bin/gptnode console)


Make sure that all the necessary applications are running:
 (gptnode@localhost)1> application:which_applications().
 [{sasl,"SASL CXC 138 11","2.1.9.2"},
  {gpt,"GProc tutorial","1"},
  {gproc,"GPROC","0.01"},
  {stdlib,"ERTS CXC 138 10","1.17.2"},
  {kernel,"ERTS CXC 138 10","2.14.2"}]


We are interested in gpt and gproc. As you can see, they are on this list.

Using gproc


So, we have figured out rebar and learned how to create a simple project and work with it. Now let's move on to gproc.
As you know, applications in Erlang, as a rule, consist of many processes that exchange messages.
For processes to know whom to send which message, there has to be a registry that maps some kind of coordinates to a process identifier. By default, Erlang/OTP provides registration of a process under a name that must be an atom. This is wasteful, since atoms are not garbage collected: once created, they live until the whole node shuts down, so registering processes under unique names would eventually exhaust memory. The approach is also inconvenient: you would have to invent rules for converting arbitrary terms into atoms, and a process can be registered under only one name. Registering processes under atom names with erlang:register/2 is appropriate only for a small number of long-lived processes whose names never change; it is the analogue of global variables in imperative programming languages.
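For comparison, here is the built-in mechanism at work; the name must be an atom, and a process can hold only one registered name:
 Pid = spawn(fun() -> receive stop -> ok end end),
 true = register(my_worker, Pid),   % only an atom is accepted as the name
 Pid = whereis(my_worker),          % resolve the name back to the pid
 my_worker ! stop.                  % the atom can be used as a send target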
To circumvent these limitations, the following scheme is often used (a minimal sketch of such a registrar follows the list):
  1. the registrar process is started, which creates the ets-table and is its owner;
  2. when starting processes that require registration, they send a message to the registrar containing the coordinates for registration (any erlang-term) and their identifier;
  3. the registrar writes this mapping to the ets-table and starts monitoring the process with erlang:monitor/2 ;
  4. the registered process, upon completion, either explicitly sends a message asking to be deregistered, or the registrar receives a 'DOWN' message when the process crashes; in either case its entry is deleted from the ets-table.
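A minimal sketch of such a hand-rolled registrar might look like this (module and function names are invented for illustration; this is not the gproc implementation):
 -module(simple_registry).
 -behaviour(gen_server).

 -export([start_link/0, register_name/2, whereis_name/1]).
 -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
          terminate/2, code_change/3]).

 start_link() ->
     gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

 %% Map an arbitrary term to a process identifier.
 register_name(Name, Pid) ->
     gen_server:call(?MODULE, {register, Name, Pid}).

 %% Lookups read the ets-table directly, without going through the server.
 whereis_name(Name) ->
     case ets:lookup(?MODULE, Name) of
         [{Name, Pid, _Ref}] -> Pid;
         []                  -> undefined
     end.

 init([]) ->
     %% The registrar owns the table, so entries outlive the clients.
     ets:new(?MODULE, [named_table, protected, set]),
     {ok, no_state}.

 handle_call({register, Name, Pid}, _From, State) ->
     Ref = erlang:monitor(process, Pid),            % step 3: start monitoring
     true = ets:insert(?MODULE, {Name, Pid, Ref}),
     {reply, ok, State}.

 handle_cast(_Msg, State) ->
     {noreply, State}.

 %% Step 4: a registered process died, remove its entries.
 handle_info({'DOWN', _Ref, process, Pid, _Reason}, State) ->
     ets:match_delete(?MODULE, {'_', Pid, '_'}),
     {noreply, State};
 handle_info(_Info, State) ->
     {noreply, State}.

 terminate(_Reason, _State) -> ok.
 code_change(_OldVsn, State, _Extra) -> {ok, State}.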

This scheme is used very often; almost every application has its own implementation, with its own quirks and bugs. Naturally, there is a desire to replace these hand-rolled registrars with something unified. The solution came from the developer Ulf Wiger and his gproc application (https://github.com/esl/gproc).
The application API is available at github.com/esl/gproc/blob/master/doc/gproc.md .

Local registration


Consider the simplest case - local (at the current node) registration of a process with an arbitrary term.
The source code for the examples can be found here: github.com/Zert/gproc-tutorial.git
The code for the processes that we will register via gproc is in the gpt_proc.erl file. The gpt_sup.erl file contains the supervisor for this group of processes. Calling gpt_sup:start_worker/1 launches a process and registers it under the name passed as the only argument; in this case, the name is a number.
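I will not reproduce the whole module here; the registration part of such a worker might look roughly like this (a sketch, the actual gpt_proc.erl in the repository may differ):
 %% Inside gpt_proc.erl (sketch): register under the numeric Id on startup.
 init([Id]) ->
     gproc:add_local_name(Id),
     ?DBG("Start process: ~p", [Id]),
     {ok, #state{id = Id}}.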
Start the node using the command above and launch several processes with different identifiers in it:
 (gptnode@localhost)1> [gpt_sup:start_worker(Id) || Id <- lists:seq(1,3)].
 (gpt_proc:29) Start process: 1
 (gpt_proc:29) Start process: 2
 (gpt_proc:29) Start process: 3
 [{ok,<0.61.0>},{ok,<0.62.0>},{ok,<0.63.0>}]


The gproc:add_local_name(Name) call registers the calling process under the name Name (it is simply a wrapper over gproc:reg({n,l,Name}) , where n stands for name and l for local). After that, gproc:lookup_local_name(Name) returns the process identifier.
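In other words, run inside the process being registered, the following calls are interchangeable, and lookups work on any term:
 true = gproc:add_local_name({worker, 42}),
 %% ...which is shorthand for:
 %% true = gproc:reg({n, l, {worker, 42}}),
 Pid = gproc:lookup_local_name({worker, 42}).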
Now we will tell one of the processes to wait for a process registered under the name 4 to appear. The code responsible for this is:
 handle_info({await, Id},
             #state{id = MyId} = State) ->
     gproc:await({n, l, Id}),
     ?DBG("MyId: ~p.~nNewId: ~p.", [MyId, Id]),
     {noreply, State};


Here, the gproc:await/1 function is called with an argument that has the following form: {n, l, Id} . For some reason, it does not have a wrapper, but oh well.
 (gptnode@localhost)2> gproc:lookup_local_name(1) ! {await, 4}.
 {await, 4}


Having started the process with identifier 4, we will first see a message from it, and then from the first waiting process:
 (gptnode@localhost)3> gpt_sup:start_worker(4).
 (gpt_proc:29) Start process: 4
 (gpt_proc:45) MyId: 1.
 NewId: 4.
 {ok,<0.66.0>}


The process stops upon receiving a stop message; the clause handling it is:
 handle_info(stop, State) ->
     {stop, normal, State};


and stop it:
 (gptnode@localhost)4> gproc:lookup_local_name(1) ! stop.
 stop


After this, the process is automatically deleted from the registrar database:
 (gptnode@localhost)5> gproc:lookup_local_name(1).
 undefined


Global registration


It is well known that Erlang is distributed by nature. This means transparent message passing between nodes: having a process identifier, you can send a message to it without knowing which node it lives on. Local registration with gproc maps an arbitrary term to a process identifier within one Erlang node only; another node cannot resolve that term to a pid.
For any node in the cluster to register its processes so that they are visible from other nodes, there is global registration. GProc provides the gproc:add_global_name/1 call for this. Consider an example.
First, we will build two nodes joined into a cluster, and rebar will help us here, since it can create configuration files from a predefined template. When building the cluster, the following details matter:
  1. each node must have a unique name;
  2. all nodes must share the same cookie;
  3. gproc must be configured for distributed operation;
  4. the kernel application must know which nodes to wait for at startup.

The first two items are set in the files/vm.args file:
 ## Name of the node
 -sname {{node}}

 ## Cookie for distributed erlang
 -setcookie gptnode


Here {{node}} is a placeholder that will be filled in when the release is created. The -setcookie virtual machine flag sets the cookie value for this node; all nodes in a cluster must have the same value.
The other two items are set in the files/app.config file. Placeholders are used here as well:
  %% gproc
  {gproc, {{gproc_params}}},

  %% Kernel
  {kernel, {{kernel_params}}},


To fill the placeholders, specify in the file reltool.config that the previous two files should be treated as templates:
   {template, "files / app.config", "etc / app.config"},
   {template, "files / vm.args", "etc / vm.args"}


Create two configuration files, one for each node: vars/dev1_vars.config and vars/dev2_vars.config . The dev1_vars.config file will contain the following placeholder values:
 %% etc/app.config
 {gproc_params,
  "[
    {gproc_dist, {['gpt1@localhost'],
                  [{workers, ['gpt2@localhost']}]}}
   ]"}.

 {kernel_params,
  "[
    {sync_nodes_mandatory, ['gpt2@localhost']},
    {sync_nodes_timeout, 15000}
   ]"}.

 %% etc/vm.args
 {node, "gpt1@localhost"}.


In the dev2_vars.config file, the values of the sync_nodes_mandatory and node parameters are swapped (gpt1 and gpt2 change places). Let's look at these parameters in more detail.
The gproc_dist parameter belongs to the gproc application; it is a tuple of two lists. The first list names the nodes that can become the leader (master); the second contains key-value tuples, of which we only need one key for now - workers - which defines the list of nodes that are ordinary cluster members (slaves).
For the kernel application we set two parameters. The first, sync_nodes_mandatory, is a list of nodes that must be present in the cluster. The second, sync_nodes_timeout, is the time in milliseconds that each node will wait for the nodes from that list to appear; if they do not appear within this time, the node stops. Let's set it to 15 seconds so we have time to start both nodes by hand.
The node value is written into the virtual machine's startup parameters; it is the node's name.
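After substitution, the generated etc/app.config for the first node should look roughly like this (a sketch based on the values above; other sections of the generated file are omitted):
 [
  {gproc, [
    {gproc_dist, {['gpt1@localhost'],
                  [{workers, ['gpt2@localhost']}]}}
  ]},
  {kernel, [
    {sync_nodes_mandatory, ['gpt2@localhost']},
    {sync_nodes_timeout, 15000}
  ]}
 ].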
Now create two releases using the following rule from the Makefile:
 dev1 dev2:
	mkdir -p dev
	(cd rel && rebar generate target_dir=../dev/$@ overlay_vars=vars/$@_vars.config)


Go to the dev/dev1 directory, open a second terminal window (or create a new window in screen) and go to dev/dev2 , then start both nodes with ./bin/gptnode console . Let's see the list of connected nodes in the first Erlang shell:
 (gpt1@localhost)1> nodes().
 [gpt2@localhost]

We see that the second node started up normally and joined the cluster. To keep things simple, let's globally register the current shell process under some term:
 (gpt1@localhost)2> gproc:add_global_name({shell, 1}).
 true


In another window, let's try to request a process identifier for this term:
 (gpt2@localhost)2> gproc:lookup_global_name({shell, 1}).
 <3358.70.0>


As you can see, it succeeds. By sending a message to this process, we can receive it at the first node:
 (gpt2@localhost)3> gproc:lookup_global_name({shell, 1}) ! {the, message}.
 {the, message}


We read it on the first node with the flush() command:
 (gpt1@localhost)3> flush().
 Shell got {the, message}
 ok


Conclusion


That's all. I started writing this article for myself, since the documentation on rebar is very scarce and keeps being forgotten after each use. Along the way I started using gproc, and to avoid getting up twice, I put everything into one article.

Source: https://habr.com/ru/post/112681/

