This article looks at integrating Erlang with Rust, using as an example an implementation of the Bloom filter, the probabilistic data structure proposed by Burton Bloom that checks whether an element belongs to a set with a configurable false-positive rate.
Benchmarks built on computational tasks make it clear which league Erlang plays in: raw computation is not its strong suit.
In real systems, local computational bottlenecks appear that slow the system down and degrade the overall UX. It often happens that 1% of the code is slow and drags down the remaining 99% of the system. To address this, Erlang has offered Native Implemented Functions (NIFs) since version R13B03.
In the list of myths about Erlang performance, section 2.7, the developers warn that NIFs should be a last resort: a defect in your NIF can crash the whole VM, and a NIF does not always guarantee a speedup.
The official NIF examples are written in C. C and C++ code is easy to make unsafe, for example by reading or writing past the bounds of a structure or array, or by forgetting to release allocated resources. In my opinion, context switching makes the problem worse: when a programmer who mostly writes Erlang switches to low-level C, the likelihood of such mistakes increases, especially under tight deadlines.
So I would like a solution that is as fast as C/C++, but safe and easy to maintain. Let's look at the fastest languages for computational work.
Among the high-performance languages, Rust seems the most appropriate. It offers good performance, a safe development model, and an active community. Additional advantages of Rust are immutability by default and a transparent multithreading model.
It should be noted that there is another optimization option. If the latency and overhead of an extra call across Erlang distribution (via EPMD) can be neglected, we can write a separate Erlang node instead of a NIF. From personal experience, Java, Go, Rust, and OCaml are all suitable for this. Such a node can run on the same machine or on the other side of the world.
A quick search turns up several libraries for writing NIFs in Rust.
Since development speed is one of the requirements, Rustler looks the most attractive. However, I don't want to add an extra dependency on Elixir and the mix build tool to the project.
Asking "why drag Elixir into an Erlang project?" and following the KISS principle, I decided to use Rustler, but without the additional dependencies. The build system is rebar3. The easiest and quickest approach is to define pre_hooks that compile our Rust code.
To do this, add the hook to the test profile:
{pre_hooks, [ {"(linux|darwin|solaris|freebsd)", compile, "sh -c \"cd crates/bloom && cargo build && cp target/debug/libbloom.so ../../priv/\""} ]}
For production, build with the --release option, so add the following to the production profile:
{pre_hooks, [ {"(linux|darwin|solaris|freebsd)", compile, "sh -c \"cd crates/bloom && cargo build --release && cp target/release/libbloom.so ../../priv/\""} ]}
After these steps we get the dynamic library priv/libbloom.so, fully ready to be loaded into the Erlang VM.
Details and an example of using Rustler in an Erlang project can be found in the project repository https://github.com/Vonmo/erbloom
Since the Rust ecosystem provides ready-made Bloom filter implementations, we pick a suitable one and add it to Cargo.toml. This project uses bloomfilter = "0.0.12"
The extension implements the following functions:
new(bitmap_size, items_count) - initializes the filter. bitmap_size and items_count are calculated values; there are plenty of ready-made calculators.
serialize() - packs the filter, for example to save it to disk or send it over the network.
deserialize() - restores the filter from a saved state.
set(key) - adds an element to the set.
check(key) - checks whether the element belongs to the set.
clear() - clears the filter.

It should be noted that loading an extension into Erlang is completely transparent. When your module is loaded, the function declared in on_load is called, and it must load the NIF through erlang:load_nif/2. From then on, calls are handled transparently by the Rust code.
It is good practice to have the Erlang stub functions call erlang:nif_error/1, so that an error is raised if the NIF has not been loaded.
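The operations listed above can be sketched in plain Rust with no dependencies. This is only a minimal illustration under stated assumptions, not the code of the bloomfilter crate or of erbloom: the Bloom struct, the bitmap_size_bytes helper, the double-hashing scheme built on DefaultHasher, the choice to measure bitmap_size in bytes, and the serialized layout (4 bytes of hash count followed by the bitmap) are all hypothetical and chosen for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical sizing helper: the standard formula m = -n * ln(p) / (ln 2)^2
/// gives the number of bits; the result is rounded up to whole bytes.
fn bitmap_size_bytes(items_count: usize, fp_rate: f64) -> usize {
    let ln2 = std::f64::consts::LN_2;
    let bits = (-(items_count as f64) * fp_rate.ln() / (ln2 * ln2)).ceil();
    ((bits / 8.0).ceil() as usize).max(1)
}

/// Illustrative filter; assumes bitmap_size is given in bytes.
struct Bloom {
    bits: Vec<u8>,
    hashes: u32, // number of hash functions (k)
}

/// Hash a key together with a seed. DefaultHasher::new() uses fixed keys,
/// but its algorithm is not guaranteed stable across Rust releases.
fn hash_with_seed(key: &[u8], seed: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    key.hash(&mut hasher);
    hasher.finish()
}

impl Bloom {
    /// Filter initialization; k = (m / n) * ln 2, at least 1.
    fn new(bitmap_size: usize, items_count: usize) -> Self {
        let bitmap_size = bitmap_size.max(1);
        let m = (bitmap_size * 8) as f64;
        let n = items_count.max(1) as f64;
        let k = (m / n * std::f64::consts::LN_2).round().max(1.0) as u32;
        Bloom { bits: vec![0; bitmap_size], hashes: k }
    }

    /// Bit positions for a key, derived by double hashing: h1 + i * h2.
    fn positions(&self, key: &[u8]) -> Vec<usize> {
        let h1 = hash_with_seed(key, 0);
        let h2 = hash_with_seed(key, 1) | 1;
        let m = (self.bits.len() * 8) as u64;
        (0..self.hashes as u64)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % m) as usize)
            .collect()
    }

    /// Adds an element to the set.
    fn set(&mut self, key: &[u8]) {
        for p in self.positions(key) {
            self.bits[p / 8] |= 1u8 << (p % 8);
        }
    }

    /// Returns true if the element may be in the set, false if it is definitely not.
    fn check(&self, key: &[u8]) -> bool {
        self.positions(key)
            .iter()
            .all(|&p| (self.bits[p / 8] & (1u8 << (p % 8))) != 0)
    }

    /// Clears the filter.
    fn clear(&mut self) {
        for byte in self.bits.iter_mut() {
            *byte = 0;
        }
    }

    /// Packs the filter: 4 bytes of k (little-endian), then the bitmap.
    fn serialize(&self) -> Vec<u8> {
        let mut out = self.hashes.to_le_bytes().to_vec();
        out.extend_from_slice(&self.bits);
        out
    }

    /// Restores a filter from the layout produced by serialize().
    fn deserialize(data: &[u8]) -> Option<Self> {
        if data.len() <= 4 {
            return None;
        }
        let hashes = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
        if hashes == 0 {
            return None;
        }
        Some(Bloom { bits: data[4..].to_vec(), hashes })
    }
}
```

Under these assumptions, bitmap_size_bytes(1_000_000, 0.01) sizes a filter for one million items at a 1% false-positive rate at roughly 1.2 MB. A key passed to set is always reported by check afterwards, including after a serialize/deserialize round trip; only false positives are possible, never false negatives.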
A detailed description of the environment for building the project can be found in this article.
As a result of this work, we got a fast and safe extension. In our projects, it reduces the number of calls to the data store by up to a factor of 10 in some cases and lets us serve more than 500k RPS per machine.
The source code for the extension is available on GitHub.
Source: https://habr.com/ru/post/349398/