
Safely speeding up an Erlang application with NIFs written in Rust

This article looks at integrating Erlang with Rust, using an implementation of Burton Bloom's probabilistic data structure (the Bloom filter) as the example. The structure tests whether an element belongs to a set with the required accuracy.


Language selection


Benchmarks built on computational problems make it clear which league Erlang plays in.




Since Erlang does not have ultrafast arithmetic, using it to solve heavy computational problems seems strange. It is, however, well suited to the problems that arise when developing and operating queueing systems. With its excellent scheduler and garbage collector, fast network I/O, and efficient handling of binary data, Erlang copes well with a highly concurrent distributed environment. For myself, I have therefore assigned Erlang the role of system glue in the architecture of distributed server applications.

In real systems, local computational problems arise that slow the system down and degrade the overall UX. It often happens that 1% of the code is slow and drags down the remaining 99% of the system. To solve this problem, Erlang has provided the Native Implemented Functions (NIFs) mechanism since version R13B03.


In section 2.7 of the list of myths about Erlang, the developers warn that NIFs should be a last resort: a defect in your NIF can crash the whole VM, and a NIF does not always guarantee a speedup.


The official NIF examples are written in C. It is fairly easy to write unsafe code in C and C++, for example by reading or writing past the bounds of a structure or array, or by forgetting to release allocated resources. In my opinion, context switching makes the problem worse: when a programmer who mainly writes Erlang switches to low-level C, the likelihood of such mistakes increases, especially under tight deadlines.


So I would like a solution that is as fast as C/C++, but safe and easy to maintain. Let's look at the fastest languages in terms of computation.




The requirements for the language are:
  1. Safety. The solution must not break the Erlang VM under any circumstances.
  2. Performance. It should be comparable to C++.
  3. The ability to be used as a NIF.
  4. Development speed. A good standard library and a large collection of third-party libraries are needed, so that the language ecosystem is convenient to work in.

Of the high-performance languages, Rust seems the most appropriate. It offers good performance, a safe development model, and an active community. Additional advantages of Rust are data immutability and a transparent multithreading model.


It should be noted that there is another optimization option. If the time and overhead of an additional call through EPMD are negligible for us, we can write a separate Erlang node instead of a NIF. Java, Go, Rust, and OCaml are all suitable for this (from personal experience). Such a node can run on the same machine or on the other side of the world.


Implementation


Review of existing Rust solutions


A quick search turns up several libraries for writing NIFs in Rust. Let's consider them:


  1. rustler. Perhaps the most popular and feature-rich library, but its authors have concentrated on supporting Elixir. In https://github.com/hansihe/rustler/issues/127 they suggest pulling mix into an Erlang project. There is no documentation for use from Erlang.
  2. erlang-rust-nif. A low-level NIF implementation together with an approach to building the extension. The code looks like a straightforward translation from C. The build has constraints and is not universal.
  3. erlang_nif-sys. Low-level, full-featured bindings that serve as the basis for Rustler. Writing a NIF with them takes time and effort.
  4. bitwise_rust. Demonstrates working with the scheduler. It is also a wrapper over the C API without syntactic sugar.

Since development speed is one of the requirements, Rustler looks the most attractive. However, I don't want to add Elixir and the mix build tool to the project as an extra dependency.


Rustler


In answer to the question "why drag Elixir into an Erlang project?", and following the KISS principle, I decided to use rustler without any additional dependencies. rebar3 is used as the build system. The easiest and quickest step is to define pre_hooks that compile our Rust code.


To do this, add the following hook to the test profile:


{pre_hooks, [ {"(linux|darwin|solaris|freebsd)", compile, "sh -c \"cd crates/bloom && cargo build && cp target/debug/libbloom.so ../../priv/\""} ]} 

In production the build needs the --release option, so add the following to the production profile:


 {pre_hooks, [ {"(linux|darwin|solaris|freebsd)", compile, "sh -c \"cd crates/bloom && cargo build --release && cp target/release/libbloom.so ../../priv/\""} ]} 

After these steps, the dynamic library priv/libbloom.so is ready to be loaded into the Erlang VM.
Details and an example of using rustler in an Erlang project can be found in the project repository: https://github.com/Vonmo/erbloom


Bloom filter


Since the Rust ecosystem provides ready-made Bloom filter implementations, we pick a suitable one and add it to Cargo.toml. This project uses bloomfilter = "0.0.12".


The extension implements the following functions:


  1. new(bitmap_size, items_count) - initializes the filter. bitmap_size and items_count are calculated values; many ready-made calculators are available for them.
  2. serialize() - packs the filter, for example to save it to disk or send it over the network.
  3. deserialize() - restores the filter from a saved state.
  4. set(key) - adds an element to the set.
  5. check(key) - checks whether an element belongs to the set.
  6. clear() - clears the filter.
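
To make these operations concrete, here is a minimal sketch of a Bloom filter in plain Rust, using only the standard library. It is neither the code of the bloomfilter crate nor the erbloom extension itself: the Bloom struct, the helpers optimal_bitmap_bits and optimal_hash_count, the double-hashing scheme, and the choice to measure the bitmap in bits are all assumptions made for illustration. The sizing helpers use the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, which is what the ready-made calculators compute. serialize and deserialize are left out to keep the sketch short.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: bitmap size in bits for `items_count` expected
/// items and a target false-positive rate `fp_rate` (0 < fp_rate < 1).
fn optimal_bitmap_bits(items_count: usize, fp_rate: f64) -> usize {
    let ln2 = std::f64::consts::LN_2;
    (-(items_count as f64) * fp_rate.ln() / (ln2 * ln2)).ceil() as usize
}

/// Hypothetical helper: number of hash functions, k = (m / n) * ln 2.
fn optimal_hash_count(bitmap_bits: usize, items_count: usize) -> u32 {
    let k = (bitmap_bits as f64 / items_count as f64) * std::f64::consts::LN_2;
    (k.round() as u32).max(1)
}

/// Illustrative filter; field and method names are assumptions.
struct Bloom {
    bits: Vec<u8>,
    bitmap_bits: usize,
    hash_count: u32,
}

impl Bloom {
    fn new(bitmap_bits: usize, items_count: usize) -> Self {
        assert!(bitmap_bits > 0 && items_count > 0);
        Bloom {
            bits: vec![0u8; (bitmap_bits + 7) / 8],
            bitmap_bits,
            hash_count: optimal_hash_count(bitmap_bits, items_count),
        }
    }

    /// Double hashing: index_i = (h1 + i * h2) mod m.
    fn indexes<T: Hash>(&self, key: &T) -> Vec<usize> {
        let mut first = DefaultHasher::new();
        key.hash(&mut first);
        let h1 = first.finish();

        // A second, salted hasher gives an independent-looking value.
        let mut second = DefaultHasher::new();
        second.write_u64(0x9e37_79b9_7f4a_7c15);
        key.hash(&mut second);
        let h2 = second.finish() | 1; // never zero, so the indexes differ

        let m = self.bitmap_bits as u64;
        (0..u64::from(self.hash_count))
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % m) as usize)
            .collect()
    }

    /// Adds an element by setting all of its bits.
    fn set<T: Hash>(&mut self, key: &T) {
        for i in self.indexes(key) {
            self.bits[i / 8] |= 1u8 << (i % 8);
        }
    }

    /// Returns false if the element is definitely absent,
    /// true if it is probably present.
    fn check<T: Hash>(&self, key: &T) -> bool {
        self.indexes(key)
            .into_iter()
            .all(|i| (self.bits[i / 8] & (1u8 << (i % 8))) != 0)
    }

    /// Resets every bit to zero.
    fn clear(&mut self) {
        self.bits.iter_mut().for_each(|b| *b = 0);
    }
}

fn main() {
    let items = 1_000;
    let bits = optimal_bitmap_bits(items, 0.01);
    let mut filter = Bloom::new(bits, items);

    filter.set(&"erlang");
    println!("bitmap bits: {}", bits);
    println!("contains erlang: {}", filter.check(&"erlang"));
    filter.clear();
    println!("after clear: {}", filter.check(&"erlang"));
}
```

A filter like this never reports a stored element as absent; it can only report an absent element as present, at a rate controlled by the bitmap size. That property is what makes it useful as a cheap check in front of a slower data store.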

Erlang


It should be noted that loading an extension into Erlang is a completely transparent process. After your module is loaded, on_load is called, and in it you need to load the NIF through erlang:load_nif/2. From then on, calls are handled transparently in Rust.


It is good practice to raise erlang:nif_error/1 in case the NIF is not loaded.


A detailed description of the environment for building the project can be found in this article.


Results


As a result of this work, we got a fast and safe extension. In our projects it reduces the number of calls to the data store by up to 10 times in some cases and lets us serve a flow of more than 500k RPS per machine.


The source code for the extension is available on GitHub.



Source: https://habr.com/ru/post/349398/

