
Lately the web has been full of mentions of the "young and promising" language Rust. It piqued my curiosity and made me want to build something more or less useful with it, just to try it on and see whether it fits me. The result was what seems to me a rather curious experience of crossing a grass snake with a hedgehog, with a cuckoo lending a hand.
So, here is what I did. There is a project in node.js. Part of its functionality requires computing a hash, and quite often at that: for almost every incoming request. Since this hash is not meant to protect me from collisions and is not needed for security at all, just for convenience, the adler32 algorithm is used: it produces a short output value.
By some absurd oversight, it is not available in node.js. Let me explain why this is absurd. The algorithm is commonly used in compression; in particular, gzip uses it. node.js has a standard gzip implementation in the zlib module. So adler32 is in fact in there, just not exposed. In Python, for comparison, the analogous module does expose it and it can be used directly, as the short example below shows.
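A tiny illustration of my own (not from the original project), just to show that the checksum really is one call away in Python's standard library:

import zlib

# adler32 ships with the standard zlib module; no third-party package needed
print("%x" % zlib.adler32(b"hello"))  # prints the checksum in hex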
Anyway. Let's take a third-party package from npm. I picked this one: adler32, mainly because it can hook into the crypto module and be used just like the other hash algorithms, which is convenient. I did not think much about performance at that point; whatever it costs is pennies. But since an experiment was planned, this very adler32 was chosen as the victim.
Alright, let's get started. Installing Rust is easy. The documentation is quite intelligible in both Russian and English. I used Rust version 1.15. Fun fact: the Russian documentation is not a direct translation of the English one and differs slightly in structure; in particular, it adds an example of working with threads.
Besides Rust itself, you will need node.js version 6.8.0, Visual Studio 2015 and Python 2.7.
Now let's make a preliminary measurement.
Node.js
var crypto = require('crypto');
// the 'adler32' algorithm comes from the third-party adler32 package mentioned above,
// which plugs into the crypto module

for (var i = 0; i < 5000000; i++) {
    var m = crypto.createHash('adler32');
    m.update("- , ");
    m.digest('hex');
}
The average of three runs: 41.601 seconds. Best result: 40.206 seconds.
To have something to compare against, let's first take a hash that node.js implements natively, say sha1. Running the exact same code but with sha1 as the algorithm, I got these numbers:
The average of three runs: 9.737 seconds. Best result: 9.321 seconds.
Well, maybe that is just how adler32 is? But wait. Let's still try to do something in Rust.
Rust
So, Rust has a third-party library, compress, available through Cargo. It also knows how to gzip and exposes an adler32 implementation. It looks like this:
for i in 0..5_000_000 {
    let mut state = adler::State32::new();
    state.feed("- , ".as_bytes());
}
The average of three runs: 2.314 seconds. Best result: 2.309 seconds.
Not bad!
Node.js and FFI
Since Rust compiles to C-compatible code, it can be built as a dynamic library and hooked up via FFI. node.js has a special package for this, which needs to be installed separately:
npm install ffi
If everything goes well, you will then be able to load external libraries written in C or anything C-compatible.
So, now this little Rust hash counter needs to be turned into a library. In short, the code looks like this:
extern crate compress;
extern crate libc;

use libc::c_char;
use std::ffi::CStr;
use std::ffi::CString;
use compress::checksum::adler;

#[no_mangle]
pub extern "C" fn adler(url: *const c_char) -> *mut c_char {
    let c_str = unsafe { CStr::from_ptr(url).to_bytes() };
    let mut state = adler::State32::new();
    state.feed(c_str);
    let s: String = format!("{:x}", state.result());
    let s = CString::new(s).unwrap();
    s.into_raw()
}
As you can see, things have become a bit more complicated. The function receives a C string as input, converts it to bytes, computes the hash, formats it as hex, wraps it back into a C string, and only then returns it.
In addition, in Cargo.toml you need to specify that the crate should be built as a dynamic library, and list the dependencies:
[package]
name = "adler"
version = "0.1.0"
authors = ["juralis"]

[lib]
name = "adler"
crate-type = ["dylib"]

[dependencies]
compress = "*"
libc = "*"
There we go. Now it will compile into a library; exactly what kind depends on the target platform. I got a dll, since I was doing all this on Windows and specified the corresponding target:
cargo build --release --target x86_64-pc-windows-msvc
Good. We take this dll, put it somewhere near the node.js project and tweak the code:
var ffi = require('ffi');

var lib = ffi.Library('adler.dll', {
    adler: ['string', ['string']]
});

for (var i = 0; i < 5000000; i++) {
    lib.adler("- , ");
}
The average of three runs: 27.882 seconds. Best result: 26.642 seconds.
Hmm... that is not quite what I was hoping for. Apparently all this fun with external calls is rather expensive. Still, it does run faster. But can it be made faster yet? It can.
Node.js and a C++ addon
node.js, as is well known, supports so-called addons. Why not give them a try? The only problem is that I don't know C++ at all. However, there are good people who wrote a small guide on how it all works. As it turned out, I am not the first to decide to amuse myself this way. Still, the guide only contains a rather trivial example of computing Fibonacci numbers, so much remained unclear, and since I don't know C++, that was of course a problem.
But it turns out that humanity has gone much further in inventing all sorts of perversions, and some kind person wrote a small generator of C++ wrappers for Rust libraries. It analyzes the Rust source code, takes the functions that match its criteria and generates the corresponding glue code in C++. For the Rust code given above it produced such a C++ wrapper (my final, hand-tweaked version is shown a bit further down).
In addition, I borrowed an example bindings.gyp file from the comrade mentioned above:
{ "targets": [{ "target_name": "adler", "sources": ["adler.cc" ], "libraries": [ "/path/to/lib/adler.dll" ] }] }
You also need an index.js file with the following content:
module.exports = require('./build/Release/addon');
Now all this joy needs to be built with node-gyp. But it refused to compile for me, so I had to dig a little into what was going on.
First, install the nan package (Native Abstractions for Node.js):
npm install nan -g
And add the path to it in bindings.gyp (at the same level as libraries):
"include_dirs" : [ "<!(node -e \"require('nan')\")" ]
That is where the compiler will look for the header file from that very nan. After that the C++ file itself needed a bit more poking. Here is the final version, which finally deigned to compile:
#include <nan.h>
#include <string>
#include <node.h>
#include <cstring>
#include <cstdlib>

#pragma comment(lib, "Ws2_32.lib")
#pragma comment(lib, "userenv.lib")

using std::string;
using v8::String;
using v8::FunctionTemplate;
using Nan::New;

extern "C" {
    char * adler(char * url);
}

NAN_METHOD(adler) {
    Nan::HandleScope scope;
    String::Utf8Value cmd_url(info[0]);
    string s_url = string(*cmd_url);
    char *url = (char*) malloc(s_url.length() + 1);
    strcpy(url, s_url.c_str());
    char * result = adler(url);
    info.GetReturnValue().Set(Nan::New<String>(result).ToLocalChecked());
    free(result);
    free(url);
}

NAN_MODULE_INIT(InitAll) {
    Nan::Set(
        target,
        New("adler").ToLocalChecked(),
        Nan::GetFunction(New<FunctionTemplate>(adler)).ToLocalChecked()
    );
}

NODE_MODULE(addon, InitAll)
However, before that worked, one more thing came to light: my library was compiled as dynamic, while node-gyp wanted a static one. So in Cargo.toml this line:
crate-type = ["dylib"]
needs to be changed to this one:
crate-type = ["staticlib"]
Then compile again:
cargo build --release --target x86_64-pc-windows-msvc
Also, remember to change the library path in bindings.gyp to the .lib version:
"libraries": [ "/path/to/lib/adler.lib" ]
After that everything should come together and yield the coveted adler.node file.
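(I have not shown the build command itself above; assuming node-gyp is installed, the usual invocation is something along these lines:)
node-gyp configure
node-gyp build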
In node, we once again change the hash-generating code:
var adler = require('/path/to/adler.node');

for (var i = 0; i < 5000000; i++) {
    adler.adler("- , ");
}
The average of three runs: 7.802 seconds. Best result: 7.658 seconds.
Oh, that is already a couple of seconds faster than even the native sha1! Looks very nice!
Really, what is the big deal about computing a hash 5 million times and spending 40 seconds on it? Well, 5,000,000 hashes in roughly 42 seconds is about 120,000 hashes per second. It is as if around a hundred thousand requests per second were hitting you and the application spent all of its time computing hashes, with no time left for anything else. With this speedup there is plenty of headroom to do something besides hashes. I do not think this project will ever see a load of a hundred thousand requests per second, but I still find the experience quite useful.
By the way, what about Python?
Python was mentioned at the beginning of the article, so why not try it too while it is at hand? There, as I said, adler32 can be computed right out of the box. The code would be something like this:
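(A minimal sketch of that benchmark, assuming the standard zlib module and the same test string as in the other runs:)

import zlib

# 5 million adler32 checksums over the same test string
for i in range(5000000):
    zlib.adler32(b'- , ')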
The average of three runs: 2.100 seconds. Best result: 2.072 seconds.
No, that is not a typo and the decimal point is exactly where it should be. Apparently the point is that since this is part of the standard library and in essence just a wrapper over the C zlib, it gets a speed advantage. In other words, this is not a comparison of Python against Rust, but of C against Rust. And C comes out a little faster.
UPD: Python can also use FFI, so here is a small addition on that subject, at the request of ynlvko.
The library had to be recompiled for win32, since my Python is 32-bit:
cargo build --release --target i686-pc-windows-msvc
Code:
from ctypes import cdll

lib = cdll.LoadLibrary("adler32.dll")

for i in range(5000000):
    lib.adler(b'- , ')
The average of three runs: 6.398 seconds. Best result: 6.393 seconds.
So it turns out that Python's FFI works several times more efficiently than node-ffi, and even beats the "native" addon.
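A small side note of mine: in the benchmark above the return value is simply ignored, which is fine for timing. If you actually wanted to read the returned string, you would normally declare the argument and return types for ctypes, roughly like this (the buffer allocated on the Rust side is still not freed here):

from ctypes import cdll, c_char_p

lib = cdll.LoadLibrary("adler32.dll")
# tell ctypes the function takes a C string and returns one,
# so the returned char* is marshalled into Python bytes
lib.adler.argtypes = [c_char_p]
lib.adler.restype = c_char_p
print(lib.adler(b'- , '))  # hex digest as bytes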
Findings
Technology | Average time, s | Best time, s
---|---|---
Node.js | 41.601 | 40.206
Node.js + ffi + Rust | 27.882 | 26.642
Node.js (sha1) | 9.737 | 9.321
Node.js + C++ addon + Rust | 7.802 | 7.658
Python + ffi + Rust | 6.398 | 6.393
Rust | 2.314 | 2.309
C / Python (zlib) | 2.100 | 2.072