Benchmarking with Basho Bench

The Basho team, in addition to their Riak NoSQL database and the rebar build tool, has made another useful thing: Basho Bench, a benchmarking utility.

Basho Bench was originally positioned as a tool for testing the performance of key-value stores, but as it developed it turned out that other applications could be tested with it as well.

Quick start.

Basho's main repository uses the Mercurial version control system, but since I am used to Git, I use their GitHub repository:

git clone git://github.com/basho/basho_bench.git
cd basho_bench
make all rel

To run a performance test, it is enough to write a test configuration file, run the basho_bench Erlang binary on it, wait for the tests to finish, and then run make results to analyze the resulting graphs:

./basho_bench my_config.conf && make results

The configuration file names a driver written in Erlang (a small add-on layer over the required client library) and sets the number of threads that perform the operations, the number of operations per second, and the data generators used for the test.

Here is an example of a typical configuration file:

{mode, max}.
{duration, 15}.
{concurrent, 10}.
{operations, [{get, 1000}, {put, 10}, {delete, 1}]}.

{driver, basho_bench_driver_riakclient}.
{code_paths, ["deps/stats",
              "/home/ubuntu/riak/apps/riak_kv",
              "/home/ubuntu/riak/apps/riak_core"]}.

{key_generator, {sequential_int_bin, 35000000}}.
{value_generator, {exponential_bin, 256, 10240}}.

{riakclient_nodes, ['riak@10.242.78.144']}.
{riakclient_mynode, ['basho_bench@10.242.78.144', longnames]}.
{riakclient_replies, 1}.
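
The configuration above uses {mode, max}, so every worker issues operations as fast as it can. To cap the request rate instead, the mode can be given as a rate. A minimal fragment, assuming the {rate, N} form described in the Basho Bench documentation (N operations per second per worker); the numbers are illustrative only:

```erlang
%% Throttled variant of the general settings (values are illustrative).
{mode, {rate, 50}}.      % roughly 50 operations per second per worker
{duration, 15}.          % test length, in minutes
{concurrent, 10}.        % number of worker processes
%% Relative weights: about 1000 gets for every 10 puts and 1 delete.
{operations, [{get, 1000}, {put, 10}, {delete, 1}]}.
```

The driver, generator and riakclient_* settings stay as they are.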

The list of bundled drivers is small, though: three for Riak's storage backends (these test storage performance on the local machine, without distribution, consistency, and so on; only low-level operations) and three for the protocols of the Riak database itself (where all of the above does come into play). Searching GitHub turns up a few more ready-made drivers. Finally, if the driver you need is not there, it is very easy to write your own on top of, say, a ready-made client for the database under study.
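
To give an idea of how little code a driver needs, here is a minimal sketch. It assumes the usual driver callbacks, new/1 and run/4; the module name is hypothetical, and an in-memory map stands in for a real database client so that the example is self-contained:

```erlang
%% Hypothetical driver module: a sketch, not part of Basho Bench.
-module(basho_bench_driver_example).
-export([new/1, run/4]).

%% Called once per worker; the argument is the worker id.
%% A real driver would open a connection to the database here.
new(_Id) ->
    {ok, #{}}.

%% Called for every operation. KeyGen and ValueGen are zero-arity funs
%% that return the next key and value.
run(get, KeyGen, _ValueGen, Store) ->
    _ = maps:get(KeyGen(), Store, undefined),   % a missing key is not an error here
    {ok, Store};
run(put, KeyGen, ValueGen, Store) ->
    {ok, maps:put(KeyGen(), ValueGen(), Store)};
run(delete, KeyGen, _ValueGen, Store) ->
    {ok, maps:remove(KeyGen(), Store)};
run(Other, _KeyGen, _ValueGen, Store) ->
    {error, {unknown_operation, Other}, Store}.
```

To try such a module, the config would name it in the driver setting and list the directory containing the compiled .beam file in code_paths, the same way the Riak example does.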

Some personal experience.

Basho Bench is a great help when hacking on and optimizing any product. Say you see some ugly, slow, or wrong construct that you could rewrite in three ways. Which one is better and more efficient is hard to tell by reasoning alone. You can test them all by running each variant through Basho Bench for 5-10 minutes. The results provide basic statistics: the average time per operation over the course of the test, plus data for each operation type separately: RMS, arithmetic mean, and percentiles (all of them plotted against time).

For example, when testing Riak on Amazon EC2 instances, Basho Bench saved me a lot of time: I was able to try many different combinations of settings for the cluster, Riak, and the instances themselves. The test data was as close as possible to production conditions, and the maximum load was generated.
There were also several tests of MongoDB with a self-written driver based on the emongo library, which took about half an hour to write. In the near future, I plan to contribute this driver to Basho, along with a driver for MySQL based on a self-written client.

Of course, Basho Bench is not limited to databases. You can test practically anything: web servers, parsers, evaluators, and so on. The small initial feature set is offset by a convenient architecture in which any component is easy to extend. The existing key or value generators do not fit the task? You can write a new one in a few minutes. Want to pass additional parameters along with the operation type? A couple of hours of work, and you can put the necessary MySQL queries directly into the config without touching the Erlang code.
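
As an illustration of how small a generator can be: drivers call generators as zero-arity funs, so a new one is essentially a function that returns such a fun. The sketch below uses hypothetical module and function names; how a generator gets registered in the config depends on the Basho Bench version and is not shown:

```erlang
%% Hypothetical generator module: a sketch, not part of Basho Bench.
-module(example_keygen).
-export([prefixed_int/2]).

%% Returns a zero-arity fun. Each call yields a key made of the prefix
%% followed by a number drawn uniformly from 1..Max.
prefixed_int(Prefix, Max) when is_binary(Prefix), is_integer(Max), Max > 0 ->
    fun() ->
        N = rand:uniform(Max),
        <<Prefix/binary, (integer_to_binary(N))/binary>>
    end.
```

For example, example_keygen:prefixed_int(<<"user:">>, 100) returns a fun that produces keys with the user: prefix.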

A fly in the ointment.

Naturally, Basho Bench has its flaws too. The main one is R, which is required for generating the graphs. For all its potential power, it is complex and often takes a lot of fiddling to install and configure.

And you will want to customize it. For example, if you test not 2-3 operations but a dozen, you want the graphs laid out sensibly in the image. Not to mention the default resolution of 1024x768. In addition, R sometimes scales the curves poorly, leaving a lot of empty space on the chart.

However, you can use your own graph generator: all test data is stored in plain CSV files. Or, if you know R, you can rework the graph-generation scripts by hand.
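
Since the results are plain CSV, post-processing them yourself takes little code. The sketch below computes the mean throughput from the summary file. The module name is hypothetical, and the column order (elapsed, window, total, successful, failed) is an assumption, so check the header of your own files first:

```erlang
%% Hypothetical helper module: a sketch, not part of Basho Bench.
-module(bench_summary).
-export([parse/1, throughput/1]).

%% Parse CSV text into rows of floats, dropping the header line.
parse(Bin) when is_binary(Bin) ->
    [_Header | Lines] = binary:split(Bin, <<"\n">>, [global, trim_all]),
    [[to_float(F) || F <- binary:split(L, <<",">>, [global])] || L <- Lines].

%% Mean number of successful operations per second over the whole run.
%% ASSUMED column order: elapsed, window, total, successful, failed.
%% Rows with a different number of fields are skipped.
throughput(Rows) ->
    Ops  = lists:sum([Succ || [_, _, _, Succ, _] <- Rows]),
    Secs = lists:sum([Window || [_, Window, _, _, _] <- Rows]),
    Ops / Secs.

%% Accept both "10" and "10.5" style fields, ignoring surrounding spaces.
to_float(Field) ->
    Str = string:trim(binary_to_list(Field)),
    case string:to_float(Str) of
        {F, _} when is_float(F) -> F;
        {error, no_float}       -> float(list_to_integer(Str))
    end.
```

Usage would look like reading the summary file with file:read_file/1 and passing the binary through bench_summary:parse/1 and bench_summary:throughput/1. The per-operation latency files can be handled the same way once their columns are known.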

Another drawback, though a debatable one, is that for active use it is very desirable to know Erlang, at least superficially. On the other hand, active contributions to the project will more or less eliminate this need.

Instead of an epilogue.

Highly recommended to anyone who benchmarks databases a lot. Doubly recommended to people who know and use Erlang: you can easily test any component of your system.

And most importantly, it saves a lot of time and effort that can be spent on new, even more interesting and challenging tasks.

Happy benchmarking!

Source: https://habr.com/ru/post/103444/
