
I want to share my experience of learning Tarantool. I will not go over all of Tarantool's advantages and features, since many articles have already been written on that topic. This post explains how to get started with Tarantool and describes some of the features and goodies you get out of the box.
Install
The first thing I ran into was the installation. I had to install it for tests on macOS, which very few people are likely to do, but still. The package offered on the site would not install, either because of some dependencies or because my system had already been through more than one experiment. So I decided to build from source.
The build process is well described in the README. Do not forget to fetch the submodules if you get the source from git. Also, when building on macOS, do not be alarmed if not all tests pass: the documentation says this is normal.
If you want the console client, cmake must be run with the -DENABLE_CLIENT=true option. After make we get the server and, if requested, the client: src/box/tarantool_box and client/tarantool/tarantool, respectively.
Configuring the server
As a starting point, you can take the configuration of one of the tests, for example test/box/tarantool.cfg.
One of the important parameters is slab_alloc_arena, the amount of memory Tarantool uses. I advise you to study this parameter in more detail. It is also worth paying attention to rows_per_wal, so that you are not left wondering why so many small files are lying around :)
Now for the most interesting part. Tarantool only needs to know about the indexes; it does not care at all what the tuple contains or how large it is. So in the config we describe only the indexes. The index types are covered in more detail in the documentation. The main points: when choosing an index, you need to understand exactly what it is for. A HASH index cannot be non-unique. TREE indexes are well suited to keeping a sorted list over non-unique values. Indexes can also be composite. Indexes are described per space, and there can be many spaces.
To put it together: imagine we need a space with 5 fields. The first field is non-unique, while the first and second fields together are unique; we will do point lookups on them. The fourth field holds a parameter used for sorting. First we build the unique index:
space[0].index[0].type = "HASH"                # index type
space[0].index[0].unique = 1                   # uniqueness flag
space[0].index[0].key_field[0].fieldno = 0     # field number in the tuple
space[0].index[0].key_field[0].type = "NUM"    # data type
space[0].index[0].key_field[1].fieldno = 1
space[0].index[0].key_field[1].type = "NUM"
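What "unique on the first plus second field" means can be modeled in plain Lua. This is only an illustration, not how Tarantool implements a HASH index and not its API: the names `new_space_model` and `insert_unique` and the sample tuples are hypothetical. The first field may repeat, but the pair of the first two fields may not.

```lua
-- Plain-Lua model of a unique composite key over config fields 0 and 1.
-- Hypothetical names; this is not Tarantool's implementation or API.
function new_space_model()
    return { by_key = {} }
end

-- Returns true if the tuple was stored, false if the (field 0, field 1)
-- pair is already taken. Lua arrays are 1-based, so config fieldno 0
-- is tuple[1] and fieldno 1 is tuple[2].
function insert_unique(space, tuple)
    local key = tuple[1] .. ":" .. tuple[2]
    if space.by_key[key] ~= nil then
        return false
    end
    space.by_key[key] = tuple
    return true
end

local s = new_space_model()
print(insert_unique(s, {1, 2, "a", 10, "x"}))  -- first tuple with key (1, 2)
print(insert_unique(s, {1, 3, "b", 20, "y"}))  -- same first field, new pair
print(insert_unique(s, {1, 2, "c", 30, "z"}))  -- key (1, 2) again, rejected
```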
Next, we build an index for fetching a batch of non-unique records by the first field:
space[0].index[1].type = "TREE"
space[0].index[1].unique = 0
space[0].index[1].key_field[0].fieldno = 0
space[0].index[1].key_field[0].type = "NUM"
Then we build an index for selecting sorted records by the first and fourth fields:
space[0].index[2].type = "TREE"
space[0].index[2].unique = 0
space[0].index[2].key_field[0].fieldno = 0
space[0].index[2].key_field[0].type = "NUM"
space[0].index[2].key_field[1].fieldno = 3
space[0].index[2].key_field[1].type = "NUM"
And do not forget that the space has to be enabled:
space[0].enabled = 1
Now we will describe another space that stores tuples of two fields: the first is unique, the second is not. This is a typical key-value store in its simplest form:
space[1].enabled = 1
space[1].index[0].type = "HASH"
space[1].index[0].unique = 1
space[1].index[0].key_field[0].fieldno = 0
space[1].index[0].key_field[0].type = "NUM"
With these settings, Tarantool is ready to work. All that remains is to initialize the storage, and off we go.
> ./src/box/tarantool_box --init-storage
tarantool/src/box/tarantool_box: space 0 successfully configured
tarantool/src/box/tarantool_box: space 1 successfully configured
tarantool/src/box/tarantool_box: creating './00000000000000000001.snap.inprogress'
tarantool/src/box/tarantool_box: saving snapshot './00000000000000000001.snap'
tarantool/src/box/tarantool_box: done
As you can see, it created all the files in the directory from which we launched Tarantool. If you want to change that, the settings include a work_dir parameter that you can set as you like.
After that we start the server:
> ./src/box/tarantool_box --background
> ps xa | grep tarantool
5627 ?? Us 0:10.55 tarantool/src/box/tarantool_box --background
Hooray! Now you can start loading data and retrieving it in the desired order and by the criteria you need.
Getting started
I will not explain how to use the console client or describe every command in detail here; that could be the topic of a separate article. Instead, I will focus on Lua procedures. One of the interesting features, in my opinion, is stored procedures. With them you can build a kind of "black box" of business logic that can be changed easily and independently of the rest of the code, separating the technical part from the business model. I think the presence of Lua is a strong hint that business logic is meant to live there.
At startup, Tarantool tries to load the init.lua file, and that is where we will add our functions.
When writing functions, pay special attention to data types. Lua, like Perl for example, has no strict types of this kind, and because of how the protocol is implemented, numbers sent from Perl do not arrive in Lua the same way as numbers sent from the console client. So, as long as the Tarantool adapters do not support data types, you can always pass strings and convert them to numbers where needed.
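The string-to-number advice above can be sketched in plain Lua. The helper name `to_numbers` is hypothetical and not part of Tarantool; it relies only on the standard `tonumber` function and assumes that a non-numeric argument should make the whole call fail.

```lua
-- Hypothetical helper (not part of Tarantool): converts every argument
-- with the standard tonumber() and returns them as an array.
-- Returns nil if any argument cannot be converted to a number.
function to_numbers(...)
    local out = {}
    for i = 1, select('#', ...) do
        -- Extra parentheses keep only the i-th argument.
        local v = tonumber((select(i, ...)))
        if v == nil then
            return nil
        end
        out[i] = v
    end
    return out
end

-- A client that sends strings and one that sends numbers
-- end up with the same values after conversion.
local from_strings = to_numbers("1", "2")
local from_numbers = to_numbers(1, 2)
print(from_strings[1] + from_strings[2], from_numbers[1] + from_numbers[2])
print(to_numbers("1", "abc"))
```

A procedure can call such a helper first and return early when it gets nil, so the rest of the body only ever works with numbers.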
Let us write a procedure that changes one field depending on the value of another field and the current date. This is a very common task that usually ends up scattered across the code, and the next refactoring then introduces an error in the date calculation.
function increase_score(id, id2)
    local id = tonumber(id)
    local fid = tonumber(id2)
    if (id == nil or fid == nil) then
        return false
    end
    -- ...
end
All in all, this can be regarded as an atomic action from the point of view of an external system.
Now a few words about performance. The call rate of this procedure reached 20,000 rps, with the CPU loaded at 67%.
Next, I will describe one more nicety. The update with the calculation described above is fine, but besides updating, the task usually also requires retrieving the data, and in sorted order at that. To avoid pushing the sorting out to an external system or sorting by hand, let us use the indexes.
function get_top(uid)
    local id = tonumber(uid)
    if (id == nil) then
        -- ...
    end
    -- ...
end
So we have created a procedure that takes advantage of the index and returns a list of tuples sorted in descending order. As for the sort order, look up box.index.LE in the documentation to see what the other iteration types are and how they work.
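To illustrate what such a procedure is expected to return, here is a plain-Lua model that does not use the Tarantool API at all. It filters an in-memory list of tuples by the first field and sorts the matches by the fourth field in descending order. The function name `get_top_model` and the sample data are hypothetical; inside Tarantool the TREE index already keeps the records ordered, so no explicit sort is needed.

```lua
-- Plain-Lua model, not the Tarantool API: each tuple is an array of 5 fields.
-- Lua arrays are 1-based, so config fieldno 0 is t[1] and fieldno 3 is t[4].
function get_top_model(tuples, uid)
    local id = tonumber(uid)
    if id == nil then
        return {}
    end
    local result = {}
    for _, t in ipairs(tuples) do
        if t[1] == id then
            result[#result + 1] = t
        end
    end
    -- Descending order by the sort field (fieldno 3 in the config).
    table.sort(result, function(a, b) return a[4] > b[4] end)
    return result
end

local data = {
    {1, 10, "a", 5, "x"},
    {1, 11, "b", 9, "y"},
    {2, 10, "c", 7, "z"},
    {1, 12, "d", 1, "w"},
}
-- Print the second field and the sort field of every match for uid "1".
for _, t in ipairs(get_top_model(data, "1")) do
    print(t[2], t[4])
end
```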
And most importantly: from the client, these procedures are called like this:
> lua increase_score(1,2)
and
> lua get_top(1)
In the next article I will show how all of this can be used to solve one of the most common tasks, and go over the peculiarities of using the Perl driver for Tarantool.