Tarantool is a great high-performance NoSQL solution developed by Mail.Ru.
It supports both key/value access and selecting multiple records into a recordset by one or several criteria (search fields). I have not yet come across analogs, on the Runet or elsewhere. At a stretch you can compare it to Redis, but Redis works with list data and does not allow this kind of selection by key. According to the developers, access by key is faster than in memcache, while the data is continuously saved to disk in the background. Unfortunately, the project has only a single Perl client for accessing data, which is why it is not as popular as, for example, Redis or memcache.
The doc/box-protocol file in the sources describes the protocol, which I have reworked here while writing clients in C and PHP. Having studied the protocol, you can implement a native client in your favorite language. I hope this article is useful to you.
All data in this database is divided into namespaces, most likely an analogue of a database in MySQL. Namespaces are numbered (0, 1, 2, etc.). You can define indexes on each namespace, and indexes are numbered as well. An index covers one or more fields and can be of type HASH or TREE.
All indexes and namespaces are declared in the config. Below is an example that defines two indexes, a numeric one and a string one; the second index is composite:
namespace[0].enabled = 1
namespace[0].index[0].type = "HASH"
namespace[0].index[0].unique = 1
namespace[0].index[0].key_field[0].fieldno = 0
namespace[0].index[0].key_field[0].type = "NUM"
namespace[0].index[0].key_field[1].fieldno = 1
namespace[0].index[1].type = "TREE"
namespace[0].index[1].key_field[0].fieldno = 0
namespace[0].index[1].key_field[0].type = "STR"
namespace[0].index[1].key_field[1].fieldno = 1
A note about keys: they can be numeric (1, 2, 3 ... 6.2 * 10 ^ 9) or strings.
All data in a namespace is stored in the form of tuples. A tuple is a collection of fields. A field can be either numeric or a string.
The client and the server exchange messages. All messages in the Tarantool protocol are either a Request or a Response. Each message has a mandatory Header and may also have a Body.
The Message Header includes: Message type, body length and request ID.
Header Structure:
typedef struct {
uint32_t type;
uint32_t len;
uint32_t request_id;
} Header;
The following message types are defined:
INSERT 0xd (13)
SELECT 0x11 (17)
UPDATE 0x13 (19)
DELETE 0x14 (20)
PING 0xff00 (65280)
The request ID is set by the client and may be zero.
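To make the header layout concrete, here is a minimal sketch of serializing the three header fields into a 12-byte buffer. It assumes the fields are sent least significant byte first (little-endian); that byte order, and the helper names `put_u32le` and `pack_header`, are assumptions made for illustration and should be checked against doc/box-protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEADER_SIZE = 12 }; /* three uint32_t fields */

/* Store a 32-bit value least significant byte first (assumed wire order). */
static void put_u32le(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Write type, body length and request ID into buf (at least 12 bytes).
 * Returns the number of bytes written. */
static size_t pack_header(uint8_t *buf, uint32_t type, uint32_t len,
                          uint32_t request_id)
{
    assert(buf != NULL);
    put_u32le(buf, type);
    put_u32le(buf + 4, len);
    put_u32le(buf + 8, request_id);
    return HEADER_SIZE;
}
```

Writing the bytes one by one instead of copying the struct keeps the output independent of the host byte order and of struct padding.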
General structure of the request:
typedef struct {
Header header;
union {
InsertRequest insert;
SelectRequest select;
UpdateRequest update;
DeleteRequest delete;
};
} Request;
The PING command has no body, so there is no PingRequest ;)
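Since PING carries only a header, it is the simplest request to build by hand. The sketch below assumes little-endian field encoding and a body length of zero; the enum names and `build_ping` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Message type codes as listed in this article (names are illustrative). */
enum {
    MSG_INSERT = 0x0d,
    MSG_SELECT = 0x11,
    MSG_UPDATE = 0x13,
    MSG_DELETE = 0x14,
    MSG_PING   = 0xff00
};

/* Store a 32-bit value least significant byte first (assumed wire order). */
static void put_u32le(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Build a PING request: a 12-byte header with no body.
 * Returns the total message size in bytes. */
static size_t build_ping(uint8_t *buf, uint32_t request_id)
{
    assert(buf != NULL);
    put_u32le(buf, MSG_PING);      /* type */
    put_u32le(buf + 4, 0);         /* len: PING has no body */
    put_u32le(buf + 8, request_id);
    return 12;
}
```

Under these assumptions the type field of a PING goes out as the bytes 00 ff 00 00.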
The body of an INSERT command consists of a namespace number over which the operation will be performed, a flag and a tuple.
namespace is the space in which tuples are stored. Namespaces are numbered. Indexes can be defined in each namespace, and a primary index (PRIMARY) must always be present. Indexes are defined in the configuration file.
Currently, only a single flag BOX_RETURN_TUPLE (0x01) is defined, which indicates whether to return data in the response body.
INSERT request structure:
typedef struct {
uint32_t namespaceNo;
uint32_t flag;
Tuple tuple;
} InsertRequest;
All data is described using tuples (Tuple). A tuple consists of a cardinality field, which is the dimension of the tuple (its number of fields), and an array of fields. In general, it looks like this:
typedef struct {
uint32_t card;
Field field [];
} Tuple;
Each field is represented by an array of bytes. A field can hold an int32, an int64 or a byte stream.
For now, I have defined a field simply as:
typedef u_char *Field;
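Because a field is just a byte array, a reader needs to know where it ends. The sketch below shows unsigned LEB128 as defined in the general literature, which could serve as a length prefix for a field. Whether the server uses exactly this byte layout for its variable-length integers is an assumption to verify against doc/box-protocol; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v as unsigned LEB128: 7 bits per byte, low group first,
 * high bit set on every byte except the last.
 * Returns the number of bytes written (1..5). */
static size_t leb128_encode(uint8_t *out, uint32_t v)
{
    size_t n = 0;
    assert(out != NULL);
    do {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v >>= 7;
        if (v != 0)
            byte |= 0x80; /* more bytes follow */
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/* Decode an unsigned LEB128 value from at most avail bytes.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than 5 bytes. */
static size_t leb128_decode(const uint8_t *in, size_t avail, uint32_t *v)
{
    uint32_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    assert(in != NULL && v != NULL);
    while (n < avail && shift < 35) {
        uint8_t byte = in[n++];
        result |= (uint32_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {
            *v = result;
            return n;
        }
        shift += 7;
    }
    return 0;
}
```

Values below 128 occupy a single byte, so the length of a 4-byte int32 field is encoded as the single byte 0x04.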
All field data is packed using LEB128 (en.wikipedia.org/wiki/LEB128).
The body of the SELECT request includes: the namespace number, the number of the index to search by, the offset, the limit on the size of the output, the number of tuples and the tuples themselves. The offset and limit parameters work like those in SELECT * FROM ... LIMIT
typedef struct {
uint32_t namespaceNo;
uint32_t indexNo;
uint32_t offset;
uint32_t limit;
uint32_t count;
Tuple tuples [];
} SelectRequest;
For a query like SELECT * FROM t0 WHERE k0 = 1, the number of tuples is 1 and the Tuple value must equal 1. If a secondary composite index k1 (a numeric field and a character field) is defined, then for the query
SELECT * FROM t0 WHERE k1 = (21, 'USSR')
the number of tuples is 2 and two Tuple values must be present. Note that the SQL shown here is schematic and does not comply with the SQL'92 standard: data in tarantool/box is represented by tuples, not tables (columns and rows). A tuple can contain any number of fields. All tuples are stored in a namespace, on which you can define a HASH or rbTREE index for searching.
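As an illustration of the single-key case (k0 = 1), here is a sketch that serializes a SELECT body with one key tuple holding one 4-byte numeric field. It assumes little-endian integers and a one-byte length prefix in front of the field data (lengths below 128 fit in one LEB128 byte); both are assumptions, and `pack_select_u32_key` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value least significant byte first (assumed wire order). */
static void put_u32le(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Serialize a SELECT body that looks up one 32-bit key.
 * buf must hold at least 29 bytes. Returns the body length, which
 * would go into the len field of the message header. */
static size_t pack_select_u32_key(uint8_t *buf, uint32_t namespace_no,
                                  uint32_t index_no, uint32_t offset,
                                  uint32_t limit, uint32_t key)
{
    uint8_t *p = buf;
    assert(buf != NULL);
    put_u32le(p, namespace_no); p += 4;
    put_u32le(p, index_no);     p += 4;
    put_u32le(p, offset);       p += 4;
    put_u32le(p, limit);        p += 4;
    put_u32le(p, 1);            p += 4; /* count: one key tuple */
    put_u32le(p, 1);            p += 4; /* card: the tuple has one field */
    *p++ = 4;                           /* field length prefix (assumed) */
    put_u32le(p, key);          p += 4; /* field data: the key itself */
    return (size_t)(p - buf);
}
```

For example, `pack_select_u32_key(buf, 0, 0, 0, 100, 1)` describes "namespace 0, index 0, first 100 matches for key 1" in 29 bytes.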
The body of the UPDATE request includes: the namespace number, a flag, a tuple, the number of operations and the operations themselves. The flag and tuple fields are the same as in the INSERT request. The number of operations can be zero. The structure is as follows:
typedef struct {
uint32_t namespaceNo;
uint32_t flag;
Tuple tuple;
int32_t count;
Operation operation [];
} UpdateRequest;
Each Operation is a structure containing the field number for which the operation will be performed, the operation code and argument.
The following operation codes are used:
0 - assign the argument to the field.
If the argument is of type int32, the following operations are also possible:
1 - add an argument to an existing field
2 - execute AND with existing field
3 - perform an XOR with an existing field
4 - perform OR with existing field
typedef struct {
int32_t fieldNo;
int8_t opcode;
Field arg;
} Operation;
The DELETE operation is always performed on the primary key and contains the namespace number and tuple. The structure of the DELETE operation is presented below:
typedef struct {
uint32_t namespaceNo;
Tuple tuple;
} DeleteRequest;
Each server response contains a Header, a return code and, if necessary, a response body. The response header is the same as the request header. Return code 0 means success; for error codes see include/iproto.h
In general, the following response structure is obtained:
typedef struct {
Header header;
int32_t code;
union {
SelectResponce selectBody;
InsertResponce insertBody;
uint8_t * data;
int32_t count;
};
} Responce;
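A client typically reads the fixed part of a response first and only then decides how to parse the body. The sketch below decodes the three header fields and the return code from the first 16 bytes, assuming little-endian encoding; `ResponseHead` and `parse_response_head` are hypothetical names used for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size leading part of a response (illustrative type). */
typedef struct {
    uint32_t type;
    uint32_t len;
    uint32_t request_id;
    int32_t code;
} ResponseHead;

/* Read a 32-bit value stored least significant byte first (assumed). */
static uint32_t get_u32le(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Decode the header and return code from buf.
 * Returns 0 on success, -1 if fewer than 16 bytes are available. */
static int parse_response_head(const uint8_t *buf, size_t avail,
                               ResponseHead *out)
{
    assert(buf != NULL && out != NULL);
    if (avail < 16)
        return -1;
    out->type = get_u32le(buf);
    out->len = get_u32le(buf + 4);
    out->request_id = get_u32le(buf + 8);
    out->code = (int32_t)get_u32le(buf + 12);
    return 0;
}
```

The request_id lets the client match the response to the request it sent, and a non-zero code means the body should not be parsed as a normal result.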
The body of the response to the SELECT query consists of a field containing the total number of tuples and the set of returned tuples. If the query result is empty, then the tuples are not returned and the number field contains zero.
typedef struct {
int32_t count;
FqTuple tuples [];
} SelectResponce;
Each returned tuple (FqTuple) contains the size of the tuple, its cardinality, which also serves as a separator (boundary) between tuples, and the tuple itself.
typedef struct {
int32_t size;
uint32_t card;
Tuple tuple;
} FqTuple;
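The size field makes it possible to step over returned tuples without decoding their fields. The sketch below walks a SELECT response body under two assumptions that should be verified against doc/box-protocol: integers are little-endian, and size counts only the bytes of field data that follow the cardinality. `walk_select_body` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value stored least significant byte first (assumed). */
static uint32_t get_u32le(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Walk the tuples of a SELECT response body.
 * Stores the tuple count in *count and the cardinality of the first
 * max_cards tuples in cards (cards may be NULL).
 * Returns 0 on success, -1 if the body is truncated. */
static int walk_select_body(const uint8_t *body, size_t avail,
                            uint32_t *count, uint32_t *cards,
                            size_t max_cards)
{
    size_t pos = 4;
    uint32_t i, n;
    assert(body != NULL && count != NULL);
    if (avail < 4)
        return -1;
    n = get_u32le(body);
    for (i = 0; i < n; i++) {
        uint32_t size;
        if (avail - pos < 8)            /* need size + card */
            return -1;
        size = get_u32le(body + pos);
        if (cards != NULL && i < max_cards)
            cards[i] = get_u32le(body + pos + 4);
        pos += 8;
        if (avail - pos < size)         /* need the field data */
            return -1;
        pos += size;                    /* skip to the next tuple */
    }
    *count = n;
    return 0;
}
```

Checking the remaining length before every read keeps the walk safe on a short or corrupted buffer.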
If the BOX_RETURN_TUPLE flag is set in the InsertRequest request, the response may contain the body:
typedef struct {
int32_t count;
FqTuple tuple;
} InsertResponce;
The response to an UPDATE request is similar.
A DELETE request returns the number of deleted records. Since deletion uses only the primary index, only one record can be deleted, so the response carries 0 or 1. This is the count field of the Responce structure. The structure also contains a data byte array for parsing the raw response.
There may be some inaccuracies in the text; in places int32_t may appear where uint32_t is meant.
Perhaps I have misunderstood something, so I would welcome constructive criticism from the authors of this wonderful project.