
Tarantool is an open source database. Anyone can download it from GitHub and use it in both commercial and non-commercial applications. Today Denis Anikin, technical director of Mail.Ru, will talk about examples of using this database. The material is based on a talk given at the FailOver Conference.
Tarantool has been developed by Mail.Ru Group for more than seven years. This DBMS is designed for highly loaded systems. Its main distinguishing feature is that it combines the properties of a real database (transactions, replication, everything related to reliability) with the speed of caches such as Memcached or Redis.
At Mail.Ru Group, a good half of the products run on Tarantool. It is preferred where a database must have the properties of a cache: it must handle 100 thousand updates per second, have very low latency (1 ms or less), and so on. Many DBMSs do not meet these criteria. Spreading the data over many shards does not help: transactions stop working, integrity is lost, and other problems arise. Caches, in turn, lack many useful database properties: reliable storage of data on disk, transactions, and so on. For example, caches usually have no stored procedures, which let you move logic to the data storage side.
DB + cache =?
Of course, high-load projects can get by without Tarantool. Suppose you need the properties of both a database and a cache in one package. Then you can use a very popular scheme: put a cache in front of the database. All read requests go to the cache first; if the necessary data is there, it is returned to the user. If it is not, the request is forwarded to the database. All updates go to both the database and the cache at once, because nothing can be stored in the cache without also being saved in the database, otherwise that data may be lost.
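The read and write paths of this scheme can be sketched in a few lines. This is only an illustration under simplifying assumptions: both the database and the cache are plain Python dictionaries, and the class and method names (`CachedStore`, `read`, `write`) are hypothetical, not part of any real product.

```python
class CachedStore:
    """A cache placed in front of a database; both are plain dicts here."""

    def __init__(self):
        self.db = {}        # stands in for the durable database
        self.cache = {}     # stands in for a cache such as Memcached or Redis
        self.db_reads = 0   # counts reads that had to fall through to the database

    def read(self, key):
        # Every read goes to the cache first.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the database and remember the result.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Write to the database first, then to the cache, so the cache
        # never holds data that the database does not have.
        self.db[key] = value
        self.cache[key] = value


store = CachedStore()
store.write("user:1", "alice")
first = store.read("user:1")    # served from the cache
store.cache.clear()             # simulate a cache restart
second = store.read("user:1")   # miss, served from the database
third = store.read("user:1")    # served from the cache again
```

Note that the two writes in `write` are not atomic: if the process fails between them, the cache and the database disagree, and after `cache.clear()` every first read lands on the database.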
This scheme gives you only some of the properties of a database. For example, there are no longer any transactions or stored procedures: once there are two systems, one of them a cache, transactions across them are out of the question. In a sense, replication is lost as well, because the data in the database is replicated, while the data in the cache is not quite. Cache properties are also only partially preserved. Such a system is faster because read requests are processed more quickly, but write requests do not get any faster. If the database slows down, for example because a table is being vacuumed, then writes slow down too, because writes cannot happen without the database.
In general, this is a workable scheme. It gives you some of the properties of both a database and a cache in one system. If that is enough for you, use it.
True, this scheme has two common problems: inconsistent data and a cold start. “Inconsistency” means that the data in the cache and in the database may differ, because the cache and the database are not replicas of each other; they are two separate entities. “Cold start” is the situation where the cache has just started and is still empty, so all requests go to the database and system performance leaves much to be desired.
If these issues do not bother you, then the “cache on top of the database” scheme is a perfectly workable option. Otherwise, consider Tarantool, where these problems are solved by design. One of the reasons it was developed was to avoid building such complex heterogeneous systems out of several storage layers, and instead get by with a single one that holds all the hot data.
Engines
Tarantool has two data storage engines. One of them is an in-memory engine. It works like this: all data is stored in memory, and copies of the data are kept on disk. Each transaction is appended to a log on disk, and from time to time a snapshot of the entire database is written to disk. The snapshot is written asynchronously in the background. While it is being written, the database keeps working, because all new updates go to separate places, so nothing slows down. Transactions are always written to the log.
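The general "log plus snapshot" idea can be illustrated with a toy store. This is a sketch of the technique only, not Tarantool's implementation: the class `MemStore`, the file names, and the JSON format are all assumptions made for the example, and the snapshot here is written synchronously rather than in the background.

```python
import json
import os
import tempfile


class MemStore:
    """Toy in-memory store: data lives in a dict, durability comes from
    an append-only log plus an occasional full snapshot."""

    def __init__(self, directory):
        self.snap_path = os.path.join(directory, "snapshot.json")  # hypothetical file name
        self.log_path = os.path.join(directory, "log.jsonl")       # hypothetical file name
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the last snapshot, then replay the log written after it.
        if os.path.exists(self.snap_path):
            with open(self.snap_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Append the change to the log and sync it before applying it in memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the whole dataset to a temporary file, then swap it in atomically.
        tmp_path = self.snap_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snap_path)
        # The log entries are now covered by the snapshot, so start a fresh log.
        open(self.log_path, "w").close()


workdir = tempfile.mkdtemp()
store = MemStore(workdir)
store.put("a", 1)
store.snapshot()
store.put("b", 2)               # lives only in memory and in the log
recovered = MemStore(workdir)   # simulates a restart: snapshot + log replay
```

After the simulated restart, `recovered.data` holds both keys: `"a"` comes from the snapshot and `"b"` from the replayed log.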
The second is a disk engine. It lets you store everything on disk, on either SSDs or hard drives. This engine grew out of Google LevelDB, which has been optimized.
Advantages and disadvantages
Tarantool has cache-specific properties:
- Hot data.
- Availability 99.99%.
- Efficient operation under high parallel load.
- Latency:
  - 99% of requests <1 ms
  - 99.9% of requests <3 ms
- Write load: up to 1 million transactions per second on a single CPU core.
- Does not need many servers.
- Optimal use of memory.
- The system runs continuously; no downtime is needed for maintenance work.
Basically, all of this comes down to speed. Traditional DBMSs do not have these properties. At the same time, Tarantool also has the properties of a classic DBMS:
- Persistence (reliability of data storage on the disk).
- ACID transactions.
- Replication (master-slave and master-master).
- Stored procedures.
- Non-blocking server scripts.
- Convenient backups.
- Queries with cursors, range scans, and full scans.
- Primary and secondary indices.
- Tables.
Again, caches do not have these properties, but Tarantool does.
Some modern databases are aimed at high reliability of work, others focus on speed of work. These are two different worlds, which, basically, do not overlap. Tarantool is a fairly successful attempt to combine both worlds in one solution.

The disadvantages of Tarantool include the following:
- Its community is not very large yet. Before adopting any technology, everyone wonders: “What if it doesn't work? Who will I ask?” Tarantool developers try to be present and answer questions everywhere: on Facebook, on Stack Overflow, and so on. But from the users' point of view, there is a risk of not getting an answer because the community is small.
- There is no consistent sharding. We are working on this now, so that it works properly out of the box, with transaction support.
- Proprietary protocol.
Tarantool usage examples
All examples are taken from the experience of Mail.Ru Group projects. There are many of them, but we will consider only three: the authentication system, the push notification system, and the advertising system. These are usually the most heavily loaded.
Authentication system
It must meet a number of seemingly contradictory requirements.
- High demand. Authentication must be checked on every hit to the website or the mobile application. You need to check the password, cookie, token, or whatever else is used, because you cannot simply let the user in. On the Mail.Ru portal and in mobile applications, the number of requests to the authentication system amounts to millions per second.
- Low latency, that is, the time between the request and the response from the database. It should be as small as possible, otherwise everything slows down, including the web server that uses authentication. While the server waits for a response from the database, it holds a thread or process and keeps data in memory, which also consumes server resources. In other words, a slow authentication system drags a lot of problems along with it, so it must respond almost instantly.
- High availability. If the authentication system does not work in 1% of cases, then the whole site does not work in 1% of cases.
- Constant requests to the storage. Each hit in the authentication system is a check of a session, password, or token.
- Protection against brute-force attacks and fraud. People are constantly trying to break the authentication system.
- Almost every call involves a transaction, that is, a need to change some data. For example, when performing authentication, you need to check the entered data and update the time and place of authentication and other parameters for the brute-force protection system. All of this is a transaction in the database, and this record must not be lost.
- A lot of unavoidable extra work. When everyone is trying to break the system, there are many hits behind the scenes that are generated not by users but by hacking tools. These calls carry no useful traffic or profit. This is extra load, but it still has to be processed and checked.
- Large data size. Naturally, the entire user base must be stored in the authentication system.
- Limited validity of user sessions (expiration). For security reasons, if a user is inactive for a while, the session is terminated. To do this, you must check the start time of the session and whether there has been any activity.
- Reliability of data storage (persistence). Obviously, if the authentication system “forgets” some users because credentials were lost, this is direct damage to reputation.
In general, this set of properties may look contradictory. Some of them are usually provided by caches, and others by databases. The authentication system must be as reliable and durable as a truck, yet as fast as a sports car. This is where Tarantool came in handy.
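To make the "every call is a read and a write" point concrete, here is a minimal sketch of a login check with a brute-force counter and session expiration. It is not the Mail.Ru implementation: the class `AuthStore`, the thresholds `MAX_FAILURES` and `SESSION_TTL`, the fixed demo salt, and the in-memory dictionaries are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets

MAX_FAILURES = 5        # assumed lockout threshold for brute-force protection
SESSION_TTL = 30 * 60   # assumed idle timeout in seconds


def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is kept low for the demo.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 1000)


class AuthStore:
    def __init__(self):
        self.users = {}     # login -> salt, password hash, failure counter, last login
        self.sessions = {}  # token -> login and time of last activity

    def add_user(self, login, password, salt=b"demo-salt"):
        self.users[login] = {
            "salt": salt,
            "hash": hash_password(password, salt),
            "failures": 0,
            "last_login": None,
        }

    def login(self, login, password, now):
        user = self.users.get(login)
        if user is None or user["failures"] >= MAX_FAILURES:
            return None
        candidate = hash_password(password, user["salt"])
        if not hmac.compare_digest(user["hash"], candidate):
            user["failures"] += 1   # a write even on a failed attempt
            return None
        user["failures"] = 0        # a write on success as well
        user["last_login"] = now
        token = secrets.token_hex(16)
        self.sessions[token] = {"login": login, "last_seen": now}
        return token

    def check_session(self, token, now):
        session = self.sessions.get(token)
        if session is None:
            return None
        if now - session["last_seen"] > SESSION_TTL:
            del self.sessions[token]   # expired: terminate the session
            return None
        session["last_seen"] = now     # record the activity
        return session["login"]


auth = AuthStore()
auth.add_user("alice", "secret")
rejected = auth.login("alice", "wrong", now=0)
token = auth.login("alice", "secret", now=10)
active = auth.check_session(token, now=20)
expired = auth.check_session(token, now=20 + SESSION_TTL + 1)
```

Every path through `login` and `check_session` both reads and modifies stored state, which is why the storage behind such a system needs cache-like speed and database-like durability at the same time.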
Scheme of the Mail.Ru authentication system by login and password:

Checking logins and passwords alone accounts for 50 thousand transactions per second at Mail.Ru. The brute-force protection and authentication systems read from and write to Tarantool on every call. The total load reaches approximately one million requests per second: from the entire portal, from all mobile applications, from all Ajax and non-Ajax requests.

At the same time, sessions are served by only 4 servers, and user profiles by 8 servers. These are not branded, special servers, but the most ordinary ones, with ordinary processors. Nothing exotic.
Push notification system
As you know, mobile apps like to send push notifications to users so that they stay longer. In other words, it is a useful and valued feature.
How does the notification system at Mail.Ru work?

When an event occurs on the server (a letter has arrived, a message has been sent in the instant messenger, news has appeared), a notification must be sent to the end users' mobile phones. This cannot be done directly. Therefore, Apple and Google provide APIs for iOS and Android through which mobile apps can be reached.
But these APIs should not be called directly from the server; the reason is given below. In addition, each time an event is generated, you need to go to the storage to find out which user the event should be delivered to, and to read that user's token, because the APIs work with tokens. All of this needs to be done very quickly.
It is also extremely important to maintain very low latency, since events are generated from a large number of different contexts and server environments. The server can never afford to stall for a second or two, because otherwise all the other participants down the chain will start to work slowly. For this reason, we access the APIs not directly from the server but through a queue, which also runs on Tarantool. This DBMS can provide a queue service as well, and a persistent and replicated one at that. So when a machine crashes, a disk fails, or a server is restarted, no one notices.
It turns out that the server works directly only with fast storage, and everything slow is located further, behind the queue. This scheme allows you to quickly process everything: the total number of requests in this system is about 200 thousand per second.
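The shape of this pipeline can be sketched as follows. This is an in-process illustration only: the standard-library `queue.Queue` stands in for the persistent, replicated queue, the `tokens` dictionary stands in for the fast storage, and `slow_push_api` is a hypothetical placeholder for the vendor push services, not a real API.

```python
import queue
import threading
import time

tokens = {"user:1": "device-token-1"}   # hypothetical fast storage: user -> push token
outbox = queue.Queue()                  # stand-in for the persistent queue
delivered = []                          # what the slow side has pushed so far


def on_event(user, text):
    """Runs on the server's hot path: one fast lookup and one enqueue."""
    token = tokens.get(user)
    if token is not None:
        outbox.put((token, text))


def slow_push_api(token, text):
    """Hypothetical placeholder for a remote push service call."""
    time.sleep(0.05)                    # pretend the remote call is slow
    delivered.append((token, text))


def worker():
    # Runs behind the queue, so its slowness never blocks on_event.
    while True:
        item = outbox.get()
        if item is None:                # sentinel: stop the worker
            return
        slow_push_api(*item)


thread = threading.Thread(target=worker)
thread.start()
for i in range(3):
    on_event("user:1", f"message {i}")  # returns immediately
outbox.put(None)
thread.join()                           # wait until everything is delivered
```

The producer only ever touches the token lookup and the queue, so its latency does not depend on how slow the delivery side is. Unlike this in-memory stand-in, a persistent queue also keeps undelivered events across restarts.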
Advertising display system
This is probably the most heavily loaded use of Tarantool, and the largest farm that Mail.Ru Group has. The system is responsible for displaying ads on almost all pages of a huge portal, and a page usually has at least 10 ad units.
To show each ad unit, you need to decide what to show to the user. For this, data about the user is collected from several different sources, aggregated, and turned into a result. All of this is done for each unit, with less than one millisecond available. Where does this requirement come from? Users do not want the service to slow down because of advertising. Advertising annoys everyone as it is, but it is a necessary evil.

The load on the advertising display system is about 3 million requests per second. Of these, 1 million are writes, because displaying an ad often leads to an update of the user profile.
Brief conclusion
If you need to combine the properties of a database and a cache, reliability and speed, and you cannot achieve this with the simple methods you usually rely on, then take a closer look at Tarantool. Most likely, it will solve this problem.