At MySQL, I managed one of the server development teams; everyone there was responsible for performance.
MySQL was in many ways a dream job, but unfortunately, after we became part of Oracle, a lot changed.
Some of my colleagues went to MariaDB, others started their own companies (SeveralNines, FromDual). I never felt "underloaded", and with the departure of many key developers, work largely turned into a knowledge-transfer marathon. Resistance to the acquisition, a desire to start everything from scratch, a revolt against the slow decision-making of a large company, a reluctance, for various reasons, to move to the United States, and finally a good offer from Mail.Ru that by then had been on the table for about a year: so I left.
Had I known what I was getting into, I would have thought ten more times. At times there was no faith at all that we would manage to build something useful enough to be used outside Mail.Ru, and even now Tarantool is very far from the "ideal DBMS".
We just do what makes sense. While for a classic DBMS optimization always revolves around the disk subsystem, for an in-memory DBMS the network becomes the performance bottleneck. 100,000 requests per second with a request size of 1 KB, and the bandwidth of a 1 Gbit/s network card is already fully used. When working in memory, a single Tarantool instance can serve those 100,000 requests while utilizing just one core, and a modern machine can have dozens of cores. That is why we are building an application server, so that computation can run not only on the client but can also be brought to the data.
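The arithmetic behind the 1 Gbit/s claim is easy to check. The sketch below ignores protocol overhead, which in practice only makes the link saturate sooner:

```python
# Back-of-envelope check: 100,000 requests/second at ~1 KB per request
# is close to the full capacity of a 1 Gbit/s NIC.
requests_per_sec = 100_000
bytes_per_request = 1024           # ~1 KB payload, ignoring protocol overhead

throughput_bits = requests_per_sec * bytes_per_request * 8
nic_capacity_bits = 1_000_000_000  # 1 Gbit/s

print(f"payload traffic: {throughput_bits / 1e9:.2f} Gbit/s "
      f"({100 * throughput_bits / nic_capacity_bits:.0f}% of a 1 Gbit/s link)")
# -> payload traffic: 0.82 Gbit/s (82% of a 1 Gbit/s link)
```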
The second consideration is that many products simply duplicate each other's functionality. For example, today many people use Redis as a replacement for Memcached, just to have a smaller "zoo" of solutions; the technology under the hood is almost the same. Our platform allows you to replace several solutions at once: for example, we recently made a Memcached plugin that implements the Memcached binary protocol on top of Tarantool.
As a bonus, the user gets all our other features, such as master-master replication and pluggable storage engines.
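For a flavor of what the plugin enables, here is a minimal sketch: any client that speaks the Memcached binary protocol should work unchanged. It assumes Tarantool's memcached module is listening on the default port 11211 and uses the third-party python-binary-memcached package; the address and key are made up for illustration.

```python
# Talking to Tarantool through its Memcached binary protocol plugin,
# using an ordinary binary-protocol memcached client.
import bmemcached

client = bmemcached.Client(("127.0.0.1:11211",))
client.set("user:42", "cached profile blob")
print(client.get("user:42"))  # -> "cached profile blob"
```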
This is a big topic; many parallel trends are developing at once. For example, there is "specialization", where niche tools appear that solve a narrow set of tasks very well, and at the same time generalization: at some point the community gets tired of the zoo and settles on a single tool.
I think we can say that, broadly, the era of NoSQL is over. All NoSQL solutions are adding SQL; some just have not gotten around to it yet. SQL extensions are being developed for working with specialized types of data, such as graphs.
On the other hand, there is still a very large body of tasks for which no standard declarative languages exist at all: everything related to big data and knowledge discovery. I think convergence on that front awaits us over the next decade.
As for hardware trends, I think the ARM platform will develop very seriously in the coming years; it is worth looking at least at Cavium products, Scaleway's cloud hosting, and, more broadly, at ARM-based automation of the "offline" world.
In business, it is already clear to everyone that cloud technologies are becoming ubiquitous. For us as a vendor this is very important: we will have to change how we "deliver" the product to the consumer.
If today we simply ship packages for various popular distributions, tomorrow we will need to support many cloud platforms, starting with a plain Docker image and ending with one-click installs on platforms like Microsoft Azure or Heroku. There is a risk that the cloud situation will come to resemble the access small farmers have to large supermarket shelves, although it does not have to be that way.
RAM is becoming cheaper and faster, which allows it to hold the working data set of an ever larger number of applications. Keeping all data in RAM makes it highly available, and the algorithms for working with the data become significantly simpler, or faster, and sometimes both.
In my report, I will talk about specialized algorithms and data structures for storing data in RAM:
- Memory allocation without compromise: why this is possible only inside a DBMS
- Hashes and associative arrays: how to make them not only fast but also compact (a toy sketch follows the list)
- How concurrent updates of the same in-memory data can be implemented without locks
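To make the second bullet more concrete, here is a toy Python sketch of a compact hash: keys and values live in flat parallel arrays with open addressing and linear probing, so there are no per-entry chain nodes or pointer overhead. This is purely illustrative and is not Tarantool's implementation:

```python
# A toy open-addressing hash table with linear probing. Keys and values
# are stored in flat parallel arrays, which keeps the structure compact.

_EMPTY = object()  # sentinel marking an unused slot

class CompactHash:
    def __init__(self, capacity=8):
        self._keys = [_EMPTY] * capacity
        self._vals = [None] * capacity
        self._size = 0

    def _probe(self, key):
        # Linear probing: start at hash(key) and scan forward to the
        # first slot that is empty or already holds this key.
        i = hash(key) % len(self._keys)
        while self._keys[i] is not _EMPTY and self._keys[i] != key:
            i = (i + 1) % len(self._keys)
        return i

    def put(self, key, value):
        if (self._size + 1) * 2 > len(self._keys):  # keep load factor <= 0.5
            self._grow()
        i = self._probe(key)
        if self._keys[i] is _EMPTY:
            self._size += 1
        self._keys[i] = key
        self._vals[i] = value

    def get(self, key, default=None):
        i = self._probe(key)
        return default if self._keys[i] is _EMPTY else self._vals[i]

    def _grow(self):
        # Double the capacity and reinsert all live entries.
        old = [(k, v) for k, v in zip(self._keys, self._vals) if k is not _EMPTY]
        self._keys = [_EMPTY] * (len(self._keys) * 2)
        self._vals = [None] * len(self._keys)
        self._size = 0
        for k, v in old:
            self.put(k, v)
```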
The "bottlenecks" of an in-memory DBMS differ so much from their counterparts in a "classic" DBMS that simplicity and elegance become a necessary condition for survival. The fight is over bytes and instructions, and complex code simply cannot work efficiently. I will describe how simple solutions for in-memory transaction processing can simplify and speed up replication, rolling back aborted transactions, and support for advanced features such as triggers and data schema changes.
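As an illustration of how simple in-memory transaction machinery can be, here is a minimal undo-log sketch in Python: every write inside a transaction records the previous value, and rollback replays those records in reverse. All names here are hypothetical; this is a sketch of the general technique, not Tarantool's engine:

```python
# A minimal in-memory store with undo-log-based transaction rollback.

class InMemoryStore:
    _MISSING = object()  # sentinel: key did not exist before the write

    def __init__(self):
        self.data = {}
        self._undo = None  # list of (key, old value) while a txn is open

    def begin(self):
        self._undo = []

    def put(self, key, value):
        if self._undo is not None:
            # Remember what was there before, so we can undo this write.
            self._undo.append((key, self.data.get(key, self._MISSING)))
        self.data[key] = value

    def commit(self):
        self._undo = None  # changes are already in place; just drop the log

    def rollback(self):
        # Replay old values in reverse order to restore the pre-txn state.
        for key, old in reversed(self._undo):
            if old is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self._undo = None

store = InMemoryStore()
store.put("a", 1)
store.begin()
store.put("a", 2)
store.put("b", 3)
store.rollback()
assert store.data == {"a": 1}
```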
Over the past year we have been developing the DMP (Data Management Platform) project, using NodeJS for prototyping. At the moment the project is still mostly in JS and easily copes with the current load of 10,000 requests per second.
In the talk I will explain why we settled on NodeJS, having chosen between .NET, Go, NodeJS, Python, and Ruby, why we do not regret the choice, and why we will keep using it for some projects.
And finally: for the users of Habrakhabr, the conference offers a special 15% discount; all you need to do is use the code "IAmHabr" when booking tickets.
And the very last thing: the conference is already next week, so we will write less often; we will be sleeping and resting. But then we will come back, and you will find new publications on our Habr blog and in our free newsletters. We look forward to staying in touch!
Source: https://habr.com/ru/post/269657/