
Tarantool as an application server

Hi, %habrauser%! The Tarantool team continues to share insights and expertise on working effectively with data in high-load projects. Today we will try to figure out why Tarantool is "two in one": not only a database, but also an application server. Some of you have probably heard of Tarantool as a super-fast persistent in-memory store with replication and Lua stored procedures. Imagine taking slices of Redis, adding frozen Node.js, pouring Go over the top and then cooking it, stirring slowly, for five minutes after it comes to a boil. So what does an application server have to do with it?



Many will probably be surprised, but such great products as nginx, Go, Node.js, Redis, MongoDB, Tarantool and others have a lot in common architecturally. To build any high-performance network server you need, one way or another, a set of libraries that provides non-blocking I/O, asynchronous event handling, memory management, error handling, logging, daemonization and so on. Such a runtime is usually a rather complicated thing that requires a deep understanding of how various systems work, plus low-level programming skills.

We in the Tarantool team have come a very long way, having created our own runtime with asynchronous event processing, non-blocking I/O, green threads in user space (fibers), cooperative multitasking, a family of specialized memory allocators and so on. The result is architecturally closest to Go (great minds think alike), only in the form of libraries for pure C. On this foundation we managed to build a database that can process up to 6M requests per second on one core of an ordinary laptop and has the best memory footprint on the market. At some point we decided to give application developers unlimited freedom by letting stored procedures not only run database queries but also use our entire toolkit. Fetching and parsing JSON over HTTP from a cloud service right inside the database? Easy. Running your REST service right inside the DBMS? Be our guest. Walking over data on a dozen servers, processing requests in parallel? No problem! All the features and tools are in the developers' hands. Developers, developers, developers!

How do I do X, Y, Z in Tarantool?




We are often asked whether Tarantool has queues, data expiration, pub/sub, multiget or something else from the hundreds of Redis commands. No, Tarantool has none of this, never had and never will. We abandoned that path and offer a slightly different approach. As you know, give a man a fish and he will be fed for a day; teach him to fish and he will be fed for life. Tarantool provides the tools (the "fishing rod") with which you can solve all sorts of problems, including the well-known, recurring patterns. Automatic deletion of old data is a background fiber in Lua. A multiget from several tables by some condition is two lines of Lua. Returning data as JSON means loading the web server module right into the database. Think outside the box!
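Two of the recipes mentioned above could be sketched roughly as follows. This is an illustration rather than code from the Tarantool distribution: the function names (`multiget`, `expired_keys`, `start_expiration`) and the tuple layout `{key, value, created_at}` are assumptions made for the example. The `fiber` module and the `space:get`, `space:select` and `space:delete` calls are part of Tarantool.

```lua
-- Sketch only: assumes tuples shaped like {key, value, created_at}.

-- Multiget: fetch several tuples by primary key in one call.
local function multiget(space, keys)
    local result = {}
    for _, key in ipairs(keys) do
        -- space:get() returns nil for a missing key, so nothing is appended
        result[#result + 1] = space:get(key)
    end
    return result
end

-- Pure helper: primary keys of tuples older than `ttl` seconds.
local function expired_keys(tuples, now, ttl)
    local keys = {}
    for _, tuple in ipairs(tuples) do
        if now - tuple[3] > ttl then
            keys[#keys + 1] = tuple[1]
        end
    end
    return keys
end

-- Background fiber that removes expired tuples every `period` seconds.
-- Keys are collected first and deleted afterwards, so the space is not
-- modified while it is being read.
local function start_expiration(space, ttl, period)
    local fiber = require('fiber') -- built-in Tarantool module
    return fiber.create(function()
        while true do
            local keys = expired_keys(space:select(), fiber.time(), ttl)
            for _, key in ipairs(keys) do
                space:delete(key)
            end
            fiber.sleep(period)
        end
    end)
end
```

Inside Tarantool this might be started with something like `start_expiration(box.space.test, 3600, 10)`. Selecting the whole space on every pass keeps the sketch short; a large data set would more likely call for a secondary index on the timestamp field.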

Let's see what tools Tarantool provides:


The cherry on the cake is box: a super-fast multi-engine database with transaction support and multi-master replication that runs in the same address space as your applications.

Interesting, but what about our three-tier architecture?




"Are you really proposing, once again, to move all the business logic into the database, instead of having a separate application server (Node.js, PHP, Python, whatever) and a DBMS (Redis, MongoDB, etc.)?" you may reasonably object. No, we are not encroaching on the foundations of the universe. Let's just look at things a little more pragmatically. At a time when applications of cosmic complexity can be written right in the browser, the server side is left mostly with storing and processing data, while the browser requests and updates everything it needs dynamically via AJAX. So what does your application server (PHP/Node/Go/Python) do then? Does it sit idle waiting for a response from the database, only to hand it straight to nginx as JSON? And yet you have to open a transaction, fetch data from the database over the network, change a few fields in the application server, send the updates back to the database, close the transaction and return the result to nginx. How many extra network round trips and userspace <-> kernel <-> userspace switches does that cost? On top of that, the database has to maintain a consistent read view of the data for every open transaction, for example at the price of locks or other equally heavy mechanisms. And all this just to select five records from one table, update two records in another and return the result to our REST service?

For this kind of microservice, Tarantool lets you write stored procedures right next to the data, inside the database. With the tarantool-http and nginx_tarantool_upstream modules it is easy to build a REST service on top of Tarantool, simplifying the service architecture and removing the dedicated application server as an unnecessary link. Nobody is asking you to rewrite the whole application this way: you can move only the most heavily loaded parts of the project into microservices, where traditional solutions lack the performance headroom. For everything else you can use the same Tarantool as a general-purpose DBMS through connectors for different programming languages.

Fine, but Lua?!
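As a rough illustration of the "read from one table, update another, return the result" scenario running next to the data, a stored procedure might look like the sketch below. The function name `fetch_and_count`, the space names `items` and `counters` and their tuple layouts are assumptions for this example. `box.begin`, `box.commit`, `box.rollback`, `space:get` and `space:update` are regular Tarantool calls.

```lua
-- Hypothetical stored procedure. Assumed spaces and tuple layouts:
--   items:    {id, payload}
--   counters: {id, hits}
-- Declared global so that it can be called by name from outside.
function fetch_and_count(ids)
    local result = {}
    box.begin()
    local ok, err = pcall(function()
        for _, id in ipairs(ids) do
            local item = box.space.items:get(id)
            if item ~= nil then
                result[#result + 1] = item
                -- increment field 2 (hits) of the matching counter tuple
                box.space.counters:update(id, {{'+', 2, 1}})
            end
        end
    end)
    if not ok then
        -- undo partial changes and re-raise the original error
        box.rollback()
        error(err, 0)
    end
    box.commit()
    return result
end
```

The reads and the updates happen in one transaction and in one process, so no data crosses the network until the final result is returned to the caller.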




"But we don't know how to write Lua! And where do we find such programmers?" you ask. Don't panic! Lua (Portuguese for "moon") is a dead-simple language that does not require you to study ten volumes of collected works. Look through some examples instead of reading ads in the subway, and you can start hacking on code. At Mail.Ru Group, C/C++ programmers and Python, Perl, Ruby and JS developers all write Lua procedures equally well. But why Lua? Tarantool gives the developer a real Turing-complete high-level programming language that can solve any problem. Say a firm "no" to programming in PL/SQL, XML, YAML and ini files (God forgive me). In addition, Lua is extremely simple and very fast. With Lua there are no arguments about which 10% of the language's features are allowed in the project.

By the way, Tarantool also has an API that delivers unprecedented performance and, in theory, lets you use any other language. We will cover it in more detail in the next installment, so stay tuned.

Let's try it!




Install Tarantool from the tarantool.org/download.html page. The repositories on the site have binary packages for the major Linux distributions, as well as a FreeBSD port and a brew formula for OS X. After installation, type the tarantool command in a terminal; by default it starts an interactive console (like Python, Node, irb and others):

    roman@book:~$ tarantool
    tarantool: version 1.6.8-123-gbe2ce21
    type 'help' for interactive help
    tarantool>

In the interpreter you can enter arbitrary Lua code, and the result is printed to the console in a readable format (YAML):

    tarantool> 2 + 2
    ---
    - 4
    ...

    tarantool> { name = "Roman", gender = "male" }
    ---
    - name: Roman
      gender: male
    ...

    tarantool> print('Hello')
    Hello
    ---
    ...

The same can be written as a script in a separate file:

    #!/usr/bin/env tarantool
    print('Hello world!')

Run the script the same way you would run a Bash, Python or Ruby script:

    roman@desktop:~$ edit ololo.lua
    roman@desktop:~$ chmod a+x ololo.lua
    roman@desktop:~$ ./ololo.lua
    Hello world!

Tarantool is fully compatible with Lua 5.1 and LuaJIT at the script level and can be used as a drop-in replacement. All modules from Lua work in Tarantool.

The box.cfg{} function configures and starts the embedded database (box), after which you can create tables (spaces) and run queries:

    tarantool> box.cfg {}
    [cut]
    tarantool> space = box.schema.space.create('test')
    [cut]
    tarantool> box.space.test:create_index('primary', { type = 'tree', parts = { 1, 'num' }})
    [cut]
    tarantool> box.space.test:insert({48, 'some data', { key = 'value', key2 = 'value2' }})
    ---
    - [48, 'some data', {'key': 'value', 'key2': 'value2'}]
    ...
    tarantool> box.space.test:select()
    ---
    - - [48, 'some data', {'key': 'value', 'key2': 'value2'}]
    ...

If you now exit the interactive console (via CTRL+D or os.exit(0)), you will see new *.snap and *.xlog files in the current directory. These files provide persistence for our in-memory database. On the next start, Tarantool restores all the data:

    tarantool> box.cfg{}
    ---
    ...
    tarantool> box.space.test:select()
    ---
    - - [48, 'some data', {'key': 'value', 'key2': 'value2'}]
    ...

And now let's try something more complicated (example from the main page):

    #!/usr/bin/env tarantool

    box.cfg{}

    -- Create the schema on first start
    box.once('schema', function()
        box.schema.create_space('hosts')
        box.space.hosts:create_index('primary',
            { type = 'hash', parts = {1, 'str'} })
    end)

    -- Handler for GET requests to /
    local function handler(self)
        -- Get the client's IP address
        local ipaddr = self.peer.host
        -- Insert a new record or increment the counter of an existing one
        box.space.hosts:upsert({ ipaddr, 1 }, {{'+', 2, 1}})
        -- Return all records to the client as JSON
        return self:render{ json = box.space.hosts:select() }
    end

    local httpd = require('http.server')
    local server = httpd.new('127.0.0.1', 8080)
    server:route({ path = '/' }, handler)
    server:start()

To run it you need the tarantool-http module, which can be installed from packages or from GitHub. On first start the script creates the hosts table and then starts an HTTP server which, on each request to `/`, increments the counter for the client's IP address and returns all the addresses to the client as JSON. It is also easy to put nginx in front of the service. By the way, try.tarantool.org is written in Tarantool itself. We eat our own dog food and try to make developers' lives better.

Rolling out to production




What is the best way to deploy our simple application on a production server? After all, playing around in the console is one thing, and sending it all into battle is quite another. It's simple. Copy the script to /etc/tarantool/instances.enabled/myapp.lua and start it with the ready-made init utilities (tarantoolctl start myapp or even service tarantool restart). It works! Simple, isn't it?

You can have as many applications as you like; the init system will start the required number of Tarantool daemons and monitor them. We recommend running slightly fewer Tarantool instances than you have physical cores. This approach gives the best performance per server and saves millions of dollars. In the process list you can easily find the daemon by the name of your application. By default, logs are written to /var/log/tarantool/myapp.log, data is stored in /var/lib/tarantool/myapp/, and the pid file is written to /run/tarantool. In other words, everything is exactly where your favorite distribution expects it. To build RPM and DEB packages you can use our template.

One especially useful command is tarantoolctl enter myapp, which attaches a console to a running daemon so you can inspect its state and change code on the fly. Also, box.cfg({listen = 3313}) opens a network port for connectors from other programming languages and frameworks (we promised not to overturn the whole world order!).

What's next?
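To make the deployment steps more concrete, an instance file could look something like the sketch below. The contents are hypothetical: the `box.once` key `myapp_access` is an arbitrary name, and granting full access to the `guest` user is an assumption made only to keep the example short.

```lua
-- /etc/tarantool/instances.enabled/myapp.lua (hypothetical contents)

box.cfg{
    listen = 3313, -- port for connectors from other languages
}

-- One-time setup; 'myapp_access' is an arbitrary key chosen for this sketch.
box.once('myapp_access', function()
    -- Assumption: unauthenticated clients get full access.
    -- A real deployment would create a dedicated user with narrower rights.
    box.schema.user.grant('guest', 'read,write,execute', 'universe')
end)

-- Application code (spaces, stored procedures, HTTP routes) would follow here.
```

With such a file in place, tarantoolctl start myapp should bring the instance up, and tarantoolctl enter myapp attaches a console to it.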


In the next installments we will describe in more detail how to build a modular architecture for your application and how to organize testing and continuous integration. We will also reveal the secret know-how of preparing Tarantool to process up to 6M requests per second on a single physical core.

We are waiting for questions and comments.

P.S. On January 28 we will hold the second Tarantool Meetup at the ultra-modern Mail.Ru Group office near the Aeroport metro station. The meetup will feature new insights from both our team and external Tarantool users. Admission is free after registration, all goodies included. All you need to bring is a good mood and a desire to try Tarantool in your projects.

Source: https://habr.com/ru/post/272669/

