Does this architecture look familiar? A round dance of daemons between the web server, the cache and the storage.

What are the drawbacks of this architecture? Working within it, we face a pile of questions: which language (or languages) to use, which I/O framework to choose, how to keep the cache and the storage in sync. These are all infrastructure issues. Why solve infrastructure issues when you need to solve the actual problem? Of course, we could say that we simply like technologies X and Y and recast these drawbacks as a matter of ideology. But it is impossible to deny that the data sits at some distance from the code (see the picture above), which adds latency, which in turn can reduce RPS.
The purpose of this article is to describe an alternative built on Nginx as the web server and balancer, and Tarantool as the app server, cache and storage.
Improving cache and storage

Tarantool has some interesting properties. It is not only an efficient in-memory database but also a full-fledged application server. Applications are written in Lua (LuaJIT), C or C++, so you can write logic of any complexity; the only limit is your imagination. If there is more data than available memory, part of it can be stored on disk using the Sophia engine. If Sophia does not fit, you can use something else: move the “cold” data, i.e. data that is not needed right now, from Tarantool to another storage, and keep the “hot” part in Tarantool, i.e. in memory. What advantages does this give us?
- No intermediaries. At least the hot part of the data lives at the same level as the code.
- Hot data is in memory.
- The code is fairly simple and easy to update, if we are talking about Lua.
- Transactions, replication, sharding and many other Tarantool features.
Improving web-server

The end consumer of the data is the user. Usually, the user receives data from the application server through Nginx acting as a balancer / proxy. Writing a daemon that can talk to both Tarantool and HTTP is not an option, since it would take us back to the first picture, right where we started. So let's look at the situation from another angle and ask a different question: “How do we get rid of the intermediaries between the data and the user?” The answer to this question was the Tarantool Nginx Upstream Module.
Nginx upstream
Nginx Upstream is a persistent (see Upstream Keepalive) connection over a pipe / socket to a backend; below we will call this “proxying”. Nginx provides a wide range of functionality for writing upstream rules. For proxying HTTP to Tarantool, the following features are especially important:
- the ability to specify multiple backends, across which Nginx balances the load;
- the ability to specify a backup, i.e. where to go if an upstream is not working.
These features allow you to:
- distribute the load across N Tarantool instances; for example, together with sharding you can build a cluster with a uniform load on the nodes;
- build a fault-tolerant system using replication;
- combine the two points above to get a failover cluster.
Sample config for Nginx, partially illustrating the settings:
More details about configuring Nginx Upstream can be found here:
http://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream
Nginx Tarantool Upstream Module (v0.1.4 Stable)

The main functionality:
- the module is enabled in nginx.conf with the directive tnt_pass UPSTREAM_NAME;
- fast streaming conversion between HTTP + JSON and the Tarantool protocol; the Nginx worker is blocked only for the duration of parsing;
- non-blocking Nginx I/O in both directions;
- as a nice bonus: all the features of Nginx and Nginx Upstream;
- the module lets you call Tarantool stored procedures via a JSON-based protocol;
- data is delivered via HTTP(S) POST, which is convenient for modern web apps and beyond.
Input data
[ { "method": STR, "params":[arg0 ... argN], "id": UINT }, ...N ]
"method": the name of the stored procedure. It must match the procedure name in Tarantool. For example, to call the Lua function do_something(a, b), you need:
"method": "do_something"
"params": the arguments of the stored procedure. For example, to pass arguments to the Lua function do_something(a, b), you need:
"params": [ "1", 2 ]
"id": a numeric identifier set by the client.
Output
[ { "result": JSON_RESULT_OBJECT, "id":UINT, "error": { "message": STR, "code": INT } }, ...N ]
"result": the data returned by the stored procedure. For example, if the Lua function do_something(a, b) ends with
return {1, 2}
then
"result": [[1, 2]]
"id": the numeric identifier set by the client.
"error": if an error occurs, this field describes the cause.
More details about the protocol here:
https://github.com/tarantool/nginx_upstream_module/blob/master/README.md
Hello world
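Before setting anything up, the request and response shapes described above can be tried offline. The following Python sketch builds a batch of calls in the input format and splits a reply into results and errors keyed by id. The procedure name do_something comes from the examples above; the helper names make_batch and split_response, the sample reply, and the error message and code in it are made up for illustration and are not part of the module.

```python
import json


def make_batch(calls):
    """Serialize (method, params, id) tuples as the array form [ {...}, ...N ]."""
    return json.dumps(
        [{"method": m, "params": list(p), "id": i} for (m, p, i) in calls]
    )


def split_response(raw):
    """Return (results, errors), both keyed by the client-chosen id."""
    replies = json.loads(raw)
    if isinstance(replies, dict):  # a single reply object rather than an array
        replies = [replies]
    results, errors = {}, {}
    for reply in replies:
        if reply.get("error"):
            errors[reply.get("id")] = reply["error"]
        else:
            results[reply.get("id")] = reply.get("result")
    return results, errors


# Two calls to the do_something(a, b) procedure used in the text.
body = make_batch([("do_something", ["1", 2], 1), ("do_something", ["3", 4], 2)])

# Hand-written sample reply; the error message and code are hypothetical.
sample_reply = json.dumps([
    {"id": 1, "result": [[1, 2]]},
    {"id": 2, "error": {"message": "hypothetical failure", "code": -1}},
])
results, errors = split_response(sample_reply)
print(body)
print(results, errors)
```

Keying replies by id is what lets a client match each answer in a batch to the call that produced it.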
Run Nginx
We will build Nginx from source:
$ git clone https://github.com/tarantool/nginx_upstream_module.git
$ cd nginx_upstream_module
$ git submodule update --init --recursive
$ git clone https://github.com/nginx/nginx.git
$ cd nginx && git checkout release-1.9.7 && cd -
$ make build-all-debug
The build-all-debug target produces a debug build. We use it so that there is less of Nginx to configure. For those who want to configure everything from scratch, there is the build-all target.
File
test-root/conf/nginx.conf
http {
$ ./nginx/obj/nginx
Run Tarantool
Tarantool can be installed from packages or built from source.
hello-world.lua
If you installed Tarantool from packages, you can start it like this:
$ tarantool hello-world.lua
Call the stored procedure
You can call the echo stored procedure with any HTTP client: send an HTTP POST to 127.0.0.1:8081/echo with the following JSON in the body (see Input data):
{ "method":"echo",
I will call this procedure with wget:
$ wget 127.0.0.1:8081/echo --post-data '{"method":"echo","params":[{"Hello world": "!"}],"id":1}'
$ cat echo
{"id":1,"result":[[{"hello world":"!"}]]}
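The same call can be made from a script. Below is a minimal Python sketch using only the standard library, assuming Nginx listens on 127.0.0.1:8081 as in the wget command above. The function names build_echo_request and call_echo are hypothetical helpers, not part of the module. Only the request is built at import time; call_echo() needs the running Nginx and Tarantool from the previous steps.

```python
import json
import urllib.request

# Address taken from the wget example; adjust it to your own nginx.conf.
ECHO_URL = "http://127.0.0.1:8081/echo"


def build_echo_request(url=ECHO_URL):
    """Build an HTTP POST carrying the same JSON body as the wget example."""
    payload = {"method": "echo", "params": [{"Hello world": "!"}], "id": 1}
    # Passing data= makes urllib send a POST request.
    return urllib.request.Request(url, data=json.dumps(payload).encode("utf-8"))


def call_echo(url=ECHO_URL, timeout=5):
    """Send the request and decode the JSON reply (requires a running server)."""
    with urllib.request.urlopen(build_echo_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_echo_request()
print(req.get_method(), req.full_url)
```

With the server up, call_echo() should return the decoded reply, from which the "result" and "id" fields can be read as described in the Output section.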
Some more examples:
https://github.com/tarantool/nginx_upstream_module/blob/master/examples/echo.html
https://github.com/tarantool/nginx_upstream_module/blob/master/test/client.py
Let's sum up
Advantages of using the Nginx Tarantool Upstream Module:
- no intermediaries: code and data are, as a rule, at the same level;
- relatively simple configuration;
- load balancing across N Tarantool instances;
- high speed, low latency;
- a JSON-based protocol instead of a binary one: no need to look for a Tarantool driver, JSON is everywhere;
- Tarantool sharding / replication plus Nginx make a cluster solution, but this is a topic for a separate article;
- the solution is used in production.
Disadvantages:
- the overhead of JSON compared with the more compact and faster MsgPack;
- the solution does not work out of the box: you need to configure it and think about how to deploy it.
Plans:
- OpenResty and nginScript support;
- WebSocket and HTTP 2.0 support.
The benchmark results, which are very interesting, will be covered in another article. Tarantool, like the Upstream Module, is always open to new users. If you want to try it out, use it or share a new idea, reach out on GitHub or in the Google group.
Links
Tarantool website - http://tarantool.org
Git Tarantool - https://github.com/tarantool/tarantool
Git Tarantool Nginx Upstream Module - https://github.com/tarantool/nginx_upstream_module
Google group - https://groups.google.com/forum/#!forum/tarantool
PS In the next article I will show you what tasks can be solved using Tarantool.