
When designing a service architecture, you choose the tools best suited to the tasks at hand. To get the most out of them, though, you also need a reliable and convenient driver. If you program in Python or, say, PHP, finding one is rarely a problem: the drivers were written long ago, have been tested for years, and work stably. With node.js it is a real problem: drivers creak, leak, and refuse to work stably.
In this article we describe the problems we ran into when choosing drivers and how we solved them.
LiveTex is a service used by more than 10 million visitors daily. As our application's in-memory store we use Redis, where more than two million keys are created per day while 34,000 commands are processed per second. We also use Beanstalkd as a task queue, processing more than 10,000 tasks every second. We want our service to be fast and stable, and we demand the same of our drivers.
Initially we planned to use existing clients, but when we tried to fit them to our service we ran into many problems:
- The problem of chunks.
Fragmentation of data across TCP packets is not taken into account. This affects beanstalk_client and fivebeans, as well as node-amqp (a client for RabbitMQ, which we originally planned to use instead of Beanstalkd) and node-thrift for HBase.
- Shifted responses.
Under certain loads, responses shifted relative to requests: request 2 received the response to request 1. In other words, the client returned wrong answers.
- Not designed for high loads.
With loads of over a million requests to Redis and more than 100,000 tasks for Beanstalkd with a payload of about 10 KB, many clients simply did not return a result and hung the entire system.
- No fallback.
Our service requires that tasks and requests queued for execution are not lost in transit, even if the network connection is unavailable for some time.
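The first two problems can be illustrated with a small sketch. It assumes the Redis wire protocol and handles only simple strings, errors, integers and bulk strings. The class name ReplyReader and its methods are hypothetical and are not taken from Node-Polina. Incoming chunks are appended to a buffer, a reply is parsed only once all of its bytes have arrived, and callbacks are kept in a FIFO queue so that each response goes to the request that caused it.

```javascript
// Hypothetical helper for illustration only, not Node-Polina's API.
// Buffers TCP chunks and delivers complete replies in request order.
class ReplyReader {
  constructor() {
    this.buffer = Buffer.alloc(0); // bytes received but not yet parsed
    this.pending = [];             // FIFO of callbacks, one per request sent
  }

  // Register the callback for a request just written to the socket.
  expect(callback) {
    this.pending.push(callback);
  }

  // Call from socket.on('data', ...). A chunk may contain part of a reply,
  // exactly one reply, or several replies.
  feed(chunk) {
    this.buffer = Buffer.concat([this.buffer, chunk]);
    let reply;
    while ((reply = this.parseOne()) !== undefined) {
      const callback = this.pending.shift(); // oldest request first
      if (callback) callback(reply.error, reply.value);
    }
  }

  // Parse one reply from the start of the buffer.
  // Returns undefined and consumes nothing if the reply is incomplete.
  parseOne() {
    const lineEnd = this.buffer.indexOf('\r\n');
    if (lineEnd === -1) return undefined;
    const type = String.fromCharCode(this.buffer[0]);
    const line = this.buffer.toString('utf8', 1, lineEnd);
    let consumed = lineEnd + 2;
    let result;
    if (type === '+') {
      result = { error: null, value: line };
    } else if (type === '-') {
      result = { error: new Error(line), value: null };
    } else if (type === ':') {
      result = { error: null, value: parseInt(line, 10) };
    } else if (type === '$') {
      const length = parseInt(line, 10);
      if (length === -1) {
        result = { error: null, value: null }; // missing key
      } else {
        const end = consumed + length + 2;     // payload plus trailing CRLF
        if (this.buffer.length < end) return undefined; // wait for more data
        const value = this.buffer.toString('utf8', consumed, consumed + length);
        result = { error: null, value };
        consumed = end;
      }
    } else {
      throw new Error('Unsupported reply type: ' + type);
    }
    this.buffer = this.buffer.subarray(consumed);
    return result;
  }
}
```

With this approach a reply split across two or three chunks is delivered once, in full, and a chunk that happens to contain two replies triggers two callbacks in the order the requests were sent.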
Having found no reliable ready-made client for Redis or Beanstalkd, we decided to write our own.
In general, any client is a tool for sending requests to a service and receiving results from it. It must establish a connection and transfer data between your application and the service. As a UML class diagram, this can be represented as follows:

The model is simple and clear, and our Node-Polina module follows it. So far we have implemented clients for Redis and Beanstalkd within this model.
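As a rough illustration of this split between a client and its connection, and of the requirement that queued requests survive a short disconnection, consider the sketch below. The class names Connection and Client, their methods, and the writeFn parameter (a stand-in for socket.write) are all assumptions made for the example. They do not reproduce Node-Polina's actual classes.

```javascript
// Hypothetical classes for illustration only, not Node-Polina's API.

// Connection: owns the transport and keeps payloads while the link is down.
class Connection {
  constructor(writeFn) {
    this.writeFn = writeFn; // assumption: stands in for socket.write
    this.connected = false;
    this.outbox = [];       // payloads waiting for the link to come back
  }

  send(payload) {
    if (this.connected) {
      this.writeFn(payload);
    } else {
      this.outbox.push(payload); // keep it instead of dropping it
    }
  }

  // Call when the socket reports 'connect': flush in original order.
  onConnect() {
    this.connected = true;
    while (this.connected && this.outbox.length > 0) {
      this.writeFn(this.outbox.shift());
    }
  }

  // Call when the socket reports 'close' or 'error'.
  onDisconnect() {
    this.connected = false;
  }
}

// Client: turns commands into wire-format payloads and hands them over.
class Client {
  constructor(connection) {
    this.connection = connection;
  }

  // Encode a command as a Redis protocol array of bulk strings.
  request(name, ...args) {
    const parts = [name, ...args].map(String);
    let payload = '*' + parts.length + '\r\n';
    for (const part of parts) {
      payload += '$' + Buffer.byteLength(part) + '\r\n' + part + '\r\n';
    }
    this.connection.send(payload);
  }
}
```

The sketch only covers requests that have not yet been written. A production driver would also have to decide what to do with requests that were already sent when the connection dropped.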
Node-Polina. Redis-Client.
About driver features.
At the moment, the driver implements the following Redis commands: set, get, mget, incrby, incr, decr, setex, expire, keys, del, sismember, sadd, srem, smembers. Because we did not tie ourselves to specific command implementations and instead grouped commands by the type of result they return (numbers, strings or arrays), adding a new command to the driver is very easy. If you need one, open an issue on GitHub marked as enhancement and we will add the functionality. In addition, Node-Polina supports connection pooling and sharding.
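The idea of grouping commands by result type can be sketched as a lookup table. The table below uses the reply types that Redis itself documents for these commands. The names REPLY_TYPE, CONVERTERS and convertReply are hypothetical and only show why adding a command reduces to adding one entry. Node-Polina's internal layout may differ.

```javascript
// Hypothetical sketch, not Node-Polina's code: commands grouped by reply type.
const REPLY_TYPE = new Map([
  // status or bulk string replies
  ['set', 'string'], ['setex', 'string'], ['get', 'string'],
  // integer replies
  ['incr', 'number'], ['incrby', 'number'], ['decr', 'number'],
  ['expire', 'number'], ['del', 'number'], ['sismember', 'number'],
  ['sadd', 'number'], ['srem', 'number'],
  // array replies
  ['mget', 'array'], ['keys', 'array'], ['smembers', 'array'],
]);

// One converter per reply type instead of one handler per command.
const CONVERTERS = {
  number: (raw) => Number(raw),
  string: (raw) => (raw === null ? null : String(raw)),
  array: (raw) => (Array.isArray(raw) ? raw : []),
};

// Convert a raw reply according to the type registered for the command.
function convertReply(command, raw) {
  const type = REPLY_TYPE.get(command);
  if (type === undefined) {
    throw new Error('Unknown command: ' + command);
  }
  return CONVERTERS[type](raw);
}

// Adding a command is a single entry, for example scard (integer reply):
// REPLY_TYPE.set('scard', 'number');
```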
Usage
Simple client
Connection pooling
Sharding
Node-Polina VS Redis VS nodejs-redis
Time comparison

The graph shows that under high load Node-Polina copes much better. These graphs apply to node version 0.10.15. We also tested the drivers on version 0.8.24, where even under high load Node-Polina keeps returning results after node_redis can no longer cope.

When processing 10 KB payloads at low loads, the difference between Node-Polina and the other drivers is almost imperceptible. At around 300,000 - 450,000 requests, Node-Polina takes more time, since the failover mechanism is implemented inside the driver. With loads of more than 230,000 requests, Node-Polina is again faster.
Memory comparison

In terms of memory consumption, Node-Polina is almost comparable to node_redis, but under heavy load it again needs more memory to implement failover.

When processing 10 KB payloads, Node-Polina is clearly better than Hiredis. It is also more stable than node_redis: its curve rises evenly, without significant fluctuations.
By writing our own driver, we solved the problems of TCP packet fragmentation, shifted responses, and crashing with an error during a short-term disconnection. We wrote Node-Polina to withstand high loads, to be easy to use, and to make adding functionality quick and simple. In addition, our driver provides connection pooling and sharding, which are often needed when working with Redis.
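For readers unfamiliar with sharding, the general idea is that each key is mapped deterministically to one of several Redis instances. The sketch below shows one simple way to do that, hashing the key and taking the result modulo the number of shards. The function name pickShard is hypothetical, and this is not necessarily the scheme Node-Polina uses.

```javascript
const crypto = require('crypto');

// Hypothetical sketch, not Node-Polina's code: choose a shard for a key.
// `shards` can be any array, for example connection objects or host names.
function pickShard(key, shards) {
  if (shards.length === 0) {
    throw new Error('No shards configured');
  }
  // Hash the key and use the first 4 bytes as an unsigned integer.
  const digest = crypto.createHash('md5').update(String(key)).digest();
  return shards[digest.readUInt32BE(0) % shards.length];
}
```

The same key always lands on the same shard as long as the shard list does not change. With plain modulo hashing, changing the number of shards remaps most keys.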
Sources can be found in our git repository:
https://github.com/LiveTex/Node-Polina
Install via npm using the command:
npm install livetex-polina