
Sync vs Async, Using Firebird as an Example


In this publication I set myself several goals:


There is a lot of code ahead...


The test task looks very simple: return the result of the query

select * from rdb$relations

in JSON format. On my test database this produces a response of about 19 KB.

Let's write several implementations of this task and use Apache Benchmark to find the best one. Performance will be measured with the following command:
  ab -n 10000 -c 5 http://localhost:1337/


I will not include the full benchmark output, so as not to overload the article. In most cases the average number of requests per second (RPS) will be enough to draw conclusions.

So let's start with the simplest option, in which all calls to the database access library are synchronous:

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var http = require('http');

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    var con = fb.createConnection();
    con.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);
    var rs = con.querySync('select * from rdb$relations');
    var rows = rs.fetchSync("all", true);
    con.disconnect();
    res.write('[');
    rows.forEach(function(r){
        res.write(JSON.stringify(r) + ',');
    });
    res.end(']');
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 52.47 [#/sec] (mean)

I kept the number of bytes transferred to verify that all variants of the script generate the same result.
The Firebird library allows all of these calls to be replaced with asynchronous ones. Here is what that looks like:

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var http = require('http');

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    var con = fb.createConnection();
    con.connect(cfg.db, cfg.user, cfg.password, cfg.role, function(){
        con.query('select * from rdb$relations', function(err, rs){
            res.write('[');
            rs.fetch("all", true,
                function(r){
                    res.write(JSON.stringify(r) + ',');
                },
                function(err){
                    con.disconnect();
                    res.end(']');
                });
        });
    });
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 129.26 [#/sec] (mean)

Simply replacing synchronous calls with asynchronous ones boosted performance to 129 requests per second versus 52. The asynchronous approach lets queries execute in parallel, with a separate connection established for each client. In addition, the synchronous version first accumulates the entire query result in memory and only then sends it to the client, while the asynchronous version streams rows to the client as they are fetched from the database.

However, this code still has room for optimization. For example, every incoming request creates a new database connection. As the opposite extreme, let's try to get by with a single connection. First the synchronous version:

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var http = require('http');

var con = fb.createConnection();
con.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    if(!con.inTransaction) con.startTransactionSync();
    var rs = con.querySync('select * from rdb$relations');
    var rows = rs.fetchSync("all", true);
    res.write('[');
    rows.forEach(function(r){
        res.write(JSON.stringify(r) + ',');
    });
    res.end(']');
    con.commitSync();
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 159.55 [#/sec] (mean)

As you can see, this option is faster than the previous asynchronous one. Establishing a database connection is a fairly expensive operation, so connection reuse is a well-known, long-used technique that is worth adopting.

Now let's try to add asynchrony. The following code is obviously not optimal: a single connection combined with asynchronous calls will not let us execute queries in parallel. Still, for the purity of the experiment...

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var http = require('http');

var con = fb.createConnection();
con.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);
var busy = false, next = [];

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});

    function doReq(){
        con.query('select * from rdb$relations', function(err, rs){
            res.write('[');
            rs.fetch("all", true,
                function(r){
                    res.write(JSON.stringify(r) + ',');
                },
                function(err){
                    res.end(']');
                    con.commit(function(){
                        busy = false;
                        var n = next.pop();
                        if(n) n();
                    });
                });
        });
    }

    function process(){
        busy = true;
        if(!con.inTransaction) con.startTransaction(doReq);
        else doReq();
    }

    if(busy) next.push(function(){ process(); });
    else process();
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 129.85 [#/sec] (mean)

The asynchronous version looks rather cumbersome and yields only 129 requests per second. The problem is that a single connection gives no benefit for asynchronous fetching, because the Firebird client library itself does not support asynchronous calls by design. The Firebird module for NodeJS runs all asynchronous calls on worker threads from the built-in libuv pool, and a single connection cannot be used to execute queries in parallel. That is why the code has to check whether the connection is busy and maintain a queue of pending calls. Apparently this queueing was the braking factor here: the synchronous version was faster. But a dedicated database connection per client is not always optimal either. The logical solution is a pool of connections.

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var events = require('events');
var http = require('http');

function ConnectionPool() {
    events.EventEmitter.call(this);
    this.conns = [];
    this.busy = [];
    this.MaxConns = 5;

    this.newConn = function(){
        var c = fb.createConnection();
        c.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);
        this.conns.push(c);
    };

    this.get = function(cb) {
        var self = this;
        var c = this.conns.pop();
        if(c) {
            this.busy.push(c);
            cb(c);
        } else if((this.busy.length) >= this.MaxConns){
            this.once('release', function(){ self.get(cb); });
        } else {
            this.newConn();
            this.get(cb);
        }
    };

    this.release = function(con){
        for(var i = 0; i < this.busy.length; i++) {
            if(this.busy[i] == con){
                this.conns.push(this.busy[i]);
                this.busy.splice(i, 1);
                var self = this;
                process.nextTick(function(){ self.emit('release'); });
                return;
            }
        }
    };
}
util.inherits(ConnectionPool, events.EventEmitter);

var pool = new ConnectionPool();
pool.setMaxListeners(2000);

function exec(con, res){
    con.query('select * from rdb$relations', function(err, rs){
        res.write('[');
        rs.fetch("all", true,
            function(r){
                res.write(JSON.stringify(r) + ',');
            },
            function(err){
                res.end(']');
                con.commit(function(){
                    pool.release(con);
                });
            });
    });
}

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    pool.get(function(con){
        if(!con.inTransaction) con.startTransaction(function(err){ if(!err) exec(con, res); });
        else exec(con, res);
    });
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 187.82 [#/sec] (mean)

That is a lot of code, but the performance gain is obvious. Moreover, the connection pooling code can be moved into a separate module for reuse, leaving the database access code simple and compact.
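For example, the ConnectionPool above could be extracted into its own module. The following is only a sketch of the wiring (the file name pool.js and the constructor parameters are hypothetical, not from the original article); the body mirrors the pooled listing above:

// pool.js -- the ConnectionPool from the listing above, parameterised by cfg
var util = require('util');
var events = require('events');
var fb = require('../../firebird');

function ConnectionPool(cfg, maxConns) {
    events.EventEmitter.call(this);
    this.conns = [];        // free connections
    this.busy = [];         // connections currently handed out
    this.MaxConns = maxConns || 5;

    this.newConn = function(){
        var c = fb.createConnection();
        c.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);
        this.conns.push(c);
    };

    this.get = function(cb){
        var self = this;
        var c = this.conns.pop();
        if(c) { this.busy.push(c); cb(c); }
        else if(this.busy.length >= this.MaxConns) this.once('release', function(){ self.get(cb); });
        else { this.newConn(); this.get(cb); }
    };

    this.release = function(con){
        var i = this.busy.indexOf(con);
        if(i < 0) return;
        this.conns.push(this.busy[i]);
        this.busy.splice(i, 1);
        var self = this;
        process.nextTick(function(){ self.emit('release'); });
    };
}
util.inherits(ConnectionPool, events.EventEmitter);

module.exports = ConnectionPool;

// server.js -- only the request handling remains here
var cfg = require("../config").cfg;
var ConnectionPool = require('./pool');
var pool = new ConnectionPool(cfg, 5);
pool.setMaxListeners(2000);
// pool.get() and pool.release() are then used exactly as in the listing above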

But you can go further. When many clients run the same query, Firebird allows the statement to be prepared once and then executed repeatedly, without re-allocating resources and re-parsing the query text. So a pool of prepared statements looks even more promising.
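Before folding this into the pool, here is the prepare-once / execute-many idea in isolation. This is a sketch, not a listing from the original article; it is assembled from the same module calls that the pooled version below uses (prepareSync, exec, the 'result' event, fetch, commit):

// A stripped-down sketch of prepare-once / execute-many.
var cfg = require("../config").cfg;
var fb = require('../../firebird');

var conn = fb.createConnection();
conn.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);

// Parsing and resource allocation happen once, here...
var stmt = conn.prepareSync('select * from rdb$relations');

// ...and every later execution reuses the prepared statement.
function runOnce(onRow, done) {
    function exec() {
        stmt.exec();
        stmt.once('result', function(err){
            if (err) return done(err);
            stmt.fetch("all", true, onRow, function(err){
                conn.commit(done);
            });
        });
    }
    if (!conn.inTransaction) conn.startTransaction(function(err){ if(!err) exec(); });
    else exec();
}

Combining the prepared statement with the connection pool from the previous section gives a pool of prepared statements: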

var cfg = require("../config").cfg;
var fb = require('../../firebird');
var util = require('util');
var events = require('events');
var http = require('http');

function StatementPool() {
    events.EventEmitter.call(this);
    this.conns = [];
    this.busy = [];
    this.MaxConns = 5;

    this.newConn = function(){
        var c = { conn: fb.createConnection() };
        c.conn.connectSync(cfg.db, cfg.user, cfg.password, cfg.role);
        c.stmt = c.conn.prepareSync('select * from rdb$relations');
        this.conns.push(c);
    };

    this.get = function(cb) {
        var self = this;
        var c = this.conns.pop();
        if(c) {
            this.busy.push(c);
            cb(c);
        } else if((this.busy.length) >= this.MaxConns){
            this.once('release', function(){ self.get(cb); });
        } else {
            this.newConn();
            this.get(cb);
        }
    };

    this.release = function(con){
        for(var i = 0; i < this.busy.length; i++) {
            if(this.busy[i] == con){
                this.conns.push(this.busy[i]);
                this.busy.splice(i, 1);
                var self = this;
                process.nextTick(function(){ self.emit('release'); });
                return;
            }
        }
    };
}
util.inherits(StatementPool, events.EventEmitter);

var pool = new StatementPool();
pool.setMaxListeners(2000);

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    pool.get(function(con){
        var exec = function(){
            con.stmt.exec();
            con.stmt.once('result', function(err){
                res.write('[');
                con.stmt.fetch("all", true,
                    function(r){
                        res.write(JSON.stringify(r) + ',');
                    },
                    function(err){
                        res.end(']');
                        con.conn.commit(function(){
                            pool.release(con);
                        });
                    });
            });
        };
        if(!con.conn.inTransaction) con.conn.startTransaction(function(err){ if(!err) exec(); });
        else exec();
    });
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');


Total transferred: 199310000 bytes
HTML transferred: 198670000 bytes
Requests per second: 214.37 [#/sec] (mean)

Despite the larger amount of code, this option delivers more than 210 requests per second.
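To recap the measurements above:

Variant                                       RPS (mean)
Synchronous, connection per request              52.47
Asynchronous, connection per request            129.26
Synchronous, single shared connection           159.55
Asynchronous, single shared connection          129.85
Asynchronous, connection pool                   187.82
Asynchronous, pool of prepared statements       214.37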

Using multiple connections at once, prepared statements and asynchronous calls gives a gain of at least 4x compared with the fully synchronous option. However, this figure holds only for my test environment. It should also be understood that asynchrony in NodeJS by itself does not increase the performance of the database server; it only helps to load it fully. In this test the database server was given 256 MB of memory and one processor core, which is how I explain the overall modest RPS figures; memory load nevertheless did not exceed 3%. The asynchronous approach also allows other requests, unrelated to the database, to be processed in parallel, although there were no such requests in these tests. The factor with the biggest influence on RPS was the size of the result set: after limiting the query to 10 rows (about 1 KB of resulting JSON) and giving the database server 4 processor cores, I got about 1000 RPS, while the synchronous version in the same setup produced about 200 RPS. It is also worth noting that the asynchronous version, while handling 1000 database queries per second, can simultaneously render HTML from the results with templates, which the synchronous version cannot.
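For reference, the 10-row variant mentioned above only requires changing the query text; Firebird expresses the row limit with FIRST. A minimal sketch of the changed line, assuming the rest of the prepared-statement pool stays as it is:

// Only the query text changes against the listings above:
c.stmt = c.conn.prepareSync('select first 10 * from rdb$relations');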
All test scripts are part of the Firebird access module for NodeJS and are available on GitHub.

Source: https://habr.com/ru/post/139708/

