Node.JS - form the resulting document using other HTTP sources

Node.JS servers are often used as aggregator services: they receive dynamic data from other HTTP sources and build a combined response from it.

To process the data, it is convenient to hand the downloaded files to external programs (for example, the ImageMagick or ffmpeg utilities).

Let's look at this using the example of an HTTP server that acts as a backend for nginx and generates CSS sprites from a set of images.

Asynchronous reading and writing

Client Connection Pool

Each HTTP client object in Node.JS works with a single TCP connection and executes requests one at a time. If we want to work really fast (in parallel), we need a pool of clients: a compromise between opening a new connection for every request and pushing everything through one connection.

We will build the most primitive pool possible, assuming that all outgoing requests go to example.com:80.

     // Note: http.createClient is the legacy client API of early Node.JS
     // releases; it was later deprecated in favor of http.request.
     var http = require('http');

     var ClientPool = function () {
         this.poolSize = 0;
         this.freeClients = [];
     };

     // Create one more client and put it on the free list.
     ClientPool.prototype.needClient = function () {
         this.freeClients.push(this.newClient());
         this.poolSize++;
     };

     ClientPool.prototype.newClient = function () {
         return http.createClient(80, 'example.com');
     };

     // Take a free client (creating one if needed) and start a request on it.
     ClientPool.prototype.request = function (method, url, headers) {
         if (this.freeClients.length == 0) {
             this.needClient();
         }
         var client = this.freeClients.pop();
         return client.request(method, url, headers);
     };

     // Give a client back once its response has been read completely.
     ClientPool.prototype.returnToPool = function (client) {
         this.freeClients.push(client);
     };

     var clientPool = new ClientPool();

If you wish, you can change the architecture of the pool: allow connections to several hosts and put an upper limit on the pool size (spreading requests over the least loaded connections). I leave that as an exercise.
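One possible direction for that exercise is sketched below. It is not the author's pool: `BoundedPool`, `acquire`, `release` and the `createClient` factory are hypothetical names, and instead of spreading requests over the least loaded connections it simply makes callers wait when the per-host limit is reached.

```javascript
// Hypothetical bounded pool. The caller supplies createClient(host, port),
// so the sketch does not depend on any particular HTTP client API.
function BoundedPool(createClient, maxPerHost) {
  this.createClient = createClient;
  this.maxPerHost = maxPerHost;
  this.hosts = {}; // "host:port" -> { free: [], size: 0, waiting: [] }
}

BoundedPool.prototype.entry = function (host, port) {
  var key = host + ':' + port;
  if (!this.hosts[key]) {
    this.hosts[key] = { free: [], size: 0, waiting: [] };
  }
  return this.hosts[key];
};

// Calls back with a client as soon as one is available for host:port.
BoundedPool.prototype.acquire = function (host, port, callback) {
  var entry = this.entry(host, port);
  if (entry.free.length > 0) {
    callback(entry.free.pop());
  } else if (entry.size < this.maxPerHost) {
    entry.size++;
    callback(this.createClient(host, port));
  } else {
    entry.waiting.push(callback); // limit reached: wait for a release
  }
};

// Hands the client to the oldest waiter, or puts it back on the free list.
BoundedPool.prototype.release = function (host, port, client) {
  var entry = this.entry(host, port);
  if (entry.waiting.length > 0) {
    entry.waiting.shift()(client);
  } else {
    entry.free.push(client);
  }
};
```

With a limit of two, a third `acquire` for the same host is queued and runs only when one of the first two clients is released.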

Receiving and saving a file

We need an asynchronous function that performs an HTTP request and saves the response body to a file. Its peculiarity is that two flows of asynchronous operations run at once: reading the source HTTP stream and writing to the file. We may close the file and call the callback only after all write operations have completed, and they do not necessarily complete in order.

Here is an example implementation:

     var fs = require('fs');

     var getFile = function (url, path, callback) {
         fs.open(path, 'w', 0600, function (err, fd) {
             if (err) {
                 callback(err);
                 return;
             }
             var request = clientPool.request('GET', url, { 'Host': 'example.com' });
             request.on('response', function (sourceResponse) {
                 var statusCode = parseInt(sourceResponse.statusCode, 10);
                 if (statusCode < 200 || statusCode > 299) {
                     sourceResponse.on('end', function () {
                         clientPool.returnToPool(sourceResponse.client);
                     });
                     // Do not leak the descriptor or leave an empty file behind.
                     fs.close(fd, function () {
                         removeFile(path);
                         callback('Bad status code');
                     });
                     return;
                 }
                 var writeErr = null;
                 var writesPending = 0;
                 var sourceEnded = false;
                 // Finish only when the source has ended and no writes are pending.
                 var checkPendingCallback = function () {
                     if (!sourceEnded || writesPending > 0) {
                         return;
                     }
                     fs.close(fd, function (err) {
                         err = err ? err : writeErr;
                         if (err) {
                             removeFile(path);
                             callback(err);
                             return;
                         }
                         // No errors and everything has been written.
                         callback(null);
                     });
                 };
                 var position = 0;
                 sourceResponse.on('data', function (chunk) {
                     writesPending++;
                     // Explicit positions keep the file correct even if the
                     // writes complete out of order.
                     fs.write(fd, chunk, 0, chunk.length, position, function (err, written) {
                         writesPending--;
                         if (err) {
                             writeErr = err;
                         }
                         checkPendingCallback();
                     });
                     position += chunk.length;
                 });
                 sourceResponse.on('end', function () {
                     sourceEnded = true;
                     checkPendingCallback();
                     clientPool.returnToPool(sourceResponse.client);
                 });
             });
             request.end();
         });
     };
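The completion rule inside getFile (finish only when the source has ended and no writes are pending) can be isolated into a small helper. This is a minimal sketch under that reading of the code; `createWriteTracker` and its method names are hypothetical and do not appear in the original.

```javascript
// Calls done(firstError) exactly once, after sourceEnded() has been called
// and every started write has reported back.
function createWriteTracker(done) {
  var pending = 0;
  var ended = false;
  var firstError = null;
  var finished = false;

  function check() {
    if (finished || !ended || pending > 0) {
      return;
    }
    finished = true;
    done(firstError);
  }

  return {
    writeStarted: function () {
      pending++;
    },
    writeFinished: function (err) {
      pending--;
      if (err && !firstError) {
        firstError = err; // remember only the first failure
      }
      check();
    },
    sourceEnded: function () {
      ended = true;
      check();
    }
  };
}
```

In getFile, `writeStarted` would be called before each fs.write, `writeFinished` in its callback, and `sourceEnded` in the 'end' handler, with `done` closing the file.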

The mechanism of interaction between nginx and our server

To avoid generating a sprite on every request, we will keep the generated sprites on disk and remove the oldest ones, for example with cron. If the file already exists, nginx serves it through the try_files rule. Otherwise the request is passed to our backend, which creates the file and uses X-Accel-Redirect to ask nginx to serve it from an internal location that points to the same physical directory.

In this case, the nginx configuration will look roughly like this:

     upstream sprite_gen {
         server 127.0.0.1:14239;
     }

     location /out_folder/ {
         alias /var/sprite-gen/out_folder/;
         internal;
     }

     location / {
         alias /var/sprite-gen/out_folder/;
         try_files $uri @transcoder;
     }

     location @transcoder {
         proxy_pass http://sprite_gen;
     }

This example does not claim to be perfect, but it works well for serving large files, including partial (range) requests, with caching.

If the files are small and we want tighter control over regenerating sprites that contain missing pictures, it is better to cache on the nginx side with a rule like proxy_no_cache $http_pragma.
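On the backend side, the X-Accel-Redirect handoff could be sketched as a function that builds the response headers. This is an illustration, not the author's code: `buildSpriteHeaders` and `INTERNAL_PREFIX` are hypothetical names, and the prefix assumes the internal location /out_folder/ from the configuration above.

```javascript
var path = require('path');

// Assumption: nginx exposes the output directory as the internal
// location /out_folder/.
var INTERNAL_PREFIX = '/out_folder/';

// outPath is the sprite file on disk; cacheable says whether clients
// may keep the result for a long time.
function buildSpriteHeaders(outPath, cacheable) {
  var headers = {
    'Content-Type': 'image/png',
    // nginx replaces the backend response with the file at this internal URL.
    'X-Accel-Redirect': INTERNAL_PREFIX + path.basename(outPath)
  };
  if (cacheable) {
    headers['Cache-Control'] = 'max-age=315360000, public';
  } else {
    headers['Cache-Control'] = 'no-cache, no-store';
    headers['Pragma'] = 'no-cache';
  }
  return headers;
}
```

The backend then answers with `response.writeHead(200, headers)` and an empty body; nginx serves the file itself.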

Fetching several files

Here is a fragment of the HTTP server responsible for getting the set of files, forming the sprite and returning the result to nginx.

     var outPath = '';      // path of the sprite file to generate
     var outUrl = '';       // internal nginx URL that maps to outPath
     var imageUrls = [];    // source image URLs
     var images = [];       // temporary file paths, one per URL
     var waitCounter = images.length;
     var needCache = true;  // becomes false if any image had to be replaced

     var handlePart = function (url, pth) {
         getFile(url, pth, function (err) {
             waitCounter--;
             if (err) {
                 // Use the placeholder instead of the failed download.
                 removeFile(pth);
                 images[images.indexOf(pth)] = placeholder;
                 needCache = false;
             }
             if (waitCounter == 0) {
                 makeSprite(images, outPath, placeholder, function (err) {
                     if (err) {
                         response.writeHead(500, { 'Content-Type': 'text/plain' });
                         response.end('Trouble');
                         return;
                     }
                     var headers = {
                         'Content-Type': 'image/png',
                         'X-Accel-Redirect': outUrl
                     };
                     if (needCache) {
                         headers['Cache-Control'] = 'max-age=315360000, public';
                         headers['Expires'] = 'Thu, 31 Dec 2037 23:55:55 GMT';
                     } else {
                         headers['Cache-Control'] = 'no-cache, no-store';
                         headers['Pragma'] = 'no-cache';
                     }
                     response.writeHead(200, headers);
                     response.end();
                 });
             }
         });
     };

     for (var i = 0; i < imageUrls.length; i++) {
         handlePart(imageUrls[i], images[i]);
     }
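The waitCounter pattern in this fragment is a countdown over parallel operations. A minimal standalone sketch follows; `afterAll` is a hypothetical helper name. Unlike the fragment, it also fires when the list of tasks is empty, a case in which a plain counter would never reach its callback.

```javascript
// Returns a function to call once per finished task. After `count` calls,
// done(errors) runs with the errors collected along the way.
function afterAll(count, done) {
  var remaining = count;
  var errors = [];
  if (remaining === 0) {
    done(errors); // nothing to wait for
    return function () {};
  }
  return function (err) {
    if (err) {
      errors.push(err);
    }
    remaining--;
    if (remaining === 0) {
      done(errors);
    }
  };
}
```

Each getFile callback would call the returned function, and `done` would start makeSprite.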

Building the output file with an external process

Controlling external processes from Node.JS is easy. To simplify debugging, we will copy the output of the external process to our console. To build the sprite we use GraphicsMagick (a fork of ImageMagick with a stable API and good performance).

     var child_process = require('child_process');
     var fs = require('fs');
     var path = require('path');

     var spriteScript = '/usr/bin/gm';
     var placeholder = path.join(__dirname, 'placeholder.jpg');

     // Arguments for "gm montage": all images in one row, no decorations.
     var getParams = function (count) {
         return ('montage +frame +shadow +label -background #000000 -tile ' +
             count + 'x1 -geometry +0+0').split(' ');
     };

     var removeFile = function (path) {
         fs.unlink(path, function (err) {
             if (err) {
                 console.log('Cannot remove ' + path);
             }
         });
     };

     // Remove the downloaded sources, but never the shared placeholder.
     var cleanup = function (inPaths, placeholder) {
         for (var i = 0; i < inPaths.length; i++) {
             if (inPaths[i] == placeholder) {
                 continue;
             }
             removeFile(inPaths[i]);
         }
     };

     var makeSprite = function (inPaths, outPath, placeholder, callback) {
         var para = getParams(inPaths.length).concat(inPaths, outPath);
         console.log(['run', spriteScript, para.join(' ')].join(' '));
         var spriter = child_process.spawn(spriteScript, para);
         // Copy the output of the external process to our console.
         spriter.stderr.addListener('data', function (data) {
             console.log(data.toString());
         });
         spriter.stdout.addListener('data', function (data) {
             console.log(data.toString());
         });
         spriter.addListener('exit', function (code, signal) {
             if (signal != null) {
                 callback('Internal Server Error - Interrupted by signal ' + signal.toString());
                 return;
             }
             if (code != 0) {
                 callback('Internal Server Error - Code is ' + code.toString());
                 return;
             }
             cleanup(inPaths, placeholder);
             callback(null);
         });
     };
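Since child_process.spawn takes its arguments as an array and does not go through a shell, the argument list can also be built as an array directly instead of splitting a string on spaces. The sketch below uses the same montage options as getParams; `buildMontageArgs` is a hypothetical name.

```javascript
// Builds the argument list for "gm montage": one row of inPaths.length
// tiles, no frame, shadow or label, black background, no spacing.
function buildMontageArgs(inPaths, outPath) {
  return [
    'montage', '+frame', '+shadow', '+label',
    '-background', '#000000',
    '-tile', inPaths.length + 'x1',
    '-geometry', '+0+0'
  ].concat(inPaths, [outPath]);
}
```

Built this way, a file path that contains a space stays a single argument.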

Small nuances

Naming temporary files

To generate the file name, it is best to use process.pid and a request counter, for example path.join('/tmp', ['source', process.pid, requestCounter].join('-')). The request-processing function should receive the request counter as an argument, because processing of the next request can start before all steps of the current one have finished.
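A minimal sketch of such a naming helper is shown below. `tempFilePath` is a hypothetical name, and the extra `index` argument is an assumption added here so that several source images of one request get different names.

```javascript
var path = require('path');

// dir: directory for temporary files; prefix: kind of file;
// requestCounter: per-request number; index: image number within the request.
function tempFilePath(dir, prefix, requestCounter, index) {
  return path.join(dir, [prefix, process.pid, requestCounter, index].join('-'));
}
```

Because the counter is passed in as an argument, a request that is still being processed keeps its own value even after the global counter has moved on.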

Cleaning up temporary data from past processes

Let all of our temporary files be named source-pid-... or sprite-pid-...:

     var fileExpr = /^(?:source|sprite)\-(\d+)\b/;
     var storagePath = '/tmp/';

     // Remove temporary files left behind by processes with other pids.
     var cleanupOldFiles = function () {
         fs.readdir(storagePath, function (err, files) {
             if (err) {
                 console.log('Cannot read ' + storagePath + ' directory.');
                 return;
             }
             for (var i = 0; i < files.length; i++) {
                 var fn = files[i];
                 var m = fileExpr.exec(fn);
                 if (!m) {
                     continue;
                 }
                 var pid = parseInt(m[1], 10);
                 if (pid == process.pid) {
                     continue;
                 }
                 removeFile(path.join(storagePath, fn));
             }
         });
     };
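The decision of which files to delete can be pulled out into a pure function, which makes it easy to check by hand. In this sketch `isStaleTempFile` is a hypothetical name. Note that the rule treats every other pid as dead, so it assumes only one instance of the server uses the directory at a time.

```javascript
var fileExpr = /^(?:source|sprite)-(\d+)\b/;

// True if fileName looks like one of our temporary files but carries
// a pid different from currentPid.
function isStaleTempFile(fileName, currentPid) {
  var m = fileExpr.exec(fileName);
  if (!m) {
    return false; // not one of our files: leave it alone
  }
  return parseInt(m[1], 10) !== currentPid;
}
```

Inside cleanupOldFiles, the loop body would then reduce to a single check with `process.pid` as the second argument.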

Request skeleton

Suppose we want a sprite for a photo album, starting from a certain point in time (timespec).

     #!/usr/bin/env node
     var child_process = require('child_process');
     var http = require('http');
     var path = require('path');
     var fs = require('fs');

     var routeExpr = /^\/?(\w)\/([^\/]+)\/(\d+)\/(\d+)x(\d+)\.png$/;
     var fileCounter = 0;

     http.createServer(function (request, response) {
         if (request.method != 'GET') {
             response.writeHead(405, { 'Content-Type': 'text/plain' });
             response.end('Method Not Allowed');
             return;
         }
         var m = routeExpr.exec(request.url);
         if (!m) {
             response.writeHead(400, { 'Content-Type': 'text/plain' });
             response.end('Bad Request');
             return;
         }
         var mode = m[1];
         var chapter = m[2];
         var timespec = parseInt(m[3], 10);
         var width = parseInt(m[4], 10);
         var height = parseInt(m[5], 10);
         fileCounter++;

         // addWantedMoments is application-specific and not shown here.
         var moments = [timespec];
         addWantedMoments(moments, mode);

         var runner = function (moments, fileCounter, width, height) {
             var waitCounter = moments.length;
             var outPath = path.join(storagePath,
                 ['sprite', process.pid, fileCounter].join('-') + '.png');
             var needCache = true;
             // handlePart (see the fragment above) belongs here, so that it
             // closes over waitCounter, outPath and needCache.
             for (var i = 0; i < moments.length; i++) {
                 handlePart(i, placeholder);
             }
         };

         request.connection.setTimeout(0);
         runner([].concat(moments), fileCounter, width, height);
     // Same port as the sprite_gen upstream in the nginx configuration.
     }).listen(14239, '127.0.0.1');

     console.log('Server running at 127.0.0.1:14239');
     cleanupOldFiles();
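The routing step of the skeleton can be checked on its own. The sketch below wraps the same routeExpr in a function; `parseRoute` is a hypothetical name, and the URL in the usage note is made up for illustration.

```javascript
var routeExpr = /^\/?(\w)\/([^\/]+)\/(\d+)\/(\d+)x(\d+)\.png$/;

// Returns the parsed request parameters, or null if the URL does not match.
function parseRoute(url) {
  var m = routeExpr.exec(url);
  if (!m) {
    return null;
  }
  return {
    mode: m[1],      // a single word character
    chapter: m[2],
    timespec: parseInt(m[3], 10),
    width: parseInt(m[4], 10),
    height: parseInt(m[5], 10)
  };
}
```

For example, `parseRoute('/a/holiday/1280000000/100x75.png')` yields mode 'a', chapter 'holiday', timespec 1280000000, width 100 and height 75, while `parseRoute('/favicon.ico')` yields null and the server answers 400.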

We now have a working application that generates a sprite as the aggregated result of a set of requests to other sites.

What remains is to add the specifics (algorithms for obtaining links to the source images, generating placeholders if the sizes keep changing), and it is ready to use.

In fact, one of my own mini-applications works as exactly this kind of dynamic sprite generator.


Source: https://habr.com/ru/post/102722/
