How to log in NodeJS to the boys in the yard respected

What infuriates you most when you are trying to organize readable logs in your NodeJS application? Personally, I am extremely strained by the absence of any sane mature standards for creating a trace ID. In this article we will talk about what options there are for creating a trace ID, let's figure out how continuation-local storage or CLS works on our fingers and call the proxy force to help all this with absolutely any logger.

Why does NodeJS even have a problem with creating a trace ID for each request?

In the old-long-old times, when the mammoths still walked the earth, all-all-all servers were multithreaded and created in a new stream to the request. In the framework of this paradigm, the creation of a trace ID is trivial, since There is such a thing as thread-local storage or TLS , which allows you to store in memory some data that is available to any function in this stream. It is possible at the beginning of the processing of the request to accumulate a random ID, put it in TLS and then read it in any service and do something with it. The trouble is that in NodeJS it will not work.

NodeJS is single-threaded (not quite, considering the appearance of workers, but within the framework of the problem with trace ID, no one plays any role), so you can forget about TLS. Here the paradigm is different - to juggle with a bunch of different callbacks within the same thread, and as soon as the function wants to do something asynchronous, send this asynchronous request, and give the processor time to another function in the queue (if you are wondering how this thing works, proudly called Event Loop under the hood, I recommend this series of articles to read). If you think about how NodeJS understands which callback when to call, we can assume that each of them must correspond to some kind of ID. Moreover, NodeJS even has an API that provides access to these IDs. We will use them.

In the old times, when the mammoths were already extinct, but people still did not know the benefits of the central sewage system (NodeJS v0.11.11) we had addAsyncListener . Based on it, Forrest Norvell created the first implementation of continuation-local storage or CLS . But we will not talk about how it worked then, as this API (I’m talking about addAsyncLustener) ordered to live long. He was no longer in NodeJS v0.12.

Before NodeJS 8, there was no official way to keep track of asynchronous events. And, finally, in version 8, NodeJS developers have restored justice and presented us with the async_hooks API . If you want to learn more about async_hooks, I recommend reading this article . Based on the async_hooks, the previous implementation of the CLS was refactored. The library is called cls-hooked .

CLS under the hood

In general terms, the scheme of the CLS can be represented as follows:

CLS overview

Let's take it a little more detail:

Suppose we have a typical Express web server. First, create a new CLS namespace. Once and for the entire lifetime of the application.
Secondly, we will make middleware, which for each request will create its own CLS context.
When a new request arrives, this middleware is called (Function # 1).
In this function we create a new CLS context (as one option, you can use Namespace.run ). In Namespace.run we pass the function, which will be executed in the context of our context.
CLS adds a newly created context to the Map with contexts on the current execution ID key.
Each CLS namespace has an active property. CLS assigns this property a link to our context.
In the context context, we make some kind of asynchronous query, say, to the database. The database driver is passed a callback, which will be called when the request is completed.
The asynchronous init hook is triggered . It adds the current context to the Map with contexts by async ID (ID of the new asychronous operation).
Since our function no longer has any additional instructions, it completes execution.
An asynchronous after hook works for it. It assigns the active property to the namespace undefined (in fact, not always, since we may have several nested contexts, but for the simplest case, this is the case).
An asynchronous destroy hook is triggered for our first asynchronous operation. It removes the context from the Map with contexts by the async ID of this operation (it is the same as the current execution ID of the first callback).
The request to the database is completed and the second callback is called.
An asynchronous before hook fires . Its current execution ID is the same as the async ID of the second operation (database query). The namespace's active property is assigned the context found in the Map with contexts by the current execution ID. This is the same context that we created before.
Now the second callback itself is running. Practicing some kind of business logic, dancing devils, pouring vodka. Inside this we can get any value from the context by key . CLS will try to find the given key in the current context or return undefined .
An asynchronous after hook is triggered for this callback when it is completed. It assigns the active property to the namespace undefined .
An asynchronous destroy hook is triggered for this operation. It removes the context from the Map with contexts by the async ID of this operation (it is the same as the current execution ID of the second callback).
The garbage collector (GC) frees the memory associated with the context object, since we have no more links to it in the application.

This is a simplified view of what is going on under the hood, but it covers the main phases and steps. If you have a desire to dig a little deeper, I recommend to get acquainted with the samples . There are only 500 lines of code.

Create trace ID

So, having dealt with the CLS, we will try to use this thing for the benefit of mankind. Let's create middleware, which for each request creates its own CLS context, creates a random trace ID and adds it to the context using the traceID key. Then inside ofigilliarda our controllers and services we get this trace ID.

For express, a similar middleware might look like this:

 const cls = require('cls-hooked') const uuidv4 = require('uuid/v4') const clsNamespace = cls.createNamespace('app') const clsMiddleware = (req, res, next) => { // req  res -  event emitters.      CLS     clsNamespace.bind(req) clsNamespace.bind(res) const traceID = uuidv4() clsNamespace.run(() => { clsNamespace.set('traceID', traceID) next() }) }

And in our controller or service we can get this traceID with just one line of code:

 const controller = (req, res, next) => { const traceID = clsNamespace.get('traceID') }

True, without adding this trace ID to the logs of the benefits of it, like a snowthrower in the summer.

Let's get a simple formatter for winston , which will add the trace ID automatically.

 const { createLogger, format, transports } = require('winston') const addTraceId = printf((info) => { let message = info.message const traceID = clsNamespace.get('taceID') if (traceID) { message = `[TraceID: ${traceID}]: ${message}` } return message }) const logger = createLogger({ format: addTraceId, transports: [new transports.Console()], })

And if all loggers supported custom formatters in the form of functions (many of them have reasons not to do this), then this article probably would not exist. So how could you add a trace ID to the logs of my favorite pino ?

Call for Proxy in order to make friends with ANY logger and CLS

A few words about the Proxy itself: this is such a thing that wraps our original object and allows you to override its behavior in certain situations. In a strictly limited list of situations (according to science, they are called traps ). The full list can be found here , we are only interested in trap get . It allows us to override the return value when accessing the property of an object, i.e. if we take the object const a = { prop: 1 } and wrap it in a proxy, then with the help of trap get we can return everything we want, when we refer to a.prop .

In the case of pino idea is as follows: we create a random trace ID for each request, create a child pino instance , into which we send this trace ID, and put this child instance in CLS. Then we wrap our source logger in Proxy, which will use this very child instance for logging, if there is an active context and it has a child logger, or use the source logger.

For such a case, the proxy will look like this:

 const pino = require('pino') const logger = pino() const loggerCls = new Proxy(logger, { get(target, property, receiver) { //    CLS  ,   target = clsNamespace.get('loggerCls') || target return Reflect.get(target, property, receiver) }, })

Our middleware will look like this:

 const cls = require('cls-hooked') const uuidv4 = require('uuid/v4') const clsMiddleware = (req, res, next) => { // req  res -  event emitters.      CLS     clsNamespace.bind(req) clsNamespace.bind(res) const traceID = uuidv4() const loggerWithTraceId = logger.child({ traceID }) clsNamespace.run(() => { clsNamespace.set('loggerCls', loggerWithTraceId) next() }) }

And we can use the logger like this:

 const controller = (req, res, next) => { loggerCls.info('Long live rocknroll!') //  // {"level":30,"time":1551385666046,"msg":"Long live rocknroll!","pid":25,"hostname":"eb6a6c70f5c4","traceID":"9ba393f0-ec8c-4396-8092-b7e4b6f375b5","v":1} }

cls-proxify

Based on the above idea, a small library cls-proxify was created. She works out of the box with express , koa and fastify . In addition to creating a trap for get , it creates other traps to give the developer more freedom. Because of this, we can use Proxy to wrap functions, classes, and more. There is a live demo of how to integrate pino and fastify, pino and express .

I hope you did not waste your time, and the article was just a little useful to you. Please kick and criticize. We will learn to code better together.

Source: https://habr.com/ru/post/442452/

All Articles