
Ever ran your app runtime? Do you know how many requests do that gray endpoint that? And how long are those cross-references for a similar type of resources from each entity page that need to be returned to the request? Have you tried to measure how long the user has to wait due to optional query fields, which he adds from time to time? Have you ever wondered what will happen if you parallel these six queries to those two databases?
If anything above sounds interesting, or at least familiar, welcome to cat.
chrome: // tracing
If you enter 
chrome: // tracing in the Chrome address bar, you will be taken to this page:
')

The tool itself is very close to the V8 (and chrome) trace data itself. And more recently, node also 
allows natively to trace runtime with chrome support: // tracing as a frontend. To trace, simply click the cherished “Record” and make several passes over the adjacent tab, after which you can observe in detail what exactly the browser did when you diligently chatted the cursor inside the window, randomly turning the mouse wheel. For our convenience, kind people even wrote a very 
detailed article on how to interpret the results.
The format of the data consumed by this visualizer is 
well documented , and is great even for tracing things that are distant from rendering, calculating garbage markup and trapping. Due to its openness and ability to download external files, this tracer is fairly easy to integrate with any environment capable of any I / O. In addition to the visualization itself, there are some built-in capabilities for analyzing groups of events. In general, a useful and flexible tool that can be useful and great help, where in the traditional case it would be necessary to parse tons of logs in search of certainty.
The entire frontend tracer, by the way, is written in vanilla JS and lies on 
GitHub . The chrome: // tracing itself is a regular web page and can be inspected through DevTools like any other. That is, if the tool can not satisfy you ... completely, then it can be safely mashed under the necessary scenario.
In the author's experience, the need for such an instrument appeared while working on a serverless system for the interethn of things. In certain scenarios, the number of launches of lambda functions reached tens and hundreds to perform a single operation. Analyzing all the insides of these launches by just the logs was not fun at all. I wanted something that would help to look at the picture with one look at the whole. The samopisny tracer, who could instruct the insides of our functions and all the queries they did, helped. On the mountain, the author then did not even suspect the existence of chrome: // tracing, so the frontend was written from scratch on d3, to which a separate post can be dedicated (tldr; do not render anything complicated via DOM, only canvas and WebGL). Let's now look at a few examples where tracing helped with optimization:
Warming up the cacheGiven: an asynchronous memory service and several sub-services that call it.
Problem: with several calls to the caching service in a row, the result does not have time to get into the cache, respectively, instead of one request, the service makes n.
Solution: warming up the cache (prefetch, works if the series of calls needs the same dependency), or lazy blocking memoization (the name is improvised)
class CachingService { constructor() { this.cache = null; } doAction() { if (this.cache) { return this.process(); } else { return anotherService() .then(result => this.cache = result) .then(this.process); } } } Promise.all([ cachingService.doAction(), cachingService.doAction(), cachingService.doAction() ]);  
  Undocumented requestsIt is given: one of the SDKs with each initialization did additional blocking requests (and let the whole world wait) about which we did not suspect, they were not logged in any way and were not documented.
Solution: SDK instrumentation to disable this “harmful” request.
 Abnormally long execution of custom scriptsGiven: tracing showed that user scripts for several lines are executed two orders of magnitude longer than expected. Debug showed that the problem is excessive use of variables from the global skoupa.
Solution: simulation of global scopes for user code through function arguments
 const vm = require('vm'); const scope = { property: 'value',  
The code above simulates the global variables for the script through the arguments of a virtual function (hello, CommonJS).
  Premature data normalizationGiven: Some entities are quite large (2.5-3Mb)
Problem: suspicion that such entities are long and expensive to download over the network.
Solution: tracing showed that loading such entities is one of the cheapest operations and there are much deeper problems that need to be solved. The data helped to concentrate refactoring on parts of the system, which gave a multiple increase in performance after improvement.
 In addition to these examples, there were a lot of cases when the tracer helped to quickly and accurately assess the effects of changes, made it possible to look at the whole picture, avoiding the “line-by-line” information absorption, generally manifesting itself as a tool that simplifies life.
The device is a simple tracer
Turning to the practical implementation of the simplest tracer, they can be divided into two large groups, according to the method of code instrumentation:
- Centralized - all capture points are set from a single point in the code. The tracer calls the necessary application modules and overloads the entry points to them.
- Distributed - all capture points are determined at the place of need, directly calling the tracer code. Directly determine the beginning of the event and its end.
The centralized approach is good because the instrumented code remains clean and almost independent of the tracer. But, at the same time, there are implicit dependencies that need to be explicitly supported. In this case, for example, special annotations, or a separate integration test for the tracer, which confirms that all necessary modules are in their places along with capture points, can help.
The distributed approach is good because it can be very situational to determine the duration and content of events, including in them whole blocks of useful code. Also, it is simpler for the initial implementation, because the tracer is free from the need to abstract potential errors in an abstract way and to ensure a constant flow of execution.
The most trivial example of the second type tracer might look like this:
 class Tracer { constructor() { this.events = {}; } start(meta) { const id = Math.random().toString(16).substr(2); const start = Date.now(); this.events[id] = { start, id, meta }; return id; } stop(id) { this.events[id].stop = Date.now(); } } 
Everything is pretty trite and the tracer should not worry about any third-party effects of the code being programmed. Also, never use anything like `Math.random (). ToString (16) .substr (2);`, this example is purely for illustration, there are much more 
correct ways .
As soon as we want to move to a centralized approach, we will need an abstract interface for setting capture points:
 addHook(path, method) { const module = require(path); const originalMethod = module[method]; module[method] = function() { const event = this.start(); const result = originalMethod.apply(module, arguments); this.end(event); return result; } } 
This method can be called in the only place that will be the heart of the configuration. Normal monkey-patching, even nothing to add. It is easy to see that the method has several drawbacks:
- Module location relative to the tracer
- No input checks
- No support for asynchronous functions.
The first drawback is caused by poor forethought of the interface of this function. Making the tracer allow file paths for instrumentation is a dubious idea, since For the end user, this adds an extra layer of implicit processing, which leads to doubts about the tracer interface. “Should I pass that path?”, “Should the path be relative to my file or the tracer file?”, “Should I write a standard footprint with __dirname and path.resolve?” - this can be avoided by simply passing the module itself, which need to tool:
 addHook(module, method) { const originalMethod = module[method]; module[method] = function() { const event = this.start(); const result = originalMethod.apply(module, arguments); this.end(event); return result; } } 
In addition to more explicit behavior for the user, this allows you to simplify the validation for existence:
 isInstrumentable(module, method) { return module && (typeof module[method] === 'function'); } addHook(module, method) { if (this.isInstrumentable(module, method)) { const originalMethod = module[method]; module[method] = function() { const event = this.start(); const result = originalMethod.apply(module, arguments); this.end(event); return result; } } } 
Behavior with an invalid configuration can be hard (throw an error) or soft (ignore the module, continue operation). Theoretically, a good tracer should allow to follow any of the paths using the configuration, since both options can be justified in different cases.
Processing asynchronous functions is a slightly more interesting task. First, you need to determine if the instrumented method is asynchronous, and change the behavior of the tracer to determine the end of the event. To do this, you can modify our overloaded method as follows:
 addHook(module, method) { if (this.isInstumentable(module, method)) { const tracer = this; const originalMethod = service[method]; service[method] = function () { let result; const eventId = tracer.start(); try { result = originalMethod.apply(this, arguments); } catch (error) { tracer.stop(eventId, { success: false, error: errorToString(error) });  
From the interesting, the tracer must guarantee the absence of third-party effects at run time. Therefore, it must forward any word errors back into the application flow.
That's all, such a primitive tracer can already be used for something useful. Of the additional features that can be puzzled:
- Support of before and after capture points for the code being instructed, with passing arguments and return result for any other additional actions
- Support of events of various types by name, it is just convenient
- Support for saving metadata about the success of each event
- Support for saving results from an instrumented method
- A library of standard capture points for node.js public interfaces and popular libraries
- Support for export in other formats, for example: Open Tracing
- The ability to remotely start tracing sessions with sending the result to the client
- Protection against unintentionally long sessions (in our memory there were cases when the included tracer generated several gigabytes of data and dropped the parent application).
After the events are recorded, all that remains is to save them in a format suitable for chrome: // tracing. To do this, you can use the following function:
 function toChromeTracingEvent(event) { const parsed = url.parse(event.meta.url); const base = { id: event.id, pid: process.pid, tid: 0, cat: event.type, name: event.meta.method, args: event.meta }; return [ Object.assign({}, base, { ph: 'b', ts: event.start * 1000 }), Object.assign({}, base, { ph: 'e', ts: event.stop * 1000 }) ]; } 
The bottom line is that each event of the tracer is expanded into two objects representing the beginning and end of an asynchronous chrome: // tracing event. In addition to asynchronous events, you can create and “completed” (Completed) events. But they overlap each other when rendering in chrome: // tracing, which is why it is not so easy to analyze them: 3
After all the events have been converted, it’s just left to carefully stack them with an array in the field of the `traceEvents` object and save it somewhere, from where it will not be painfully long to search through the file explorer.
The final step is to add instrumentation to the part of your application where something implicit happens, run it and load the resulting file into chrome: // tracing.
Tracer that inspired this article .
Let the work be interesting!