JavaScript instrumentation by modifying code: areas of application and general operating principles

To paraphrase Wikipedia slightly, instrumentation is the ability to monitor the performance of code, diagnose errors, and record information that helps trace their causes.

Instrumenting JavaScript may be necessary for a variety of reasons, most commonly debugging, profiling, tracing, and logging. As a rule, the engines that execute JavaScript provide ways to instrument code without changing it. In my previous article I described some of these facilities, along with the limitations that eventually led me to start the project described there and to study JavaScript instrumentation through automatic code modification. This topic, in my opinion, gets less attention than it deserves, especially since the comments on that article showed interest in the conceptual approach to code modification.

So why and how can you automatically change the code?

For the simplest debugging, for example, you may need to change every function in a script by wrapping its body in a try-catch block.

Simple tracing or logging can be done by inserting console.log calls, and profiling by inserting console.time / console.profile at the beginning and end of each function. If measurement accuracy is not that important, or the runtime does not support console.time / console.profile, good old Date.now() will do.
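As a rough illustration of what such inserted instructions amount to, here is a function instrumented by hand. The names sumTo, timings, errors, and reportError are hypothetical and exist only for this sketch. Date.now() is used as the lowest common denominator; console.time / console.timeEnd could take its place where they are available.

```javascript
// Hypothetical storage for this sketch: collected durations and caught errors.
var timings = [];
var errors = [];

function reportError(name, error) {
  errors.push({ name: name, message: error.message });
}

// Original function, before instrumentation:
//   function sumTo(n) {
//     if (n < 0) throw new Error('negative input');
//     var total = 0;
//     for (var i = 1; i <= n; i++) total += i;
//     return total;
//   }

// The same function after (manual) instrumentation.
function sumTo(n) {
  var startedAt = Date.now();            // inserted: profiling start
  try {                                  // inserted: debugging wrapper
    if (n < 0) throw new Error('negative input');
    var total = 0;
    for (var i = 1; i <= n; i++) total += i;
    return total;
  } catch (error) {
    reportError('sumTo', error);         // inserted: record the error...
    throw error;                         // ...and rethrow so callers see no difference
  } finally {
    // inserted: profiling end; runs on both return and throw
    timings.push({ name: 'sumTo', ms: Date.now() - startedAt });
  }
}
```

Rethrowing in the catch block and measuring in finally keeps the observable behavior of the function unchanged, which is the whole point of doing this automatically.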

Deeper and more complete tracing may be needed for later analysis of how well tests or scenarios cover the code. The execution data collected by the instrumentation instructions is stored where the tool can later pick it up for a report. A thorough coverage analysis tracks the execution of (and therefore instruments) not only lines of code but also the branches of logical and ternary operators.

function Foo(arg1, arg2) {
    if (arg1 || arg2 > 0)
        Bar1();
    return arg2 ? Bar2() : false;
}

// turns into, for example:
function Foo(arg1, arg2) {
    try {
        ping('Foo invoked');
        if ((ping('arg1 check'), arg1) || (ping('arg2 check'), arg2 > 0)) {
            ping('if branch');
            Bar1();
        }
        return arg2 ? (ping('Bar2 branch'), Bar2()) : (ping('false branch'), false);
    } finally {
        ping('Foo finished');
    }
}
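The ping calls above need somewhere to put their data. A minimal sketch of such a collector follows. The names hits, report, and probes are hypothetical, ping is implemented here only as an assumption about what it might do, and Bar1 / Bar2 are stubbed so that the snippet runs on its own. Real coverage tools keep more structured counters, but the idea is the same.

```javascript
// Hypothetical collector: counts how many times each probe was reached.
var hits = {};

function ping(id) {
  hits[id] = (hits[id] || 0) + 1;
}

// Stubs standing in for the real functions called by Foo.
function Bar1() {}
function Bar2() { return 'bar2'; }

// The instrumented Foo from the example above.
function Foo(arg1, arg2) {
  try {
    ping('Foo invoked');
    if ((ping('arg1 check'), arg1) || (ping('arg2 check'), arg2 > 0)) {
      ping('if branch');
      Bar1();
    }
    return arg2 ? (ping('Bar2 branch'), Bar2()) : (ping('false branch'), false);
  } finally {
    ping('Foo finished');
  }
}

// Lists the probes that were never reached, given all known probe ids.
function report(allProbes) {
  return allProbes.filter(function (id) { return !hits[id]; });
}

var probes = ['Foo invoked', 'arg1 check', 'arg2 check', 'if branch',
              'Bar2 branch', 'false branch', 'Foo finished'];

Foo(true, 0);
var uncovered = report(probes);
```

After the single call Foo(true, 0), uncovered holds 'arg2 check' and 'Bar2 branch': the || operator short-circuits before the second operand, and the ternary takes its false branch. Line-level coverage alone would not reveal either gap.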

Instrumentation for this kind of tracing is performed by code coverage tools. Of those I have worked with and enjoyed, I have to single out istanbul. The tool is written in JavaScript, which helps its popularity and its use in grunt extensions. I use istanbul with Jasmine both for analyzing client-side test coverage (PhantomJS plus grunt-template-jasmine-istanbul) and for server code (with grunt-jasmine-node-coverage). You can see an example of an istanbul coverage report for yourself here.

Even more complex code modification may be needed by tools for visualizing and analyzing code execution, such as the one mentioned in the previous article.

How can you automatically change JavaScript code, find the right places, and insert instrumentation instructions there? You can certainly try to do it with regular expressions and summon the devil, as in this stackoverflow answer, but the correct answer is this: parse the JavaScript code, traverse the resulting abstract syntax tree, change the nodes of interest, and convert the modified tree back into code.

There are many easy-to-find JavaScript parsers, some of which we use all the time without even thinking of them as parsers (for example, uglify.js or the various JavaScript beautifiers). In my project I used esprima to get the original syntax tree. The tree is a hierarchical JSON structure describing the parsed code. You can play with syntax trees, and see other examples of using esprima, on the tool's site.
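To make the tree concrete, here is roughly what a parser such as esprima would produce for the source text var answer = 6 * 7; The object is written out by hand so that the snippet runs without any dependencies. Treat it as an approximation: extra fields (for example raw or sourceType) differ between parsers and versions.

```javascript
// Hand-written tree for the source text: var answer = 6 * 7;
var tree = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'answer' },
      init: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 6 },
        right: { type: 'Literal', value: 7 }
      }
    }]
  }]
};

// Because the tree is plain data, it survives a JSON round trip unchanged
// and can be inspected with ordinary property access.
var copy = JSON.parse(JSON.stringify(tree));
var init = copy.body[0].declarations[0].init;
console.log(init.type, init.left.value, init.operator, init.right.value);
```

Every node carries a type field, and child nodes hang off named properties (body, declarations, init, left, right). That regularity is what makes generic traversal possible.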

I implemented tree traversal and modification without additional tools. Such tools do exist, however, for example falafel and burrito; they eliminate the need to write the traversal infrastructure yourself and let you concentrate on finding and modifying the nodes you need.
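A hand-rolled traversal can be quite small. The sketch below is not the code from my project; walk, pingStatement, and instrument are hypothetical names, and the input tree is hand-written in the SpiderMonkey-style format so that the snippet needs no parser. It inserts a ping call at the start of every function declaration.

```javascript
// Visits every node of the tree, depth first.
// A "node" is taken to be any object with a string `type` property.
function walk(node, visit) {
  visit(node);
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    var children = Array.isArray(child) ? child : [child];
    children.forEach(function (item) {
      if (item && typeof item.type === 'string') walk(item, visit);
    });
  });
}

// Builds the statement node for: ping('<message>');
// `ping` is a hypothetical probe function supplied by the instrumenting tool.
function pingStatement(message) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'ping' },
      arguments: [{ type: 'Literal', value: message }]
    }
  };
}

// Inserts a ping at the start of every function declaration's body.
function instrument(tree) {
  walk(tree, function (node) {
    if (node.type === 'FunctionDeclaration') {
      node.body.body.unshift(pingStatement(node.id.name + ' invoked'));
    }
  });
  return tree;
}

// Hand-written tree for: function Foo() { return 1; }
var tree = {
  type: 'Program',
  body: [{
    type: 'FunctionDeclaration',
    id: { type: 'Identifier', name: 'Foo' },
    params: [],
    body: {
      type: 'BlockStatement',
      body: [{
        type: 'ReturnStatement',
        argument: { type: 'Literal', value: 1 }
      }]
    }
  }]
};

instrument(tree);
```

With esprima and escodegen installed, a function like instrument would sit between esprima.parse(source) and escodegen.generate(tree). Function expressions and other node types would need the same treatment and are left out here for brevity.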

It is important to note that for many code modification tasks (including those of my project and of code coverage tools), the positions of the nodes in the original tree matter. When new nodes (instrumentation instructions) are inserted into the tree and code is generated from the modified tree, the original code shifts. Instrumentation instructions that describe the execution of the code should therefore report the original positions (lines / columns) of that code. Parsers can, on request, include position information in the generated tree.
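One way to carry positions through is to bake them into the probes as literal arguments at instrumentation time. The sketch below assumes nodes carry loc data of the kind esprima attaches when parsing with the loc: true option; positionProbe and traceStatements are hypothetical names, and the statements are hand-written so that the snippet runs on its own.

```javascript
// Builds the node for: ping(<line>, <column>);
// The arguments are the ORIGINAL position of the statement being traced,
// captured before any inserted nodes shift the generated code.
function positionProbe(loc) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'ping' },
      arguments: [
        { type: 'Literal', value: loc.start.line },
        { type: 'Literal', value: loc.start.column }
      ]
    }
  };
}

// Puts a probe in front of every statement in a list of statements.
function traceStatements(statements) {
  var result = [];
  statements.forEach(function (statement) {
    result.push(positionProbe(statement.loc), statement);
  });
  return result;
}

// Two hand-written statements with position data attached
// (lines are 1-based and columns 0-based in the SpiderMonkey format).
var statements = [
  { type: 'EmptyStatement',
    loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 1 } } },
  { type: 'EmptyStatement',
    loc: { start: { line: 2, column: 4 }, end: { line: 2, column: 5 } } }
];

var traced = traceStatements(statements);
```

In the generated code the second statement ends up further down than line 2, but its probe still reports line 2, column 4, which is what a report against the original source needs.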

I generate code from the modified tree using escodegen, which understands the syntax tree format produced by esprima.

Unfortunately, different parsers and generators are free to use, and do use, different syntax tree formats. Fortunately, several popular ones use the syntax tree format of the SpiderMonkey Parser API, and esprima and escodegen are among them.

To hide the instrumentation instructions during debugging, so that the client code looks uninstrumented in the debugger, source maps can be produced when generating code from the modified tree. With escodegen, all you need to do is set one flag (options.sourceMap).

In conclusion, I would like to note that non-destructive automatic code modification requires a good knowledge of the language specification (or constant checking against it). As a postscript, here is an example of a pitfall I ran into.

In the prototype of the project, I politely wrapped everything that was possible in blocks, that is,

for (var x in y) {
    // ...
}

turned into

{
    for (var x in y) {
        // ...
    }
}

which I thought was a non-destructive change. And everything was fine until I came across a library that broke after the modification.

Readers who wish to can test their knowledge / memory before reading the answer.

I knew, of course, that the language has labels, but I had rarely used them in practice and did not anticipate how they behave with continue label. The breaking scenario was:

 l1: for (var x in y) { continue l1; }

(see comments on the article for a more detailed explanation)
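In short, continue l1 is only legal when l1 labels an enclosing loop directly. Once the loop is wrapped in a block, the label is attached to the block instead, and the code no longer parses. The sketch below checks this with the Function constructor; parses is a hypothetical helper, and the code is only compiled, never executed.

```javascript
// Returns true when `source` parses as a function body, false on SyntaxError.
// new Function only compiles the code here; the result is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// The labelled loop as written in the library...
var original = 'l1: for (var x in y) { continue l1; }';
// ...and the same loop after being wrapped in a block.
var wrapped = 'l1: { for (var x in y) { continue l1; } }';

var originalOk = parses(original);
var wrappedOk = parses(wrapped);
```

Here originalOk is true and wrappedOk is false. Note that break l1 would have survived the wrapping, since break may target any labelled statement; only continue requires the label to sit on a loop.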

Source: https://habr.com/ru/post/188990/
