
Optimization killers


This post gives tips on how to avoid writing code that performs far worse than expected. It focuses on cases where the V8 engine (used in Node.js, Opera, Chromium, etc.) refuses to optimize a function.

V8 Features


This engine has no interpreter; instead it has two different compilers: a regular one and an optimizing one. This means your JS code is always compiled and run directly as native code. Think that means it is fast? Not quite. Compiling to native code does not improve performance much by itself. It only removes the interpreter overhead, and unoptimized code is still slow.

For example, with the regular compiler, the expression a + b compiles to something like this:

    mov eax, a
    mov ebx, b
    call RuntimeAdd

This is just a call to the corresponding runtime function. If a and b are always integers, the code can instead look like this:

    mov eax, a
    mov ebx, b
    add eax, ebx

The second version is much faster than a call that has to handle the complex additional JS semantics at run time. In other words, the regular compiler generates unoptimized, "raw" code, and the optimizing compiler refines it into its final form. Optimized code can be up to 100 times faster than the unoptimized code. The catch is that you cannot write just any JS code and expect it to be optimized. There are many programming patterns (some of them even idiomatic) that the optimizing compiler refuses to process.

Note that a pattern that prevents optimization affects the entire function containing it. Code is optimized one function at a time, and the optimizer knows nothing about what the rest of the code does (unless that code is inlined into the function currently being optimized).

Below we look at most of the patterns that send the containing function into “deoptimization hell”. The list is subject to change, and the proposed workarounds may become unnecessary as the compiler learns to recognize more and more patterns.

1. Using the built-in toolkit


To determine how patterns affect optimization, you need to be able to run Node.js with certain V8 flags. Create a function containing the pattern, call it with all the relevant data types, and then call internal V8 functions to optimize it and check the result:

test.js:
    // Function that contains the pattern to be inspected (using with statement)
    function containsWith() {
        return 3;
        with({}) {}
    }

    function printStatus(fn) {
        switch(%GetOptimizationStatus(fn)) {
            case 1: console.log("Function is optimized"); break;
            case 2: console.log("Function is not optimized"); break;
            case 3: console.log("Function is always optimized"); break;
            case 4: console.log("Function is never optimized"); break;
            case 6: console.log("Function is maybe deoptimized"); break;
            case 7: console.log("Function is optimized by TurboFan"); break;
            default: console.log("Unknown optimization status"); break;
        }
    }

    // Fill type-info
    containsWith();
    // 2 calls are needed to go from uninitialized -> pre-monomorphic -> monomorphic
    containsWith();

    %OptimizeFunctionOnNextCall(containsWith);
    // The next call
    containsWith();

    // Check
    printStatus(containsWith);

Run:

    $ node --trace_opt --trace_deopt --allow-natives-syntax test.js
    Function is not optimized

To verify that the check works, comment out the with statement and run again:

    $ node --trace_opt --trace_deopt --allow-natives-syntax test.js
    [optimizing 000003FFCBF74231 <JS Function containsWith (SharedFunctionInfo 00000000FE1389E1)> - took 0.345, 0.042, 0.010 ms]
    Function is optimized

It is important to use this tooling to verify that the chosen workarounds actually work.

2. Unsupported syntax


Some constructs are simply not supported by the optimizing compiler, and using them makes the containing function non-optimizable.

Important: even if the construct is unreachable or never executed, it still prevents the function containing it from being optimized.

For example, the following does not help:

 if (DEVELOPMENT) { debugger; } 

This code affects the entire function, even if the debugger statement is never reached.

Currently not optimized:


Most likely, not optimized:


To be clear: if a function contains any of the following, the entire function cannot be optimized:

    function containsObjectLiteralWithProto() {
        return {__proto__: 3};
    }

    function containsObjectLiteralWithGetter() {
        return {
            get prop() {
                return 3;
            }
        };
    }

    function containsObjectLiteralWithSetter() {
        return {
            set prop(val) {
                this.val = val;
            }
        };
    }

Direct calls to eval and the with statement deserve a special mention, because they make everything they touch dynamically scoped. They can therefore hurt many other functions as well, since it becomes impossible to analyze what is happening there.

Workaround: some of these statements, such as try-finally and try-catch, cannot be avoided in production code. To minimize the damage, isolate them in small functions:

    var errorObject = {value: null};
    function tryCatch(fn, ctx, args) {
        try {
            return fn.apply(ctx, args);
        }
        catch(e) {
            errorObject.value = e;
            return errorObject;
        }
    }

    var result = tryCatch(mightThrow, void 0, [1,2,3]);
    // Unambiguously tells whether the call threw
    if(result === errorObject) {
        var error = errorObject.value;
    }
    else {
        // Result is the returned value
    }

3. Using arguments


There are many ways to use arguments that make a function impossible to optimize, so be especially careful when working with it.

3.1. Reassigning a declared parameter while also using arguments in the function body (sloppy mode only)


Typical example:

    function defaultArgsReassign(a, b) {
        if (arguments.length < 2) b = 5;
    }

The workaround is to copy the parameter into a new variable:

    function reAssignParam(a, b_) {
        var b = b_;
        // Unlike b_, b can safely be reassigned
        if (arguments.length < 2) b = 5;
    }

If this is the only use of arguments in the function, it can be replaced with a check for undefined:

    function reAssignParam(a, b) {
        if (b === void 0) b = 5;
    }

However, if arguments is likely to be used later in the function, you have to remember to keep the reassignment out.

Another way to solve the problem is to enable strict mode ('use strict') for the file or the function.
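As a minimal sketch of the strict-mode variant: the function name below is hypothetical, and the return statement is added only to make the result observable. It relies on the fact that in strict mode the arguments object is not aliased to the named parameters.

```javascript
function defaultArgsReassignStrict(a, b) {
    'use strict';
    // In strict mode, arguments is not linked to the named parameters,
    // so reassigning b leaves arguments[1] untouched.
    if (arguments.length < 2) b = 5;
    return [a, b, arguments.length];
}

console.log(defaultArgsReassignStrict(1));    // [ 1, 5, 1 ]
console.log(defaultArgsReassignStrict(1, 2)); // [ 1, 2, 2 ]
```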

3.2. Leaking arguments


    function leaksArguments1() {
        return arguments;
    }

    function leaksArguments2() {
        var args = [].slice.call(arguments);
    }

    function leaksArguments3() {
        var a = arguments;
        return function() {
            return a;
        };
    }

The arguments object must not be passed or leaked anywhere.

Proxying can be done by building an array in-line:

    function doesntLeakArguments() {
        // .length is just an integer, this doesn't leak
        // the arguments object itself
        var args = new Array(arguments.length);
        for(var i = 0; i < args.length; ++i) {
            // i is always valid index in the arguments object
            args[i] = arguments[i];
        }
        return args;
    }

This takes a lot of code, so it makes sense to decide first whether it is worth it. Again, optimization generally means more code with more explicit semantics.

However, if your project has a build step, this can be achieved with a macro that does not require source maps and lets the source code remain valid JavaScript.

    function doesntLeakArguments() {
        INLINE_SLICE(args, arguments);
        return args;
    }

This technique is used in bluebird, and in the build step the code is expanded into this:

    function doesntLeakArguments() {
        var $_len = arguments.length;var args = new Array($_len); for(var $_i = 0; $_i < $_len; ++$_i) {args[$_i] = arguments[$_i];}
        return args;
    }

3.3. Assignment to arguments


This can only be done in sloppy mode:

    function assignToArguments() {
        arguments = 3;
        return arguments;
    }

Workaround: simply do not write code like this. In strict mode it results in an exception.

How can you safely use arguments?



If you follow all of the above, using arguments does not cause the arguments object to be allocated.

4. Switch-case


A switch-case statement can currently have at most 128 case clauses. If this limit is exceeded, the function containing the statement cannot be optimized.

    function over128Cases(c) {
        switch(c) {
            case 1: break;
            case 2: break;
            case 3: break;
            ...
            case 128: break;
            case 129: break;
        }
    }

Keep the number of case clauses at 128 or fewer by using an array of functions or if-else instead.
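The array-of-functions alternative could look roughly like the sketch below. The handler names and the dispatch function are hypothetical, chosen only for illustration; the array index plays the role of the case label.

```javascript
// Hypothetical handlers; real code would have one per former case label.
function handleDefault() { return 'default'; }
function handleOne() { return 'one'; }
function handleTwo() { return 'two'; }

// Index 0 is unused as a label here, so it maps to the default handler.
var handlers = [handleDefault, handleOne, handleTwo];

function dispatch(c) {
    var handler = handlers[c];
    // Anything without a registered handler falls back to the default branch.
    return typeof handler === 'function' ? handler() : handleDefault();
}

console.log(dispatch(1));   // one
console.log(dispatch(2));   // two
console.log(dispatch(999)); // default
```

The table can grow well past 128 entries without changing the shape of dispatch.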

5. For-in


A for-in statement can prevent a function from being optimized in several ways.

5.1. The key is not a local variable.


    function nonLocalKey1() {
        var obj = {};
        for(var key in obj);
        return function() {
            return key;
        };
    }

    var key;
    function nonLocalKey2() {
        var obj = {};
        for(key in obj);
    }

The key cannot come from an outer scope, and it cannot be referenced from an inner scope (a closure) either. It must be a purely local variable.
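One way to keep the loop key purely local is sketched below, under the assumption that only the last key needs to be visible to the closure. The names localKey and lastKey are hypothetical, and whether a given V8 version then optimizes the function should be confirmed with the tooling from section 1.

```javascript
function localKey() {
    var obj = {a: 1, b: 2};
    var lastKey;
    for (var key in obj) {
        // Copy the value out; the loop variable itself is never captured.
        lastKey = key;
    }
    // The closure references lastKey, not the for-in key.
    return function () {
        return lastKey;
    };
}

console.log(localKey()()); // b
```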

5.2. The object being iterated is not a “simple enumerable”.


5.2.1. Objects in “hash table” mode (“normalized objects”, “dictionaries”: objects whose backing data structure is a hash table) are not simple enumerables

    function hashTableIteration() {
        var hashTable = {"-": 3};
        for(var key in hashTable);
    }

An object can go into hash table mode when, for example, you add too many properties dynamically (outside the constructor), delete properties, or use property names that are not valid identifiers. In other words, if you use an object as if it were a hash table, it turns into a hash table. Never pass such objects to for-in. To check whether an object is in hash table mode, call console.log(%HasFastProperties(obj)) in Node.js with the --allow-natives-syntax flag enabled.

5.2.2. The object has enumerable properties in its prototype chain.

    Object.prototype.fn = function() {};

This line adds an enumerable property to the prototype chain of all objects (except those created with Object.create(null)). As a result, any function containing a for-in statement becomes non-optimizable (unless it only iterates over Object.create(null) objects).

With Object.defineProperty you can create non-enumerable properties. Doing this at run time is not recommended, but it is well suited to defining static things, such as prototype properties, efficiently.
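A minimal sketch of defining a non-enumerable prototype property once, at setup time. The Point constructor and the norm method are hypothetical names used only for illustration.

```javascript
function Point(x, y) {
    this.x = x;
    this.y = y;
}

// Defined once at setup time. enumerable already defaults to false with
// Object.defineProperty; it is spelled out here for clarity.
Object.defineProperty(Point.prototype, 'norm', {
    value: function () {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    },
    writable: true,
    configurable: true,
    enumerable: false
});

var p = new Point(3, 4);
var seen = [];
for (var key in p) seen.push(key);

console.log(seen);     // [ 'x', 'y' ]
console.log(p.norm()); // 5
```

Because norm is not enumerable, for-in over a Point only visits the instance's own x and y.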

5.2.3. The object contains enumerable array indices.

What counts as an array index is defined in the ECMAScript specification:

A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32 - 1. A property whose property name is an array index is also called an element.
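The definition can be expressed as a small check. This is a sketch rather than anything V8 exposes: the name isArrayIndex is hypothetical, and the unsigned right shift by zero is used as the usual way to apply ToUint32 in plain JavaScript.

```javascript
function isArrayIndex(p) {
    // The definition applies to property names in string form.
    if (typeof p !== 'string') return false;
    var n = p >>> 0;               // ToUint32(P)
    return String(n) === p &&      // ToString(ToUint32(P)) is equal to P
           n !== 4294967295;       // ToUint32(P) is not 2^32 - 1
}

console.log(isArrayIndex('0'));          // true
console.log(isArrayIndex('4294967294')); // true
console.log(isArrayIndex('4294967295')); // false
console.log(isArrayIndex('01'));         // false
console.log(isArrayIndex('-1'));         // false
console.log(isArrayIndex('1.5'));        // false
```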

As a rule, this refers to arrays, but ordinary objects may also have array indices:

    normalObj[0] = value;

    function iteratesOverArray() {
        var arr = [1, 2, 3];
        for (var index in arr) {

        }
    }

Iterating over an array with for-in is slower than with a for loop, and the function containing the for-in is not optimized.

If you pass for-in an object that is not a simple enumerable, the entire containing function is penalized.

Workaround: always use Object.keys and iterate over the resulting array with a for loop. If you really need all the properties from the entire prototype chain, create an isolated helper function:

    function inheritedKeys(obj) {
        var ret = [];
        for(var key in obj) {
            ret.push(key);
        }
        return ret;
    }
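The Object.keys approach recommended above can be sketched as follows. The function name sumOwnValues is hypothetical, and the sketch assumes that all property values are numbers.

```javascript
function sumOwnValues(obj) {
    // Object.keys returns only the object's own enumerable property names.
    var keys = Object.keys(obj);
    var total = 0;
    // A plain indexed for loop over the keys array replaces for-in.
    for (var i = 0; i < keys.length; ++i) {
        total += obj[keys[i]];
    }
    return total;
}

console.log(sumOwnValues({a: 1, b: 2, c: 3})); // 6
```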

6. Infinite loops with complex exit logic or unclear exit conditions.


Sometimes when writing code you know you need a loop, but you have no idea yet what its condition should be. So you type while (true) { or for (;;) {, later put a break inside the loop, and soon forget about it. Refactoring time comes when the function turns out to be slow or you see it being deoptimized. The forgotten break condition may be the cause.

Refactoring the loop to move the exit condition into the loop's conditional part can be nontrivial. If the exit condition is part of an if statement at the end of the loop and the code must run at least once, refactor the loop into do { } while ();. If the exit condition is at the beginning, move it into the conditional part of the loop. If the exit condition is in the middle, you can try "rolling" the code: each time you move a line from the top of the loop body to the bottom, leave a copy of that line above the loop. Once the exit condition can be checked in the conditional part, or at least with a simple logical test, the loop should no longer be deoptimized.
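A minimal sketch of the first case, where the exit check sits in an if at the end of the body and the body must run at least once. Both function names are hypothetical, and the example assumes a non-negative integer argument.

```javascript
// Before: the exit condition is hidden in an if at the end of the body.
function countDigitsBefore(n) {
    var digits = 0;
    while (true) {
        digits++;
        n = Math.floor(n / 10);
        if (n === 0) break;
    }
    return digits;
}

// After: the same logic with the exit condition in the loop's conditional part.
function countDigitsAfter(n) {
    var digits = 0;
    do {
        digits++;
        n = Math.floor(n / 10);
    } while (n !== 0);
    return digits;
}

console.log(countDigitsBefore(12345), countDigitsAfter(12345)); // 5 5
console.log(countDigitsBefore(0), countDigitsAfter(0));         // 1 1
```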

Source: https://habr.com/ru/post/273839/

