Optimizing ES2015 proxies in V8

This is a translation of a post from the official blog of the V8 JavaScript engine. The article is short, and reads more like an entertaining story about the problems that await unsuspecting Google engineers in the V8 code base. It covers the work on speeding up ES2015 proxies in V8, which will be available in Chrome 62 and Node v9.x, and touches briefly on how to use proxies to get the best performance.

Introduction

Proxies have been part of JavaScript since the ES2015 standard. They allow you to intercept fundamental operations on objects and customize their behavior. Proxies are a core part of tools such as jsdom and the Comlink RPC library. Recently, we have put a lot of effort into improving proxy performance in V8. This article sheds some light on general performance improvement patterns in V8, and for proxies in particular.

Proxies are "objects used to override fundamental operations (for example, access to properties, assignment, enumeration, function call)" (from MDN). More information can be found in the full specification. For example, the following code adds logging of every property read on an object:

    const target = {};
    const callTracer = new Proxy(target, {
      get: (target, name, receiver) => {
        console.log(`get was called for: ${name}`);
        return target[name];
      }
    });

    callTracer.property = 'value';
    console.log(callTracer.property);
    // get was called for: property
    // value

Proxy creation

The first thing we looked at is proxy creation. The initial C++ implementation followed the ECMAScript specification step by step, which resulted in at least 4 jumps between the C++ and JS runtimes, as the diagram below shows. We wanted to port this implementation to the platform-independent CodeStubAssembler (CSA), which runs in the JS runtime rather than the C++ runtime. This port minimizes the number of jumps between the language runtimes. CEntryStub and JSEntryStub in the diagram represent the runtimes, and the dotted lines mark the boundaries between them. Fortunately, most of the helper predicates were already implemented in CSA, so the initial version was concise and readable.

The diagram below shows the flow of control when a proxy is called with an interceptor defined (in this example apply, which is called when the proxy is used as a function). It was generated from the following sample code:

    function foo(...) {...}
    g = new Proxy({...}, { apply: foo });
    g(1, 2);

After porting the interceptor call to CSA, all of the execution happens in the JS runtime, reducing the number of jumps between languages from 4 to 0.
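The sample above elides its function bodies. As a minimal runnable sketch of the same shape, the version below fills them in with assumptions of our own: the `add` target and the body of `foo` are hypothetical and are not taken from the V8 post. The target is a function here because the apply interceptor (trap) only fires when the proxy target is callable.

```javascript
// Hypothetical target: it must be callable, otherwise g(1, 2) throws a TypeError.
function add(a, b) {
  return a + b;
}

// The apply interceptor receives the target, the `this` value and the argument list.
function foo(targetFn, thisArg, args) {
  // Illustrative body: forward the call and scale the result.
  return targetFn.apply(thisArg, args) * 10;
}

const g = new Proxy(add, { apply: foo });

const result = g(1, 2);
console.log(result); // 30
```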

This change resulted in the following performance improvements:

Our JS performance score shows improvements of 49% to 74%. Roughly speaking, the score measures how many times a given microbenchmark can be executed in 1000 ms. For some tests the code is run several times per measurement to get an accurate enough result given the limited timer resolution. The code for all benchmarks below can be found in our js-perf-test directory.
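The scoring idea described above can be sketched as follows. This is not the js-perf-test harness: the `score` function, its `budgetMs` and `batch` parameters, and the two measured operations are assumptions made for illustration. A short 50 ms budget is used so that the example finishes quickly; the article's measurements use 1000 ms.

```javascript
// Counts how many times `fn` completes within `budgetMs` milliseconds.
// `batch` calls are made between clock reads to offset the coarse timer resolution.
function score(fn, budgetMs = 1000, batch = 100) {
  const start = Date.now();
  let runs = 0;
  while (Date.now() - start < budgetMs) {
    for (let i = 0; i < batch; ++i) fn();
    runs += batch;
  }
  return runs;
}

const plain = {};
const proxied = new Proxy(plain, {});

// Two illustrative microbenchmarks: proxy creation and a property read through a proxy.
const constructScore = score(() => new Proxy(plain, {}), 50);
const getScore = score(() => proxied.x, 50);

console.log({ constructScore, getScore });
```

A higher score means more iterations fit into the time budget. The absolute numbers depend on the machine and engine version, so only comparisons between runs on the same setup are meaningful.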

`Call` and `construct` interceptors

The next part shows the results of optimizing the call and construct interceptors (that is, apply and construct).

The performance improvement when calling a proxy is significant: up to 500% faster! The improvement for proxy creation is more modest, especially when no interceptors are defined; in that case the gain is only about 25%. We looked into this by running a test file, test.js, in the d8 shell. Its contents are:

    function MyClass() {}
    MyClass.prototype = {};
    const P = new Proxy(MyClass, {});
    function run() {
      return new P();
    }
    const N = 1e5;
    console.time('run');
    for (let i = 0; i < N; ++i) {
      run();
    }
    console.timeEnd('run');

It turns out that most of the time is spent in the NewObject function and in the functions it calls, so we are now looking at how to speed this up in future releases.
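The test file above uses an empty handler, so no construct interceptor (trap) is involved. For comparison, here is a minimal sketch of the case where one is defined. The counter and the name `Traced` are illustrative assumptions, not part of the original test.

```javascript
function MyClass() {}

let constructCalls = 0;

// The construct interceptor runs for every `new Traced(...)`.
const Traced = new Proxy(MyClass, {
  construct(target, args, newTarget) {
    constructCalls++;
    // Forward to the default behavior so the result is a regular MyClass instance.
    return Reflect.construct(target, args, newTarget);
  }
});

const instance = new Traced();
console.log(instance instanceof MyClass); // true
console.log(constructCalls); // 1
```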

Get interceptor

The next section describes how we optimized the most common operations: reading and writing properties through a proxy. The get interceptor turned out to be more involved than the previous cases because of the behavior of V8's inline cache. You can learn more about inline caches in this video.

Eventually, we got a working port to CSA with the following results:

After landing the changes, we noticed that the size of the Chrome APK file for Android had grown by ~160 KB, which is more than expected for a small function of about 20 lines; fortunately, we track such statistics. It turned out that the function is called twice from another function, which is called 3 times from a third, which in turn is called 4 times, so with everything inlined its body could be duplicated up to 2 × 3 × 4 = 24 times. The cause of the problem was aggressive inlining. In the end, we solved the problem by moving the function into a separate stub (translator's note: apparently the same kind of stubs that were called "predicates" above), which saved precious kilobytes: the final version increased the size of the APK file by only ~19 KB.

Has interceptor

The next section shows the results of optimizing the has interceptor. We thought it would be easy (expecting to reuse most of the get interceptor code), but it turned out to have its own peculiarities. One particularly hard-to-debug problem was walking the prototype chain when the in operator is used. The improvements range from 71% to 428%. Again, the gain is more noticeable when the interceptor is defined.
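To make the interaction between the has interceptor (trap), the `in` operator and the prototype chain concrete, here is a small sketch of the language-level behavior. All object and property names are illustrative assumptions, and the example says nothing about how V8 implements the lookup internally.

```javascript
const base = { inherited: 1 };
const hasTarget = Object.create(base);
hasTarget.own = 2;

const seen = [];
const p = new Proxy(hasTarget, {
  has(t, key) {
    seen.push(key);
    // Reflect.has walks the target's prototype chain, just like `in` does.
    return Reflect.has(t, key);
  }
});

console.log('own' in p);       // true
console.log('inherited' in p); // true, found on the target's prototype
console.log('missing' in p);   // false

// With a proxy on the prototype chain, the interceptor runs only
// when the key is not found on the objects in front of it.
const child = Object.create(p);
child.local = 3;
console.log('local' in child);     // true, interceptor not called
console.log('inherited' in child); // true, interceptor called

console.log(seen); // the keys seen by the interceptor, in call order
```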

Set interceptor

Now we move on to the set interceptor. This time we had to handle named and indexed properties (elements) differently. These two kinds of properties are not part of the JS language; they are a result of V8's internal optimizations for storing object properties. The initial implementation still falls back to the runtime for elements, which again means crossing the boundary between execution environments. Nevertheless, we achieved improvements of 27% to 438% for cases where the interceptor is defined, at the cost of a 23% slowdown when it is not. The performance drop is due to the additional checks needed to distinguish between indexed and named properties. There are no improvements for indexed properties yet. Here is a graph with full results:
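The following sketch illustrates the point that the named/indexed distinction is invisible at the language level: a set interceptor (trap) receives both kinds of key as strings, and any difference in handling happens inside the engine. The names `store` and `writes` are assumptions made for the example.

```javascript
const writes = [];

const store = new Proxy([], {
  set(t, key, value, receiver) {
    // Record the type and value of each key the interceptor sees.
    writes.push([typeof key, key]);
    return Reflect.set(t, key, value, receiver);
  }
});

store[0] = 'first';   // indexed property (element)
store.label = 'list'; // named property

console.log(writes);       // both keys arrive as strings: '0' and 'label'
console.log(store.length); // 1
```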

Real-world results

Results from jsdom-proxy-benchmark:

The jsdom-proxy-benchmark project compiles the ECMAScript specification (literally: it assembles it into a single HTML file) using the Ecmarkup tool. As of version 11.2.0, jsdom (which underlies Ecmarkup) uses proxies to implement structures such as NodeList and HTMLCollection. We used this as a benchmark to measure performance in a more realistic scenario than our synthetic micro-benchmarks. Averaged over 100 runs, the results are:

Node v8.4.0 (without proxy optimizations): 14277 ± 159 ms
Node v9.0.0-v8-canary-20170924 (with only half of the interceptors optimized): 11789 ± 308 ms
The difference is about 2.5 seconds (14277 - 11789 = 2488 ms), an improvement of ~17%.
Converting NamedNodeMap to use a proxy increased processing time by:
- 1.9 seconds in V8 6.0 (Node v8.4.0)
- 0.5 seconds in V8 6.3 (Node v9.0.0-v8-canary-20170910)

Thanks to TimothyGu for providing these results.

Results from Chai.js:

Chai.js is a popular assertion library that makes heavy use of proxies. We built a kind of real-world benchmark by running its tests on different versions of V8, which showed a gain of more than one second out of roughly four. Averaged over 100 runs:

Node v8.4.0 (without proxy optimizations): 4.2863 ± 0.14 s
Node v9.0.0-v8-canary-20170924 (with only half of the interceptors optimized): 3.1809 ± 0.17 s
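As an illustration of the kind of proxy use that makes a get interceptor (trap) hot in an assertion library, here is a minimal sketch of a wrapper that rejects reads of unknown properties, so that a misspelled assertion name fails loudly instead of silently returning undefined. This is not Chai's actual code: the `strict` helper and the `expectLike` object are hypothetical.

```javascript
// Wraps an object so that reading an unknown string-keyed property throws.
function strict(obj) {
  return new Proxy(obj, {
    get(t, key, receiver) {
      // Symbol keys are let through, since runtimes may probe them internally.
      if (typeof key === 'string' && !(key in t)) {
        throw new Error(`Invalid property: ${key}`);
      }
      return Reflect.get(t, key, receiver);
    }
  });
}

const expectLike = strict({ ok: true });
console.log(expectLike.ok); // true

let message = null;
try {
  expectLike.okk; // misspelled property name
} catch (e) {
  message = e.message;
}
console.log(message); // Invalid property: okk
```

Every property read on such an object goes through the get interceptor, which is why faster interceptors show up directly in the running time of a test suite.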

Approaches used for optimization

We have a standard approach to tackling performance bottlenecks. For the work described in this article, it came down to the following steps:

implement performance tests for the specific sub-feature
add more specification conformance tests (or write them from scratch)
study the original C++ implementation
port the sub-feature to the platform-independent CodeStubAssembler
optimize the code further by writing a TurboFan implementation
measure the performance changes with benchmarks

These steps can be applied to any optimization task you may face.

Source: https://habr.com/ru/post/339718/