
Jarakan

For several months now, a small team of developers and testers has been working on a new ECMAScript / JavaScript engine for Opera. When the current ECMAScript engine, Futhark, was first made public, it was the fastest on the market. Futhark was designed primarily for fast code execution. Traditionally, this has been the right trade-off across the different platforms on which Opera runs.

The web is a constantly changing environment, and tomorrow's advanced web applications will require fast script execution, so Opera is once again joining the race to build the fastest scripting engine.

The name Jarakan (spelled Carakan in Latin script), like the names of the previous engines, is that of a writing system. (Futhark is the Scandinavian runic alphabet; Carakan, also known as Hanacaraka, is the syllabic script of the Javanese language.)
Improvement efforts are focused on three main areas:

Register bytecode



Previous generations of Opera's scripting engines used a stack-based instruction set. This kind of instruction set is built around a stack of values: most instructions take their operands from the stack, process them, and put the result back on the stack. Some instructions simply push values onto the stack or rearrange the values on it. This yields compact bytecode programs and makes bytecode easy to generate. (In general, the stack can be compared to a deck of cards: you can put cards on top and take them off the top, but you cannot take them from the middle.)

The new engine uses a register-based instruction set instead. In a register machine, the dynamically sized stack of values is replaced by a fixed-size block of values called registers. Instead of working only with the top of the stack, each instruction can access any register. Because values no longer have to be copied to and from the top of the stack before they can be used, fewer instructions need to be executed and less data is copied.
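The difference can be illustrated with a toy interpreter for the expression `a + b * c`. This is only a sketch: the opcode names, tuple layout, and both `run_*` functions are invented for illustration and are not Opera's actual bytecode.

```python
# Toy illustration only: neither instruction set is Opera's actual bytecode.

def run_stack(code, env):
    """Interpret stack bytecode; returns (result, instructions executed)."""
    stack, executed = [], 0
    for op, *args in code:
        executed += 1
        if op == "push":            # copy a variable onto the top of the stack
            stack.append(env[args[0]])
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
    return stack.pop(), executed


def run_register(code, registers):
    """Interpret register bytecode; returns (result, instructions executed)."""
    regs, executed = list(registers), 0
    for op, dst, src1, src2 in code:
        executed += 1
        if op == "add":             # operands are read in place, no copying first
            regs[dst] = regs[src1] + regs[src2]
        elif op == "mul":
            regs[dst] = regs[src1] * regs[src2]
    return regs[code[-1][1]], executed


# a + b * c with a=2, b=3, c=4
stack_code = [("push", "a"), ("push", "b"), ("push", "c"), ("mul",), ("add",)]
# Registers r0..r2 hold a, b, c; r3 is a temporary.
register_code = [("mul", 3, 1, 2), ("add", 3, 0, 3)]

stack_result = run_stack(stack_code, {"a": 2, "b": 3, "c": 4})
register_result = run_register(register_code, [2, 3, 4, None])
print(stack_result)     # (14, 5)
print(register_result)  # (14, 2)
```

In this sketch the stack version spends three of its five instructions just moving values to the top of the stack, while the register version reads its operands where they already are.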

Machine code generation



Although the new instruction set lets the engine execute bytecode significantly faster, interpreting bytecode still carries considerable overhead for simple ECMAScript code, such as loops over integers. To deal with this overhead, we have implemented compilation of ECMAScript programs and functions, in whole or in part, into native code.

Machine code generation is based on static type analysis (using a type system richer than the usual ECMAScript one), which is performed to eliminate unnecessary type checks where possible; on speculative specialization (for types that cannot be determined statically); and on a register allocator that makes it possible to generate compact machine code with few register moves and memory accesses.

For ECMAScript code that is well suited to being compiled into machine code, the generated code comes fairly close to hand-written assembly that tries to keep everything in registers.

The register allocator is designed to be architecture-independent, as is the generation of the “final” machine code, which complicates the solutions we build. Support for 32-bit and 64-bit x86 was implemented first, but preliminary work has already begun on generating machine code for other processor architectures, such as ARM.

In addition to generating machine code for ordinary ECMAScript, we also generate machine code that performs matching for simple regular expressions. This greatly improves performance when matching simple regular expressions against long strings. For really long strings, it makes a substring search through a regular expression faster than the same search through String.prototype.indexOf. For short strings, the speed is limited by the overhead of compiling the regular expression.

Code generation for more complex regular expressions is slower and yields a smaller performance gain, because the regular expression engine is already fast enough. The core of the regular expression engine was developed quite recently and debuted in Presto 2.2 (Opera 10 alpha). It is a conventional backtracking regular expression engine, but it uses some tricks to avoid unnecessary backtracking, which makes it faster.

Automatic classification of objects



Another area that brings a large performance gain over our current engine is a different representation of ECMAScript objects. In the new engine, each object is assigned to a class, which holds various information about the object, such as its prototype and the order and names of some or all of its properties. Assigning classes to objects is itself very dynamic, because ECMAScript is a dynamic language, but it is organized so that objects with the same prototype and the same properties are assigned to the same class.

This representation allows objects to be stored compactly, because most of the complex structures that describe an object's properties are stored in the class, where they are shared with all other objects of the same class. In real programs, many objects share one class, which can significantly reduce memory consumption. Most programs that create a lot of objects are expected to have only a few different classes.
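A minimal sketch of this idea follows. The names `ObjectClass`, `Obj`, `with_property`, and the transition table are assumptions made for illustration; the article does not describe how the engine actually implements its classes.

```python
# Hypothetical sketch of the idea; names and layout are not Opera's.

class ObjectClass:
    """Shared description: prototype plus ordered property names."""
    def __init__(self, prototype, names=()):
        self.prototype = prototype
        self.names = tuple(names)
        self.slot_of = {name: i for i, name in enumerate(self.names)}
        self.transitions = {}   # property name -> class with that property added

    def with_property(self, name):
        # Reuse the same successor class for every object that adds `name` next.
        if name not in self.transitions:
            self.transitions[name] = ObjectClass(self.prototype, self.names + (name,))
        return self.transitions[name]


class Obj:
    """An object stores only a class reference and a flat list of values."""
    def __init__(self, root_class):
        self.cls = root_class
        self.values = []

    def set(self, name, value):
        slot = self.cls.slot_of.get(name)
        if slot is None:                      # new property: move to the next class
            self.cls = self.cls.with_property(name)
            self.values.append(value)
        else:
            self.values[slot] = value

    def get(self, name):
        return self.values[self.cls.slot_of[name]]


root = ObjectClass(prototype=None)
points = []
for i in range(1000):
    p = Obj(root)
    p.set("x", i)
    p.set("y", i * 2)
    points.append(p)

distinct_classes = {id(p.cls) for p in points}
print(len(distinct_classes))   # 1
print(points[7].get("y"))      # 14
```

Here a thousand objects built the same way end up sharing a single class, so the name-to-slot table exists once rather than a thousand times.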

The shared property structures also speed up property lookup across different objects. For two objects of the same class, if looking up property “X” on the first object gives the result “Y”, then the same lookup on the second object will also give “Y”. This is used to cache the results of individual property lookups in ECMAScript programs, which greatly improves the performance of code that reads or writes many properties.
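The caching described above can be sketched as a per-site cache keyed by the object's class. `PropertyReadSite` and its counters are hypothetical names used only for this illustration, not part of the engine.

```python
# Hypothetical sketch: a property-read site that caches the slot found for a class.

class ObjectClass:
    """Shared layout: ordered property names mapped to slot indexes."""
    def __init__(self, names):
        self.slot_of = {name: i for i, name in enumerate(names)}


class Obj:
    def __init__(self, cls, values):
        self.cls = cls
        self.values = list(values)


class PropertyReadSite:
    """One place in a program that reads a fixed property name."""
    def __init__(self, name):
        self.name = name
        self.cached_class = None
        self.cached_slot = None
        self.hits = 0
        self.misses = 0

    def read(self, obj):
        if obj.cls is self.cached_class:      # same class => same lookup result
            self.hits += 1
            return obj.values[self.cached_slot]
        self.misses += 1                      # slow path: full lookup, then cache it
        slot = obj.cls.slot_of[self.name]
        self.cached_class, self.cached_slot = obj.cls, slot
        return obj.values[slot]


point_class = ObjectClass(["x", "y"])
objects = [Obj(point_class, [i, i * 2]) for i in range(100)]

site = PropertyReadSite("y")
total = sum(site.read(o) for o in objects)
print(total, site.hits, site.misses)   # 9900 99 1
```

Only the first read performs the full name lookup; the remaining 99 reads compare one class reference and index directly into the value list.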

Performance



So how fast is Jarakan? Using the standard cross-platform bytecode interpreter (without any generated machine code), Jarakan is currently about 2.5 times faster than the ECMAScript engine in Presto 2.2 (Opera 10 alpha). It is worth remembering here that Opera releases builds for many different architectures, so this cross-platform speedup is important in itself.

Machine code generation in Jarakan is not yet ready for full-scale testing, but the few individual tests it is already compatible with run between 5 and 50 times faster, which is very promising.

This translation was edited by zerobrain so that it reads as proper Russian (applause). Taking this opportunity, I would also like to ask the kind person with the nickname enilight to restore my karma.

Source: https://habr.com/ru/post/51935/

