The article on modular loading was the first sign in a series of notes in which I am going to tell you step by step on what principles
YASS is built
on and what were the results of testing for maximum performance of each part of this library. But first things first.
Formulation of the problem
About the simplest: what do we want to achieve? We want, by specifying an arbitrary string of a CSS selector that
conforms to the specification , to output an array of all the elements corresponding to this very string. It seems so far simple.
')
To illustrate the specification, the following examples can be given:
// returns the element with the identifier my_id
querySelectorAll ('# my_id')
// returns all elements with class external
querySelectorAll ('. external')
// returns all paragraphs on the page
querySelectorAll ('p')
However, here we can note one thing: very often we need to select just an element by its identifier or find all the elements with a specific class. These operations are quite common in all JavaScript libraries, so they should be performed as quickly as possible. It is extremely imprudent to run the entire mechanism for analyzing the input selector string simply in the case where we need to return a single element specified using an identifier. Here we can use the principle of lazy programming: “do not do what can not be done” - and to accelerate the work sufficiently enough for the simplest cases.
If you look at modern JavaScript libraries, then everywhere such a check is already done using a regular expression. And then immediately a little trick: the initialization of any regular expression is quite expensive in browsers (although the complexity of the expression begins to affect the elapsed time only if it is long), and it would be reasonable to do without it at all. And when you can just use
indexOf
to check for a substring, you should always use this (in elementary cases).
In the same case, if the regular expression provides correctness and replacing it does not lead to loss of code readability and incomprehensible gain (or even loss) in execution speed, you can replace, for example,
exec
with a bunch of
test
and
charAt
/
substr
: this will increase performance by about 20%. If this section of code is executed in a loop multiple times, this can be quite significant.
In YASS, this problem is solved as follows:
// check if the selector satisfies the simple case
if (/^[\w[:#.[[\w\RL**||=====$/.test (selector)) {
// in the case of a positive result, initialize the variable
// which is responsible for '#', '.' , ':' or '[' at the beginning of the selector
var firstLetter = selector.charAt (0);
...
}
From simple to complex
But let's look at how the overall task of parsing CSS selectors is solved. If we take into account that the selector can be specified in the form
p a.link, form input[type=radio]
, then the logic of its analysis can be written schematically in the form:
- Select the sequence of selectors that are between the commas. Next, we work with each sequence separately. At the output, all sequences are combined into a final array (
sets
). - In the sequence of selectors, we have a set of elementary selectors that are “nested” in each other (for our example,
p a.link
). We need to break the sequence into parts and disassemble each such part, given that the parent elements for the next part will be the selected elements from the previous one. For the "transformation" of the child nodes in the parent (directly the process of growing up turns out) is an array of nodes
. - Each elementary element will have to be parsed using a regular expression to isolate the parts responsible for the identifier, class, and CSS 2/3 modifiers. Disassemble the fastest with
exec
, and then write to the variable parts of the resulting array:
single = regexp.exec (single);
tag = single [1];
id = single [2];
...
- And finally, the third cycle goes through all the parent elements and tries to select from them the child nodes corresponding to the parameters set in the CSS selector.
As we can see, the basic logic of this task includes at least one regular expression (using
indexOf
and
substring
will be much more resource intensive with such complexity) and 3 cycles (which should be made as fast as possible). Recently there was an article about
the fastest methods of sorting elements , I will not list all the possibilities, I will just focus on some aspects.
Brute force array
Let us declare some array
a
, with the elements of which we perform some actions. We need to iterate over all elements strictly ascending (order is important), i.e. just
while(i--)
we cannot use. The most common method now is the usual
for
:
var j = 0,
item,
len = a.length;
for (var j = 0, item = a [j]; item; item = a [j ++]) {
item ++;
}
Naturally, he is 30–40% slower than the following
while
:
var j = 0,
item,
len = a.length;
while (j <len) {
item = a [j ++];
item ++;
}
However, if we need to perform any actions with an array element, then without caching it into a local variable can not do. In this case, the next option with
while
(through checking the existence of elements with an increment) will be even faster by 5–10%:
var j = 0,
item;
while (item = a [j ++]) {
item ++;
}
It is obvious that for all three cycles in
YASS it is used.
If the order of the elements is absolutely not important to us (for example, you just need to find the right one or return
false
), then it will be logical to use the reverse
while
:
while (idx--) {
sets [idx] .yeasss = null;
}
It is this code that is used to reset an element of a flag that marks the state "selected." Let's consider why this flag is needed, and why it should be reset.
Uniqueness of elements
One of the main problems in the selection of elements for the CSS-selector is the correctness of the final set. In the case of
div p
, we may have nested
div
, and if we just iterate through all the parent elements and combine the resulting children, then the final set will contain duplicates. To avoid this kind of error, we need to somehow mark the elements that we have chosen.
This is a standard task, and it is solved, in principle, also standard: in all libraries, a certain property of DOM nodes is brought up, which is responsible for the “selected” state. This approach has several drawbacks (for example, you need to extend this solution to the case of asynchronous samples of elements and understand that every access to elements of the tree is resource-intensive), but in most cases it allows eliminating non-uniqueness of elements.
Schematically present the work of this flag can be as follows.
for (child in children) {
if (! children [child] .yeasss) {
if (last) {
children [child] .yeasss = 1;
}
newNodes = children [child];
}
}
In the process of enumerating the child nodes we need, we check if this descendant has the
yeasss
flag
yeasss
(of course, besides this check we have to make sure that we really need this descendant). Next, we set the flag only if we work with the last link in the chain of “children-parents”, and write the child node to the array of new nodes that will become “parents” at the next iteration of the cycle.
It is clearly seen that the
yeasss
flag will be set only if we work with the last link: it will never be set for intermediate links, therefore we do not extend any restrictions on the parent elements. It is possible that it would be possible to reduce the array of parents at each iteration (in order not to choose the same elements), however, this step will not increase (in the general case, only decrease) the sample performance, therefore this approach is not used.
The flag (described in the previous section) is reset for the final array of elements (after all cycles have been completed) and is necessary both for the correctness of further samples and for preventing memory leaks in IE (advanced properties of elements cause micro leaks in IE6 / 7).
Drawing the line
YASS was created and created to search for and implement the most productive methods for solving a certain range of tasks. It can be used both for educational purposes and for purely practical ones (for example, to compile a set of CSS selectors that are not used on the site — with the help of YASS this is implemented most quickly).
To be continued...