Ragnarök is coming! Or Opera 11.50 on the way

Ragnarök: a Viking browser with a new HTML5 parsing algorithm!

This week we are debuting a Labs build called Ragnarök, which contains our implementation of the HTML5 parsing algorithm. We would like you to test this build. You can download it from the links below.

This week you will see a cool example of HTML5.

The Internet is full of games, HTML5 video players, drag-and-drop whiz-bangs and other HTML5 examples. But here you will see a really interesting example, probably the best this week. Are you ready?
So:

 <b><i>Yo!</b></i>

I see that you are surprised, so let's look at what is going on. The elements are incorrectly nested: in this case the <i> tag should have been closed first. So how do different browsers build the DOM?

We can check this with Opera Dragonfly and its equivalents in other browsers, or with Ian Hickson's DOM viewer.

Internet Explorer 9 and Safari 5 will give the following result:

 <!DOCTYPE HTML> <html><HEAD></HEAD><BODY> <B><I>Yo!</I></B><I></I> </BODY></html>

Opera, Firefox and Chrome, on the other hand, will give this:

 <!DOCTYPE HTML> <html><HEAD></HEAD><BODY> <B><I>Yo!</I></B> </BODY></html>

All browsers recovered from the wrong nesting, but not consistently: note that Internet Explorer and Safari have an extra empty <i> element, while Opera, Firefox and Chrome do not. Which is right? In HTML4, both are, because the HTML4 specification describes only what to do with good markup, not bad, and we know that 95% of the web does not pass validation. Browsers are therefore left to fend for themselves and must decide on their own what to do with bad markup, since the HTML4 specification does not define any error handling.

Even such simple markup already produces different DOMs. Now imagine the result on real pages with tens or hundreds of elements. Writing JavaScript that should work identically in all browsers in the face of such inconsistencies is one of the main causes of hair loss and crying among web developers.
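To see how such a difference can leak into a script, here is a minimal Python sketch that counts elements in the two serialized DOMs quoted above. It relies only on the standard-library `html.parser` module; the names `TagCounter` and `count_tags` are made up for this illustration, and the code stands in for what a page script would do, not for any browser's behaviour.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags by name (HTMLParser lower-cases tag names)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(serialized_dom):
    """Return a Counter of element names found in a serialized DOM string."""
    counter = TagCounter()
    counter.feed(serialized_dom)
    counter.close()
    return counter.counts


# The two serialized DOMs quoted in the article.
DOM_WITH_EXTRA_I = (
    "<!DOCTYPE HTML> <html><HEAD></HEAD><BODY> "
    "<B><I>Yo!</I></B><I></I> </BODY></html>"
)
DOM_WITHOUT_EXTRA_I = (
    "<!DOCTYPE HTML> <html><HEAD></HEAD><BODY> "
    "<B><I>Yo!</I></B> </BODY></html>"
)

print(count_tags(DOM_WITH_EXTRA_I)["i"])     # 2
print(count_tags(DOM_WITHOUT_EXTRA_I)["i"])  # 1
```

A script that assumes "there is exactly one <i> element" would therefore work in one group of browsers and fail in the other.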

Fortunately, there is currently a solution to this problem.

The HTML5 parsing algorithm
The HTML5 specification includes rules for parsing any markup, both valid and invalid. Once all browsers have implemented the HTML5 parsing algorithm, the same markup will produce the same DOM in every conforming browser.
There are two main consequences of this:

JavaScript coders will be cheerful and lush-haired
Users can expect fewer incompatibility issues between their favorite sites and their browser.
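The point of specified error handling is that recovery becomes a fixed rule and not a per-browser guess. The Python sketch below is a toy illustration of that idea only. It is not the HTML5 algorithm, which is far more involved (it tracks formatting elements and has a dedicated "adoption agency algorithm" for mis-nested ones). The rule assumed here is deliberately simple: an end tag closes every open element down to the nearest one with the same name, and an end tag with no open match is ignored. The names `Node`, `TOKEN` and `recover` are hypothetical.

```python
import re

# Toy tokenizer: attribute-less start/end tags, or runs of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []  # Node instances or plain text strings

    def serialize(self):
        inner = "".join(
            child if isinstance(child, str) else child.serialize()
            for child in self.children
        )
        return f"<{self.name}>{inner}</{self.name}>"


def recover(markup):
    """Build a tree with one fixed recovery rule and serialize it again."""
    root = Node("root")
    open_elements = [root]  # stack of currently open elements
    for match in TOKEN.finditer(markup):
        closing, name, text = match.groups()
        if text is not None:
            open_elements[-1].children.append(text)
        elif not closing:
            node = Node(name.lower())
            open_elements[-1].children.append(node)
            open_elements.append(node)
        else:
            name = name.lower()
            # Look for the nearest open element with this name (never the root).
            for index in range(len(open_elements) - 1, 0, -1):
                if open_elements[index].name == name:
                    # Close it and everything opened after it.
                    del open_elements[index:]
                    break
            # No match found: the stray end tag is simply ignored.
    return "".join(
        child if isinstance(child, str) else child.serialize()
        for child in root.children
    )


print(recover("<b><i>Yo!</b></i>"))  # <b><i>Yo!</i></b>
print(recover("<b><i>Yo!</i></b>"))  # <b><i>Yo!</i></b>
```

Because the rule is deterministic, mis-nested and well-nested input end up as the same tree, and any implementation following the same rule would agree on it.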

So is validation a thing of the past?
Absolutely not. It remains a vital quality assurance tool: the fact that the HTML5 parsing algorithm produces a consistent DOM does not mean that it is the DOM you wanted.
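As a rough illustration of the kind of mistake validation catches, the following Python sketch reports end tags that do not match the most recently opened element. It is a minimal nesting check built on the standard-library `html.parser`, not a validator: it knows nothing about void elements such as <br>, optional end tags, or content models. `NestingChecker` and `nesting_errors` are names invented for this example.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Records end tags that do not close the innermost open element."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tag names
        self.errors = []     # (end tag seen, innermost open tag or None)

    def handle_starttag(self, tag, attrs):
        # Simplification: every start tag is assumed to need an end tag.
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            expected = self.open_tags[-1] if self.open_tags else None
            self.errors.append((tag, expected))


def nesting_errors(markup):
    """Return a list of (found_end_tag, expected_end_tag) mismatches."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


print(nesting_errors("<b><i>Yo!</b></i>"))  # [('b', 'i')]
print(nesting_errors("<b><i>Yo!</i></b>"))  # []
```

A parser that silently repairs the first sample hides exactly the mistake this check reports, which is why validation is still worth running.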

Implementation in Opera
Our old HTML parsing algorithm was based on code written 15 years ago. It was constantly patched to keep up with changing standards and with the many ways pages fail to follow the specification. After all those changes, the code began to look like an over-decorated Christmas tree, and adding new features had become very difficult without knocking the whole tree over.

The decision to rewrite the parsing algorithm made it possible to clean up the entire design.

Now we can proudly state that the new Ragnarök parser passes 99.9% of our HTML5 conformance tests, which are based on html5lib. The missing 0.1% will be implemented by the time Ragnarök goes gold. The entire test suite will also be published, so you will be able to see for yourself and try it out on different browsers.

Ragnarök also scores 11 out of 11 (plus 2 bonus points) on the somewhat incomplete (and therefore misleading) html5test.com. The two bonus points are for embedded SVG and MathML in HTML5.

Memory consumption
The main reason we kept the old algorithm for so long is its efficient use of memory when working with bad markup. Instead of duplicating nodes as the HTML5 specification requires, our algorithm had a complex system of pointers that determined which nodes should be duplicated. This saved it from unnecessary memory allocation, but greatly complicated the entire code. Now we have moved to copying the nodes, which requires a bit more RAM. Before the final release we will minimize this side effect, because Opera has always cared about efficient memory use and about running on small devices.

Performance
This is not obvious right now, since this is a technical demo and is not optimized for speed the way release builds are, but better performance is another advantage of the new parser. Since HTML parsing time is relatively small compared to rendering and page loading, the overall gain will not be very noticeable, but every performance improvement is for the better, right?

Grab it while it's hot!
Download links above.

Attention
This is a technical demo, and some things in it may not work. For example, M2 has some problems.

Leave your comments about the parser, or any bug reports, here.

Source: https://habr.com/ru/post/114507/
