I. The essence of the problem.
Requesting HTML is, of course, not among the primary uses of XMLHttpRequest; more often this tool deals with XML, JSON, or plain text.
However, the XMLHttpRequest + HTML combination works well when writing browser extensions that poll, in the background, news sites that offer no mailing-list subscription, RSS, or other convenient API, or that provide such services only with restrictions.
When writing several extensions for Firefox, I ran into exactly this need. Processing HTML received via XMLHttpRequest with regular expressions is a very unreliable and cumbersome approach, and obtaining a DOM from XMLHttpRequest used to be possible only for well-formed XML, so one had to resort to the clever tricks on the developers' website. However, since Firefox 11 it has been possible to retrieve a DOM directly from XMLHttpRequest, and Firefox 12 added timeout handling.
I tried out the new capability by creating mini-indicators of new topics for two small forums, and it turned out to be very convenient (50 lines of code plus the CustomButtons extension, and in five minutes you have a ready-made indicator with timer-driven polling and four states: news, no news, error, and timeout; you can read more here). Everything worked like clockwork.
So I set about removing all the old crutches from my extensions' code and introducing the convenient new parsing. However, a strange problem surfaced with the rutracker.org site (testing was done on the latest nightly build under Windows XP; I apologize in advance for any flaws in the code and wording: I have no formal programming education and my experience in this area is, unfortunately, very limited). The following simplified example times out almost every time (to reproduce it you need to be logged in to the site; why this matters will become clear below):
var xhr = new XMLHttpRequest();
xhr.mozBackgroundRequest = true;
xhr.open("GET", "http://rutracker.org/forum/index.php", true);
xhr.timeout = 10000;
xhr.channel.loadFlags |= Components.interfaces.nsIRequest.LOAD_BYPASS_CACHE;
xhr.responseType = "document";
xhr.onload = function() { alert(this.responseXML.title); };
xhr.onerror = function() { alert("Error!"); };
xhr.ontimeout = function() { alert("Timeout!"); };
xhr.send(null);
The snag lies precisely in parsing the HTML into a DOM: the site serves the page without delay, and, for example, the following code, which does no parsing, runs without a hitch:
var xhr = new XMLHttpRequest();
xhr.mozBackgroundRequest = true;
xhr.open("GET", "http://rutracker.org/forum/index.php", true);
xhr.timeout = 10000;
xhr.channel.loadFlags |= Components.interfaces.nsIRequest.LOAD_BYPASS_CACHE;
xhr.onload = function() { alert(this.responseText.match(/<title>.+?<\/title>/i)[0]); };
xhr.onerror = function() { alert("Error!"); };
xhr.ontimeout = function() { alert("Timeout!"); };
xhr.send(null);
The XMLHttpRequest specification states that when HTML/XML is parsed into a DOM, scripts are not executed, no external resources are loaded, and XSLT is not applied (which monitoring of HTTP activity during the described requests confirms), so no delay can come from those quarters. The only possible catch is in the structure of the DOM itself: for some reason the parsing hangs and produces a pseudo-timeout.
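As a side note, the regex path from the second example can be factored into a small pure helper (a hypothetical refactoring of my own, not code from the original extensions), so the text-based fallback can be swapped in while the DOM parse remains under suspicion:

```javascript
// Hypothetical helper (not from the original extensions): extract the page
// title from raw HTML text, mirroring the regex in the second example above.
// Returns null when no <title> element is found.
function extractTitle(html) {
    var m = html.match(/<title>([\s\S]+?)<\/title>/i);
    return m ? m[1] : null;
}
```

With this, the onload handler can simply call `alert(extractTitle(this.responseText))`, independent of the problematic document parse.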
II. Additional observations.
I then wrote a small script to gather DOM statistics and used it to analyze the problem page.
var doc = content.document;
var root = doc.documentElement;
var text_char = root.textContent.length;
var elm_nodes = doc.evaluate(".//*", root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
var txt_nodes = doc.evaluate(".//text()", root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
var com_nodes = doc.evaluate(".//comment()", root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
var all_nodes = doc.evaluate(".//node()", root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
var max_nst_lv = 0;
var max_nst_lv_nodes = 0;
for (var level = 1, pattern = "./node()"; level <= 50; level++, pattern += "/node()") {
    var elm_num = doc.evaluate(pattern, root, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
    if (elm_num) {
        max_nst_lv = level;
        max_nst_lv_nodes = elm_num;
    }
}
alert(
    text_char + "\ttext characters\n\n" +
    elm_nodes + "\telement nodes\n" +
    txt_nodes + "\ttext nodes\n" +
    com_nodes + "\tcomment nodes\n" +
    all_nodes + "\tall nodes\n\n" +
    max_nst_lv_nodes + " nodes at the maximum nesting level of " + max_nst_lv + "\n"
);
The data turned out to be even more puzzling.
1. The main page of the forum, with JavaScript disabled, has: 49,677 characters in text nodes, 4192 HTML elements, 4285 text nodes, 77 comments, 8554 nodes in total; 577 nodes at the maximum, 25th, nesting level.
2. If you log out of the forum and load the version of the page shown to unauthorized users, you get the following statistics: 47,831 characters in text nodes, 3336 HTML elements, 4094 text nodes, 73 comments, 7503 nodes in total; 1136 nodes at the maximum, 24th, nesting level. The structure is clearly simpler, and if you run the problem code while logged out (that is, against this page for unauthorized users), no timeout occurs.
3. I tried uploading the problem page to a test site and gradually simplifying its structure. For example, if you delete all td elements with class row1 (the forum and subforum headings in the table on the main page) and change nothing else, the statistics become: 20,450 characters in text nodes, 1355 HTML elements, 1726 text nodes, 77 comments, 3158 nodes in total; 8 nodes at the maximum, 25th, nesting level. This page, too, with very rare exceptions, produces no timeouts.
4. The script elements play a very strange role. The front page has 19 of them (in the head and body combined, external and inline). If you delete only these elements, the page stops timing out. If you delete them from the end toward the beginning, you have to delete them all (even if only the first external script in the head is left, the timeouts continue). If you delete them from the beginning toward the end, the timeouts stop after removing the script embedded in the p element with the hidden class forum_desc in the "Rules, Basic Instructions, FAQ" section; the 6 scripts after it can stay, and the timeouts still stop (deleting that one script alone, however, does not solve the problem). If all 19 scripts are replaced with empty script elements, with no code and no src attribute, the timeouts remain. But if those empty elements are replaced with the same number of equally empty style elements, the timeouts disappear immediately.
5. Using a Perl script, I tried to generate a test HTML file with a more or less complex structure (but without script elements). The result was a file of almost 10 megabytes with the following statistics: 9,732,505 characters in text nodes, 25,004 HTML elements, 25,002 text nodes, 1000 comments, 51,006 nodes in total; 1000 nodes at the maximum, 27th, nesting level. The structure seems larger and more complex than that of the problem page, yet it causes no timeouts whatsoever. It became obvious that the cause lies in some elusive combination of the volume, complexity, and specific kinds of elements.
6. It was enough to add script elements to this synthetic page for the timeouts to return (in this complex case I even raised the timeout threshold to a minute, to no avail).
III. Creating an easily reproducible test case.
With the following Perl script, I managed to reach a certain critical minimum of the problematic structure, comparable to the structure of the tracker's front page:
use strict;
use warnings;

open(OUTPUT, '>:raw:encoding(UTF-8)', "test.html")
    or die "Cannot write to test.html: $!\n";
print OUTPUT
    "<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>\n"
    . "<html><head><meta http-equiv='Content-Type' content='text/html; charset=UTF-8'><title>Test</title></head><body>"
    . (("<div class='abcd'>abcd" x 25 . "</div>" x 25) x 10
        . "<script type='text/javascript'>var a = 1234;</script>") x 20
    . "</body></html>\n";
close(OUTPUT);
Page statistics: 20,265 characters in text nodes, 5024 HTML elements, 5022 text nodes, 0 comments, 10,046 nodes in total; 200 nodes at the maximum, 27th, nesting level. Among the elements are 20 of the simplest script elements. The result: 10 timeouts out of 10 attempts.
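The "N timeouts out of 10 attempts" figures were tallied by hand; a small harness along these lines (hypothetical code, not part of the original experiments) could automate the counting. It assumes a probe function that returns a promise resolving to "load", "error", or "timeout":

```javascript
// Hypothetical harness (not from the article): run a request probe n times
// in sequence and tally the outcomes.
function tallyOutcomes(probe, n) {
    var counts = { load: 0, error: 0, timeout: 0 };
    var i = 0;
    function step() {
        if (i++ >= n) return Promise.resolve(counts);
        return probe().then(function (outcome) {
            counts[outcome]++;   // outcome is "load", "error" or "timeout"
            return step();
        });
    }
    return step();
}

// A browser-side probe with the same settings as the examples above
// (responseType = "document" triggers the HTML-to-DOM parse under test):
function xhrProbe(url) {
    return new Promise(function (resolve) {
        var xhr = new XMLHttpRequest();
        xhr.open("GET", url, true);
        xhr.timeout = 10000;
        xhr.responseType = "document";
        xhr.onload = function () { resolve("load"); };
        xhr.onerror = function () { resolve("error"); };
        xhr.ontimeout = function () { resolve("timeout"); };
        xhr.send(null);
    });
}
```

Usage would look like `tallyOutcomes(function () { return xhrProbe("test.html"); }, 10)`, with the resulting counts object reporting how many of the 10 runs timed out.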
Various attempts to simplify the structure or reduce the volume lower the probability of timeouts, but in a rather unpredictable way (the simplifications were not combined: before each one, the file was restored to its original form):
- moving all script elements to the end of the code (nothing else changes and the statistics stay the same): 0 timeouts out of 10 attempts;
- replacing the script elements with span elements with a single attribute and the same text content (without moving them to the end): 0 timeouts out of 10 attempts;
- shortening each script's text by 3 characters: 7 timeouts out of 10 attempts;
- deleting the entire contents of the scripts (only empty tags remain): 6 timeouts out of 10 attempts;
- shortening the text of the div elements to one character: 5 timeouts out of 10 attempts;
- deleting the text of the div elements entirely (yielding a visually blank page): 7 timeouts out of 10 attempts;
- shortening the class attribute of the div elements to one character: 8 timeouts out of 10 attempts;
- removing the class attribute of the div elements: 1 timeout out of 10 attempts;
- reducing the number of script elements to 2 (one in the middle of the code and one at the end): again 10 timeouts out of 10 attempts;
- reducing the number of script elements to 1 (at the beginning of the code): still 10 timeouts out of 10 attempts (but if this element is moved to the end of the code, the timeouts disappear completely);
- halving the number of div elements (and, accordingly, of text nodes) while keeping the maximum nesting level: 3 timeouts out of 10 attempts;
- halving the maximum nesting level (the total number of elements and text nodes stays almost the same, but the number of elements at the maximum nesting level doubles): 7 timeouts out of 10 attempts;
- reducing the maximum nesting level to just 3 (body/div/text or body/script/text) while preserving the total number of elements: 8 timeouts out of 10 attempts.
IV. Preliminary conclusions.
In none of the described cases was any processor overload observed, so there is no reason to blame the hardware for the hangs (or network delays: the code arrives in a fraction of a second, and the browser renders the page in far less time than the timeout). Apparently, XMLHttpRequest allocates some limited resources for parsing HTML into a DOM, and certain combinations of parameters exhaust them. A mysterious role is played by the script elements (which are not even executed), and especially by their position in the code. If this is so, it would be worth increasing those resources and reducing the strange dependence on the type and order of elements, since the problem is by no means contrived and arises in the ordinary course of extension development.
V. What's next.
When I first began analyzing the problem and asked for advice on several sites, an administrator at forums.mozilla.org suggested it was a performance bug and advised filing a report on bugzilla.mozilla.org, in the Core :: DOM section, with a description of a reproducible situation. Back then I had very little data, and even now the data are far from clear. I would therefore be grateful for any thoughts that would help narrow the problem down and state it precisely. Otherwise I will have to translate this whole wall of text into English (which, given my command of the language and of the material, will be very difficult) and post it on bugzilla.mozilla.org as it is, which would, of course, be an inconsiderate use of other people's time.