Author: Anton Raymer
The article is based on a webinar that I gave some time ago. It is aimed primarily at those who do not know how browsers work or who have gaps in their knowledge. Much of it will probably be obvious to anyone who has been in web development for a while. I decided to split the article into two parts. In the first, we look at the general principles of how a browser works. In the second, I focus on several important topics: reflow and repaint, and the event loop.
What is a browser?
A browser is a program that runs in the operating system. Most browsers are written in C++. The main purpose of a browser is to display content from web resources. In most cases the web resource is an HTML page, but it can also be a PDF file, a PNG or JPEG image, an XML file, or another type. Among the huge number of browsers, the most popular are Chrome, Safari, Firefox, Opera and Internet Explorer. We will look at browsers whose engines are open source: Chrome, Firefox and Safari.
What does the browser consist of and how does it work?

The diagram shows the browser modules, each performing its own function. Let's start with the user interface.
The user interface is what the user sees: the address bar, navigation controls, the browser's own menu and so on. Although browser user interfaces are very similar to one another, no standard describes them. Historically, browsers gradually borrowed interface ideas from each other and became more and more alike.
The browser engine module coordinates the interaction between the user interface and the display module and also stores data in memory.
Display module. This module matters most to developers: most of a developer's work deals with it and, as the name suggests, it is responsible for displaying information on the screen.
When we talk about browser engines such as WebKit or Gecko (the first is "under the hood" of Safari and, until 2013, of Chrome; the second powers Firefox), we primarily mean the display module. Below we look at the display module in detail and analyze how it works.
The next module is the network component. It is responsible for requests over the network: it fetches data from external resources and passes it to the display module.
The JS interpreter module is responsible for interpreting and executing scripts. There are several JS engines; the best known are V8 and JavaScriptCore. It is important not to confuse the browser engine with the JS engine that runs in the JS interpreter module.
The next module is the UI backend (the executive part of the user interface). It is responsible for drawing everything on the screen, including the user interface.
The last module is data storage. The browser needs to keep data somewhere, both in memory and on disk. What needs to be stored? For example, the cache and the browser's own settings. Data storage also includes IndexedDB, the browser's built-in database that arrived alongside HTML5.
Display module

The display module receives data from the network module. The data arrives in chunks of 8 KB. Importantly, the display module does not wait for all the data to arrive; it starts processing and displaying it as it comes in. For HTML pages it begins by parsing the HTML (this is a large topic of its own, and I will not dwell on it). The main thing to understand is that parsing produces a DOM tree. When parsing finishes, the DOMContentLoaded event is fired on the document, and a script can handle it. It means that the DOM tree is ready and scripts can work with it. (The load event fires later, once images, style sheets and other resources have also loaded.)
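The idea of processing data as it arrives, rather than waiting for the whole response, can be sketched roughly as follows. This is not how a real HTML parser works; the names `splitIntoChunks` and `IncrementalTagCollector` are invented for the illustration, and only the 8 KB chunk size comes from the text above.

```javascript
// Hypothetical sketch: not a real HTML parser, it only shows chunk-by-chunk processing.
const CHUNK_SIZE = 8 * 1024; // 8 KB, the chunk size mentioned in the text

// Split the "network response" into chunks, as the network module might deliver it.
function splitIntoChunks(data, size = CHUNK_SIZE) {
  const chunks = [];
  for (let i = 0; i < data.length; i += size) {
    chunks.push(data.slice(i, i + size));
  }
  return chunks;
}

// Collects opening tag names as chunks arrive. A tag cut in half by a chunk
// boundary is kept in `pending` until the rest of it shows up.
class IncrementalTagCollector {
  constructor() {
    this.pending = "";
    this.tags = [];
  }

  feed(chunk) {
    const text = this.pending + chunk;
    const lastOpen = text.lastIndexOf("<");
    const lastClose = text.lastIndexOf(">");
    // If the last "<" has no ">" after it, hold that tail back for the next chunk.
    const cut = lastOpen > lastClose ? lastOpen : text.length;
    const complete = text.slice(0, cut);
    this.pending = text.slice(cut);
    for (const match of complete.matchAll(/<([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g)) {
      this.tags.push(match[1].toLowerCase());
    }
    return this.tags.length; // number of tags known so far
  }
}

const collector = new IncrementalTagCollector();
for (const chunk of splitIntoChunks("<html><body><p>Hello</p></body></html>", 16)) {
  collector.feed(chunk); // results are usable after every chunk, not only at the end
}
console.log(collector.tags); // the three tag names: html, body, p
```

The small chunk size of 16 in the usage lines is chosen only so that a tag gets split across chunks in such a short string.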
The DOM tree is the Document Object Model. Broadly speaking, it is the interface through which the browser lets the JS engine work with an HTML document. The display tree (render tree) is built on the basis of the DOM tree and is also an important part of the display module. These two trees, the DOM tree and the display tree, are the most important structures for a developer. The display tree largely repeats the structure of the DOM tree (an example further on shows this more clearly), but there are some differences:
- The display tree contains no hidden elements. An HTML element with display:none set will not be present in the display tree. An element with visibility:hidden, however, will be in the display tree.
- Some elements that are a single node in the DOM tree can be represented by several nodes in the display tree. A striking example is the compound select tag: one node in the DOM tree becomes at least three nodes in the display tree. The first is responsible for displaying the selected item, the second for the drop-down list of possible items, and the third for the arrow.
- Text is represented in the DOM tree as a simple node. The DOM tree does not care what the text says or how many lines it takes up. For the display tree this matters, so the text is turned into several nodes, depending on how many lines it occupies. An example a little later makes this clearer.
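The three differences listed above can be illustrated with a small sketch. It assumes a made-up plain-object model of DOM nodes, a fixed number of characters per line for splitting text, and invented names for the three select parts; real engines use much richer structures and real text measurement.

```javascript
// Hypothetical node shapes:
//   element: { tag, style: { ... }, children: [ ... ] }
//   text:    { text: "..." }
const NON_VISUAL_TAGS = new Set(["head", "script", "style", "meta", "title"]);

// Returns an array of display-tree nodes: a DOM node may produce none, one or several.
function buildDisplayTree(node, charsPerLine = 20) {
  if (node.text !== undefined) {
    // Text: one DOM node becomes one display-tree node per line.
    const lines = [];
    for (let i = 0; i < node.text.length; i += charsPerLine) {
      lines.push({ type: "line", text: node.text.slice(i, i + charsPerLine) });
    }
    return lines;
  }

  const style = node.style || {};
  // display:none (and tags that are never drawn) are left out entirely.
  if (NON_VISUAL_TAGS.has(node.tag) || style.display === "none") return [];

  // visibility:hidden stays in the tree, it is just marked as not visible.
  const rect = { type: node.tag, visible: style.visibility !== "hidden", children: [] };

  if (node.tag === "select") {
    // One DOM node, three display-tree nodes (the part names are invented here).
    rect.children.push({ type: "selected-item" }, { type: "dropdown-list" }, { type: "arrow" });
    return [rect];
  }

  for (const child of node.children || []) {
    rect.children.push(...buildDisplayTree(child, charsPerLine));
  }
  return [rect];
}
```

Returning an array from every call is a deliberate choice in this sketch: it lets a single DOM node map to zero, one, or several display-tree nodes without special cases in the caller.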
The display tree tells the browser what to show on the screen. It holds information about the blocks that make up the page. For simplicity, from here on I will call the nodes of the display tree rectangles, so that they are not confused with HTML blocks.
A display tree is a collection of rectangles that should be shown on the screen. Once the display tree is built, the layout phase follows. At this stage every rectangle is assigned dimensions and coordinates: its width and height, and its position in the browser window. After layout, the display tree is painted, and the user sees the final result. The display module is built differently in each browser, but the overall flow is similar.
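As a rough illustration of the layout and painting phases, here is a minimal sketch. It assumes a block-only world in which every rectangle takes the full width of its parent, children are stacked vertically, and leaf rectangles have a fixed height; the `name` field and the `LEAF_HEIGHT` constant are invented for the example.

```javascript
// Hypothetical block-only layout: not how a real engine computes sizes.
const LEAF_HEIGHT = 18; // assumed height of a leaf rectangle, in px

// Layout: give every rectangle a width, a height and coordinates.
function layout(rect, x, y, width) {
  rect.x = x;
  rect.y = y;
  rect.width = width;
  const children = rect.children || [];
  let offset = 0;
  for (const child of children) {
    layout(child, x, y + offset, width); // each child starts below the previous one
    offset += child.height;
  }
  rect.height = children.length > 0 ? offset : LEAF_HEIGHT;
  return rect;
}

// "Painting" here just lists the rectangles in the order they would be drawn.
function paint(rect, out = []) {
  out.push(`${rect.name} ${rect.width}x${rect.height} at (${rect.x}, ${rect.y})`);
  for (const child of rect.children || []) paint(child, out);
  return out;
}

const page = { name: "root", children: [{ name: "a" }, { name: "b", children: [{ name: "b1" }, { name: "b2" }] }] };
console.log(paint(layout(page, 0, 0, 800)).join("\n"));
```

Note that layout has to finish for the children before the parent's height is known, which is why a change deep in the tree can affect the geometry of its ancestors.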
Let us look at two browser engines: WebKit and Gecko.
WebKit. The display module receives HTML and styles. Parsing the HTML produces a DOM tree. Parsing the CSS produces a tree of style rules (Style Rules). Next comes an important stage called Attachment, which can be translated as "combination". At this point the CSS styles are applied to the DOM tree, which results in a Render Tree. After that the tree is laid out; here this stage is called Layout. Finally, painting takes place.

If you look at Gecko, you will notice that the two schemes are very similar. The main differences are in terminology. Here, too, HTML and CSS are parsed. The result is a DOM tree, which is called the Content Model here. Parsing the styles produces a style tree. The Attachment stage is called the Frame Constructor here, but it is essentially the same thing. Combining the two produces a display tree, which is called the Frame Tree here. Layout is called Reflow, and drawing is called Painting, just as in WebKit.
For simplicity, let us equate some terms:
- Attachment = Frame Constructor = Combination
- Render Tree = Frame Tree = Display Tree
- Layout = Reflow = Layout phase
Example

Here we have tags:
<head>, <p>, <div style="display: none">, <div><img src="…"/></div>

The display module builds a DOM tree, which in this case looks like this. There is a root element (it is always present), called documentElement, which corresponds to the html tag. The tree contains all the tags. Note that the text is represented as [text node]; the DOM tree does not need to know anything more about the text. The Render Tree is built on the basis of this DOM tree.
Example
Display tree. It also has a root element (RenderView), but you can already see the differences between the DOM tree and the display tree. First, there is no head tag, because it is not displayed on the screen. There is no <div style="display: none"> either, only <div><img src="…"/></div>. The text in the display tree is split into two lines and consists of two elements: line 1 and line 2. As I wrote above, the nodes of the display tree will be called rectangles. For clarity, I have drawn them in the illustration.
Example

Every rectangle has a "parent", except for the root element.
The display module also handles script processing.
The order of processing scripts and style sheets
It is important to understand the order in which scripts are processed. Consider the following example, in which I tried to show all the possible ways of including scripts and styles.
<html>
  <head>
    <script src="script1.js"></script>
    <script src="http://site.com/script3.js"></script>
    <script defer src="script4.js"></script>
    <script async src="script5.js"></script>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    ...
    <script src="script2.js"></script>
  </body>
</html>
Script 1. The first thing to know about scripts is that when the HTML parser encounters a script, it stops parsing the rest of the document. In other words, once the parser reaches script 1, the browser knows nothing about what comes next, and until script 1 has been executed the document will not be parsed any further.
At the same time, the browser keeps performing speculative parsing. What does that mean? The browser still looks at what follows the script. If it finds links to external resources that need to be downloaded, it loads them while script 1 is being fetched and executed. This is an optimization.
Even so, script 3 will not be executed until script 1 has been executed. By the time script 1 has run, script 3 may already be fully loaded. Scripts can be placed in both the head and body tags. The difference is that by the time script 2 runs, unlike script 1, almost the entire document has already been parsed.
A script can have attributes such as defer and async. They are similar, but there are differences:
- The defer attribute tells the browser not to wait for the script but to continue parsing the HTML page. Script 4 will be executed only after the entire HTML document has been parsed and the DOM tree has been built.
- The async attribute also tells the browser that it can keep parsing the HTML document while the script is being loaded. The script is loaded in parallel and executed immediately after loading. This means that it could be executed earlier than script 1 if the latter also had the async attribute. In other words, the order in which scripts are included is not preserved in this case.
With defer, script 4 is always executed after script 1. With the async attribute, it is not known when the script will be executed or how much of the document will have been parsed by then.
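The ordering rules described above can be played through in a small simulation. It is a deliberately simplified model and not the HTML specification: it assumes that every download starts at time 0 (thanks to speculative parsing), that executing a script takes no time, and that an async script cannot run before the parser has reached it. The `loadedAt` values are made-up numbers in milliseconds, and `executionOrder` is a name invented for this sketch.

```javascript
// Simplified model of script execution order; all timings are invented.
function executionOrder(scripts) {
  let parserTime = 0; // moment at which the parser reaches the current script
  let seq = 0;        // insertion order, used only to break ties
  const runs = [];
  const deferred = [];

  for (const script of scripts) {
    if (script.mode === "defer") {
      deferred.push(script); // waits for the end of parsing
    } else if (script.mode === "async") {
      // Runs as soon as it is loaded; the parser is not blocked.
      runs.push({ name: script.name, at: Math.max(parserTime, script.loadedAt), seq: seq++ });
    } else {
      // Ordinary script: the parser stops until it is loaded and executed.
      parserTime = Math.max(parserTime, script.loadedAt);
      runs.push({ name: script.name, at: parserTime, seq: seq++ });
    }
  }

  // Parsing is finished: deferred scripts run now, in document order.
  for (const script of deferred) {
    parserTime = Math.max(parserTime, script.loadedAt);
    runs.push({ name: script.name, at: parserTime, seq: seq++ });
  }

  return runs.sort((a, b) => a.at - b.at || a.seq - b.seq).map((run) => run.name);
}

// The scripts from the example page, in document order.
const examplePage = [
  { name: "script1.js", mode: "sync", loadedAt: 30 },
  { name: "script3.js", mode: "sync", loadedAt: 10 },
  { name: "script4.js", mode: "defer", loadedAt: 5 },
  { name: "script5.js", mode: "async", loadedAt: 35 },
  { name: "script2.js", mode: "sync", loadedAt: 50 },
];
console.log(executionOrder(examplePage));
// With these timings: script1, script3, script5, script2, script4
```

Changing only the `loadedAt` of script5.js moves it around in the result, while script4.js always stays after the ordinary scripts, which is the difference between async and defer in this model.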
Styles, unlike scripts, cannot change the document. Scripts can add nodes or tags; styles cannot. Therefore the browser has no need to block further parsing of the document.
There is one small nuance here. Script 1, for example, may work with certain styles and need access to them. That is, if we want to change (or read) some styles that have not been loaded yet when script 1 runs, an error may occur.
Browsers try to take this nuance into account. Firefox, for example, blocks script execution if it finds style sheets that have not yet loaded during speculative parsing, loads the styles, and only then resumes the script. Chrome works in a similar way but is a bit more optimized: it stops a script only if it detects that the script works with styles that have not been loaded yet.
Window layout

Window = Rectangle = Display Tree Node
The window layout method is determined by the following factors:
- Window type (display property).
- Positioning scheme (position and float properties).
- The size of the window.
- External information (image size, screen size).
Window layout is the layout stage of the display tree. I think many of you are familiar with this scheme; it is called the "box model". I will not dwell on it in detail.
The layout of windows takes into account the following factors:
The display CSS property. The two basic types are inline and block; others, such as inline-block and table, appeared later. The difference is that display: block means the width of the rectangle is calculated from the width of its "parent", while display: inline means the width is calculated from its content. If the element contains two words, the rectangle will be exactly as wide as needed to display those words. Inline elements line up one after another, and block elements are stacked one below another.
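The difference between block and inline can be sketched in a few lines. This is a toy model: it assumes a monospace font in which every character is `charWidth` pixels wide, a fixed line height, and no line wrapping; `rectangleWidth` and `flow` are names invented for the example.

```javascript
// Hypothetical helper: real engines measure text with actual font metrics.
function rectangleWidth(display, parentWidth, content, charWidth = 8) {
  if (display === "block") return parentWidth;                  // fills its parent
  if (display === "inline") return content.length * charWidth;  // fits its content
  throw new Error(`display: ${display} is not covered by this sketch`);
}

// Places items in order: inline items side by side, block items on their own line.
function flow(items, parentWidth, lineHeight = 18, charWidth = 8) {
  let x = 0;
  let y = 0;
  return items.map(({ display, content }) => {
    const width = rectangleWidth(display, parentWidth, content, charWidth);
    if (display === "block" && x > 0) {
      // A block always starts on a new line.
      x = 0;
      y += lineHeight;
    }
    const placed = { content, x, y, width };
    if (display === "block") {
      y += lineHeight; // whatever follows a block goes below it
    } else {
      x += width;      // the next inline item continues on the same line
    }
    return placed;
  });
}

console.log(flow([
  { display: "inline", content: "two words" },
  { display: "inline", content: "ab" },
  { display: "block", content: "box" },
], 300));
```

In the output the two inline items share the same `y` and differ in `x`, while the block item starts at `x = 0` on the next line and is as wide as the parent.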
The next things that affect the layout of an element are the position and float properties. Position is static by default, in which case the rectangle takes part in the normal layout flow. There are also position: relative and position: absolute. Position: relative means that space is reserved for the rectangle in the normal layout flow, and the element can then be shifted from that place left, right, up or down with the corresponding properties.
Absolute positioning, which covers position: absolute and position: fixed, means that the rectangle is taken out of the normal layout flow. The other rectangles do not take it into account, and it does not take neighboring elements into account either. Its coordinates are calculated relative to the root element of the page or relative to an ancestor whose position is not static. Its dimensions are also calculated relative to that ancestor. Positioning is also affected by the float property. It means that the rectangle stays in the normal flow but is moved to the leftmost or rightmost position, while all the other rectangles "flow around" it.
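How the positioning scheme changes the final coordinates can be sketched as follows. The node shape (`position`, `left`, `top`, `flowX`, `flowY`, `parent`) is invented for this illustration, only `left` and `top` offsets are handled, and position: fixed and float are left out to keep the sketch short.

```javascript
// Hypothetical node shape: { position, left, top, flowX, flowY, parent }.
// flowX / flowY are the page coordinates the rectangle would get in the normal flow.
function resolvePosition(node) {
  const left = node.left || 0;
  const top = node.top || 0;

  switch (node.position || "static") {
    case "static":
      // Stays exactly where the normal flow put it.
      return { x: node.flowX, y: node.flowY };
    case "relative":
      // Keeps its place in the flow and is shifted from there.
      return { x: node.flowX + left, y: node.flowY + top };
    case "absolute": {
      // Look for the nearest ancestor whose position is not static.
      let ancestor = node.parent;
      while (ancestor && (ancestor.position || "static") === "static") {
        ancestor = ancestor.parent;
      }
      // No such ancestor: coordinates are counted from the page origin.
      const origin = ancestor ? resolvePosition(ancestor) : { x: 0, y: 0 };
      return { x: origin.x + left, y: origin.y + top };
    }
    default:
      throw new Error(`position: ${node.position} is not covered by this sketch`);
  }
}

const container = { position: "relative", left: 10, top: 20, flowX: 100, flowY: 50, parent: null };
const wrapper = { flowX: 100, flowY: 60, parent: container }; // static, so it is skipped
const badge = { position: "absolute", left: 5, top: 5, parent: wrapper };
console.log(resolvePosition(badge)); // counted from the container, not from the wrapper
```

The static wrapper between the badge and the container has no effect on the result, which is the point the paragraph above makes about ancestors whose position is not static.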
To conclude this part: the browser's main thread is an infinite loop that keeps the process running. It waits for events such as reflow and repaint, which are sent to it by the display module, and performs the appropriate actions when they arrive.
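The shape of such a loop can be sketched very roughly as follows. The event objects, the handler table and the rule that a reflow schedules a repaint are assumptions made for the illustration, and unlike a real browser loop this one stops when the queue is empty so that the example terminates.

```javascript
// Hypothetical main loop: takes events from a queue and dispatches them to handlers.
function runMainLoop(queue, handlers) {
  const log = [];
  while (queue.length > 0) {
    const event = queue.shift(); // oldest event first
    const handler = handlers[event.type];
    if (handler) log.push(handler(event, queue));
  }
  return log;
}

const handlers = {
  // Recomputing geometry means the affected area has to be drawn again.
  reflow: (event, queue) => {
    queue.push({ type: "repaint", target: event.target });
    return `reflow ${event.target}`;
  },
  repaint: (event) => `repaint ${event.target}`,
};

console.log(runMainLoop([{ type: "reflow", target: "div" }], handlers));
```

Because handlers can push new events onto the same queue, work triggered by one event is simply picked up on a later turn of the loop.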
In Firefox, the display module runs in a single thread, one for the whole browser. Chrome is a little different: each tab has its own display module and its own thread of execution.
It is important that the network module works in separate parallel threads that are not tied to the display module. The network component can therefore fetch resources regardless of what is happening in the display module. Such a component can usually work with several connections at once and load several files simultaneously. In Firefox, for example, there can be up to six parallel connections for loading content, scripts and so on.
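The effect of a limit on parallel loading can be shown with a small simulation. The download durations are made-up numbers, the function name `scheduleDownloads` is invented, and the default limit of six is simply the figure mentioned above for Firefox; the sketch assumes each file is handed to whichever slot becomes free first.

```javascript
// Simplified simulation of downloads sharing a limited number of parallel slots.
function scheduleDownloads(durations, maxParallel = 6) {
  const freeAt = new Array(maxParallel).fill(0); // when each slot becomes idle again
  return durations.map((duration) => {
    const slot = freeAt.indexOf(Math.min(...freeAt)); // earliest free slot
    const start = freeAt[slot];
    freeAt[slot] = start + duration;
    return { start, end: start + duration };
  });
}

// Seven files of equal size: six start at once, the seventh has to wait.
console.log(scheduleDownloads([10, 10, 10, 10, 10, 10, 10]));
```

With a limit of six, the seventh file starts only when one of the first six has finished, so the number of resources on a page affects the total loading time even when each file is small.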
In the next part we will examine the reflow and repaint events in detail and try to understand how handling them well can speed up an application.