
Better HTTP/2 prioritization for a faster web


HTTP/2 promised to make the web significantly faster, and Cloudflare rolled out HTTP/2 access for all customers long ago. But one HTTP/2 feature, prioritization, has not lived up to expectations. The problem is not that the feature is fundamentally broken, but how browsers implement it.

Today, Cloudflare is introducing a change to HTTP/2 prioritization that gives our servers control over prioritization decisions, and that really does make the web faster.

Historically, the browser has controlled how and when web content is loaded. Today, for all paid plans, we are making a radical change to this model and handing control directly to the site owner. On the Speed tab in the Cloudflare dashboard, customers can turn on Enhanced HTTP/2 Prioritization: it overrides the browser's default prioritization with an improved scheduling scheme, which significantly speeds up the experience for visitors (in some cases we have seen a 50% improvement). With Cloudflare Workers, site owners can go even further and fully customize prioritization for their specific needs.

Current situation


Web pages consist of dozens (sometimes hundreds) of individual resources that the browser loads and assembles into the final displayed content. This includes the visible content the user interacts with (HTML, CSS, images), as well as application logic (JavaScript) for the site itself, advertising, analytics, and marketing tracking beacons. From the user's point of view, the order in which these resources are loaded matters a great deal: it determines when they see the content and when they can interact with the page.
The browser is, in essence, an HTML processing engine that walks through an HTML document and follows its instructions in order, from the start of the HTML to the end, building the page as it goes. Links to style sheets (CSS) tell the browser how to style the content of the page, and the browser delays displaying content until the style sheet has loaded. Scripts on the page can behave in different ways. If a script is marked as “asynchronous” or “deferred”, the browser can continue processing the document and simply run the script when it becomes available. If a script is not marked as asynchronous or deferred, the browser MUST stop processing the document until the script has loaded and run. Such scripts are called “blocking” because they block the browser from continuing to process the document.

An HTML document is divided into two parts. The document head (<head>) comes first and contains style sheets, scripts, and other instructions the browser needs in order to display the content. After the head comes the document body (<body>), which contains the actual content displayed in the browser window (although scripts and style sheets can also appear in the body). Until the browser reaches the body of the document, there is nothing to show the user, and the page remains blank. It is therefore important to get through the head as quickly as possible. If you are interested in the details, the HTML5 Rocks website has an excellent tutorial on how browsers work.

The browser is usually responsible for deciding the order in which to load the various resources it needs to build the page and continue processing the document. In HTTP/1.x, there are limits on how many objects a browser can request from a single server at a time (usually 6 connections and only one resource at a time per connection), so the order of requests is strictly controlled by the browser. In HTTP/2, the situation is completely different. The browser can request all resources at once (or at least as soon as it becomes aware of them), and it gives the server detailed instructions on how to deliver them.

Optimum resource loading order


For most parts of the page loading cycle, there is an optimal order that makes the page available to the user as early as possible (and the difference between an optimal and a non-optimal loading order can reach 50% or more).

As described above, the browser cannot display any content until it has loaded the blocking CSS and JavaScript in the <head> section. At this stage, it is best to use 100% of the connection bandwidth to download the blocking resources one at a time, in the order in which they appear in the HTML. This lets the browser parse and execute each item while the next blocking resource is downloading, which creates an optimal pipeline.



The total download time is the same whether the scripts are loaded in parallel or sequentially, but with sequential loading the first script can be parsed and executed while the second one is still downloading.
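The timing argument can be checked with a little arithmetic. The sketch below is illustrative only: it assumes two equal-size scripts, a link that delivers one unit of data per second, and an even bandwidth split in the parallel case. The function names are made up for this example.

```javascript
// Finish time of each resource when they are downloaded one after another,
// each using the full bandwidth.
function sequentialFinishTimes(sizes, bandwidth = 1) {
  let t = 0;
  return sizes.map((size) => (t += size / bandwidth));
}

// Finish time of each resource when all of them start together and the
// bandwidth is split evenly among the resources still downloading.
function parallelFinishTimes(sizes, bandwidth = 1) {
  const order = sizes
    .map((size, index) => ({ size, index }))
    .sort((a, b) => a.size - b.size);
  const finish = new Array(sizes.length);
  let t = 0;
  let received = 0; // data every still-active resource has received so far
  order.forEach(({ size, index }, k) => {
    const active = sizes.length - k; // resources still sharing the link
    t += ((size - received) * active) / bandwidth;
    received = size;
    finish[index] = t;
  });
  return finish;
}

const scripts = [1, 1]; // two scripts of equal size
console.log(sequentialFinishTimes(scripts)); // [ 1, 2 ]
console.log(parallelFinishTimes(scripts)); // [ 2, 2 ]
```

Both approaches finish at the 2-second mark, but only the sequential one hands the browser a complete first script after 1 second.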

Once the blocking resources have loaded, things get a little more interesting. Here the optimal order may depend on the particular site or even on business priorities (favoring user content versus advertising, analytics, and so on). Fonts are a problem of their own, because the browser only discovers which fonts it needs after it has applied the style sheets to the content about to be displayed. So by the time the browser learns about a font, the text that needs it is already ready to be drawn on the screen. Any delay in loading the font means missing text on the screen (or text displayed in the wrong font).

As a rule, some compromises need to be taken into account:


Given the tradeoffs, in most cases, this strategy works well:


This way, the visible content reaches the user as quickly as possible, application logic is delayed as little as possible, and images that are not yet visible are loaded in a way that lets the layout be completed as early as possible.

Example


To illustrate, we use a simplified product category page from a typical e-commerce site:


For simplicity, we assume that all resources are the same size and that each takes 1 second to download. Downloading all of the resources takes 20 seconds in total, but the order and manner in which they are loaded matter a great deal.



Here's what the optimal resource load will look like in a browser:




Current browser prioritization


All current browser engines implement different prioritization strategies, none of which is optimal.

Microsoft Edge and Internet Explorer do not support prioritization, so they fall back to the HTTP/2 default, which loads everything in parallel and splits bandwidth evenly between all resources. Future versions of Microsoft Edge will switch to the Chromium engine, which may improve the situation. For now, though, in our example the browser is stuck in the page head most of the time, because the images slow down delivery of the blocking scripts and style sheets.



Visually, this makes for a rather painful experience: the user stares at a blank screen for 19 seconds, and then the text takes another second to appear. Be patient when watching the animation below, because for 19 seconds it may look as if nothing is happening on the empty screen (although something is):



Safari loads all resources in parallel, dividing bandwidth according to how important Safari considers each resource (blocking resources such as scripts and style sheets are treated as more important than images). Images load in parallel with each other, but also at the same time as the blocking content.



Although Safari resembles Edge in that everything loads at the same time, giving a larger share of bandwidth to blocking resources lets it display content much earlier:




Firefox builds a dependency tree that groups resources and then schedules the groups either to load one after another or to share bandwidth with each other. Within a group, resources share bandwidth and load simultaneously. Images are scheduled to load after the render-blocking style sheets and load in parallel with each other, but render-blocking scripts and style sheets are also loaded in parallel and so do not get the benefits of pipelining.



In our example, this is a bit faster than Safari, because the images wait until the style sheets have loaded:




Chrome (and all Chromium-based browsers) prioritizes resources as an ordered list. This works very well for blocking resources, which are best loaded in order, but not so well for images. Each image is loaded completely before the next one starts.



In practice, this is very close to the optimal loading scenario; the only difference is that images are loaded one at a time rather than in parallel:




Visual comparison


The visual experience differs considerably between browsers, even though downloading the entire content technically takes the same amount of time:



Server side prioritization


HTTP/2 prioritization is requested by the client (the browser), and it is up to the server to decide what to do with that request. A large number of servers do not support the feature at all, and the rest simply honor the client's request. Another option is for the server to decide on the best prioritization itself, taking the client's request into account.

According to the specification, HTTP/2 prioritization is a dependency tree that requires full knowledge of all in-flight requests in order to prioritize resources relative to one another. This makes incredibly complex strategies possible, but it is hard to implement well on either the browser or the server side (as the differing browser strategies and varying levels of server support show). To make prioritization easier to manage, we have developed a simpler scheme that still has all the flexibility needed for optimal scheduling.
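To make the dependency tree more concrete, here is a minimal sketch of how bandwidth could be divided under the HTTP/2 model, in which each stream names a parent and carries a weight from 1 to 256, and siblings share capacity in proportion to their weights. The record shape (id, parent, weight) and the function name are assumptions for illustration, and the sketch simplifies the real rules: every stream is assumed to have data waiting, and all streams that are not blocked behind a waiting parent are treated as siblings.

```javascript
// Hypothetical stream records: { id, parent, weight }, where parent 0 is the
// root of the tree and weight is an integer from 1 to 256.
function currentShares(streams) {
  const waiting = new Set(streams.map((s) => s.id));
  // A stream is served now only if its parent is not itself waiting to be sent.
  const ready = streams.filter((s) => !waiting.has(s.parent));
  const totalWeight = ready.reduce((sum, s) => sum + s.weight, 0);
  // Streams blocked behind a parent get no bandwidth for the moment.
  const shares = new Map(streams.map((s) => [s.id, 0]));
  for (const s of ready) {
    shares.set(s.id, s.weight / totalWeight);
  }
  return shares;
}

const shares = currentShares([
  { id: 1, parent: 0, weight: 192 },
  { id: 3, parent: 0, weight: 64 },
  { id: 5, parent: 1, weight: 16 }, // depends on stream 1, so it waits
]);
console.log(shares); // stream 1 gets 0.75, stream 3 gets 0.25, stream 5 gets 0
```

Even this toy version needs to know about every outstanding stream before it can assign a single share, which is the bookkeeping burden described above.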

The Cloudflare prioritization scheme consists of 64 priority “levels”, and within each level there are groups of resources that determine how to divide the connection between them:



All resources at a higher priority level are delivered first, before the server moves on to the next lower level.
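The level-by-level rule can be sketched as a simple grouping step. This is an illustration rather than Cloudflare's implementation: the resource records and the function name are hypothetical, and it assumes that a larger number means a higher priority, which this section does not state explicitly.

```javascript
// Group pending resources by priority level and list the levels in the order
// they would be served. Assumption: a larger number is a higher priority.
function sendOrderByLevel(resources) {
  const levels = new Map();
  for (const resource of resources) {
    if (!levels.has(resource.priority)) {
      levels.set(resource.priority, []);
    }
    levels.get(resource.priority).push(resource.url);
  }
  return [...levels.entries()]
    .sort(([a], [b]) => b - a) // highest level first
    .map(([priority, urls]) => ({ priority, urls }));
}

// Hypothetical pending responses on one connection.
const pendingResponses = [
  { url: "/photo.jpg", priority: 30 },
  { url: "/main.css", priority: 50 },
  { url: "/banner.jpg", priority: 30 },
];
console.log(sendOrderByLevel(pendingResponses));
// level 50 (/main.css) is served before level 30 (/photo.jpg, /banner.jpg)
```

How the resources inside one level share the connection is decided by the concurrency groups described next.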

Within a given priority level, there are three different concurrency groups:


In practice, the concurrency group “0” is useful for critical content that needs to be processed sequentially (scripts, CSS, etc.). Group “1” is useful for less important content that can share bandwidth with other resources, but where the resources themselves still benefit from sequential processing (asynchronous scripts, non-progressive images, etc.). The concurrency group “n” is useful for resources that benefit from parallel processing (progressive images, video, audio, etc.).

Default Cloudflare Prioritization


The Enhanced HTTP/2 Prioritization option implements the “optimal” resource loading order described above. The specific priorities used are as follows:



This scheme sends render-blocking resources sequentially, then sends visible images in parallel, and then sends the rest of the page content with some bandwidth sharing to balance application and content loading. The “* If Detectable” caveat is that not all browsers distinguish between the different types of style sheets and scripts, but the result will still be much faster in all cases. A 50% speedup, especially for Edge and Safari visitors, will not be unusual:



Customizing prioritization with Workers


Faster defaults are great, but things get really interesting with the ability to customize prioritization from Cloudflare Workers, so that sites can override the default priority for individual resources or implement prioritization schemes of their own.

If a Worker adds a cf-priority header to the response, the Cloudflare edge servers apply the specified priority and concurrency. The header format is <priority>/<concurrency>, so response.headers.set('cf-priority', "30/0"); gives the response priority 30 and concurrency 0. Similarly, "30/1" sets the concurrency to 1, and "30/n" sets the concurrency to n.
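A minimal Worker along these lines might look like the sketch below. The cf-priority header name and its <priority>/<concurrency> format come from the text above; the helper names, the /critical/ path rule, and the input checks are assumptions made for illustration. No upper bound on the priority is enforced, because the numbering of the 64 levels is not spelled out here.

```javascript
// Build a cf-priority header value such as "30/0", "30/1" or "30/n".
function formatCfPriority(priority, concurrency) {
  if (!Number.isInteger(priority) || priority < 0) {
    throw new RangeError("priority must be a non-negative integer");
  }
  if (![0, 1, "n"].includes(concurrency)) {
    throw new RangeError('concurrency must be 0, 1 or "n"');
  }
  return `${priority}/${concurrency}`;
}

// Copy the response so its headers are mutable, then attach the priority.
function withPriority(response, priority, concurrency) {
  const prioritized = new Response(response.body, response);
  prioritized.headers.set("cf-priority", formatCfPriority(priority, concurrency));
  return prioritized;
}

async function handleRequest(request) {
  const response = await fetch(request);
  // Hypothetical rule: anything under /critical/ gets priority 30, concurrency 0.
  if (new URL(request.url).pathname.startsWith("/critical/")) {
    return withPriority(response, 30, 0);
  }
  return response;
}

// Register the handler only where a Worker-style global scope exists.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    event.respondWith(handleRequest(event.request));
  });
}
```

Keeping the header formatting in its own small function makes it easy to check the value without deploying anything.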

With this flexibility, a site can tune the priority of any resource to suit its needs. For example, it can raise the priority of important asynchronous scripts or of hero images, so that they are delivered before the browser has worked out that they are in view.

To help inform prioritization decisions, Workers also expose the prioritization information requested by the browser in the request object that is passed to the Worker's event listener (request.cf.requestPriority). The incoming priority is a list of attributes separated by semicolons. It looks like this: weight=192;exclusive=0;group=3;group-weight=127.
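Since the value is a plain string, a Worker has to split it before using it. The following is a small sketch of such a parser, based only on the example string shown above; the function name is hypothetical, and the exact set of attributes a browser sends may differ from the example.

```javascript
// Parse a string such as "weight=192;exclusive=0;group=3;group-weight=127"
// into an object. Numeric values become numbers; anything else stays a string.
function parseRequestPriority(raw) {
  const attributes = {};
  if (!raw) {
    return attributes;
  }
  for (const pair of raw.split(";")) {
    const [key, value] = pair.split("=");
    if (!key || value === undefined) {
      continue; // skip empty or malformed segments
    }
    const text = value.trim();
    const number = text === "" ? NaN : Number(text);
    attributes[key.trim()] = Number.isNaN(number) ? text : number;
  }
  return attributes;
}

const requested = parseRequestPriority(
  "weight=192;exclusive=0;group=3;group-weight=127"
);
console.log(requested.weight, requested["group-weight"]); // 192 127

// Inside a Worker, the input would come from the request object:
//   const requested = parseRequestPriority(request.cf.requestPriority);
```

Returning an empty object for a missing value lets the calling code fall back to its own defaults when the browser sent no priority information.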


This is just the beginning.


The ability to customize and control the priority of responses is a basic building block for a lot of future work. We intend to build our own advanced optimizations on top of it, but with Workers support, any site or researcher can experiment with different prioritization strategies. Through the Apps Marketplace, companies can also build new optimization services on top of the Workers platform and offer them to other sites.

If you are on a Pro plan or higher, go to the Speed tab in the Cloudflare dashboard and turn on Enhanced HTTP/2 Prioritization to speed up your site.

Source: https://habr.com/ru/post/452020/

