Application Cache API - new features and problems

The HTML5 standard is gradually becoming a reality. Browsers are starting to support features that were sorely missed, but new opportunities bring new challenges.
This article discusses the Application Cache API, a set of functions that provides advanced caching of web application resources and lets users view previously downloaded sites without an Internet connection. I pay special attention to the practical use of Application Cache and its problems.

Throughout the article, the word “cache” and its derivatives refer to Application Cache and working with it. The standard browser cache is referred to as the “standard cache”.

How Application Cache works in general

At first glance, the mechanism of Application Cache is simple. When a user first visits the site, its pages are stored in the browser. On subsequent visits, and when the Internet connection is lost, the previously saved data is taken from a special “storage”. Site caching is controlled by a special file called the manifest. Some articles use the .appcache extension instead of .manifest; this article uses .manifest, although any name can be used.

Storage

“Storage” is the name browsers conventionally use for the place where Application Cache data is kept. The mechanisms of Application Cache and the standard browser cache differ and therefore require separate storage. The two types of caching differ as follows:

Data in the standard cache can be deleted automatically, without a command from the user or the server, when the cache fills up or when the lifetime specified in the file headers expires. Data in the Application Cache storage can be deleted only at the command of the user or the server.
Only files loaded while viewing a page get into the standard cache. Any file downloaded from the server according to the manifest instructions can be placed in the Application Cache.

These differences significantly affect how caching behaves, which is discussed in detail at the end of the article.

Connecting the .manifest file

First, Application Cache has to be “enabled”. This is done by pointing the browser to the .manifest file with the manifest attribute of the html tag:

 <!DOCTYPE html>
 <html manifest="cache.manifest">

There are two things to note here:

Application Cache only works if the browser treats the page as an HTML5 document, so it is advisable to specify the DOCTYPE. Without it, outdated browser versions may misbehave.
The file name can be arbitrary.

After that, it is enough to serve the .manifest file to the user, but two points are important here:

You need to define the MIME type for the .manifest file. To do this, add a line that defines the new file type to the .htaccess file on your server:
```
 AddType text/cache-manifest .manifest 
```
Mozilla Firefox has a problem with updating the contents of the manifest file, so specifically for it you need to add the following construct to .htaccess:
```
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType text/cache-manifest "access plus 0 seconds"
</IfModule>
```

With these steps done, we can turn to the contents of the .manifest file.

The .manifest file

Here is an example of such a file:

 CACHE MANIFEST
 # Version 1
 images/logo.png

 CACHE:
 css/default.css

 NETWORK:
 index.php

 FALLBACK:
 / /offline.html

The first line contains the text CACHE MANIFEST, which marks the file as a manifest. It is required.
Comments in the file start with the hash symbol #.
Blank lines and lines containing only spaces are ignored and serve only for visual separation.
In the example, the comment is followed by a so-called “explicit entry” (images/logo.png), an analogue of the CACHE section. Explicit entries can be used together with that section or instead of it.
Each rule (resource) must be written on a separate line.
Three sections follow. The section names, like the text CACHE MANIFEST, must be uppercase. Each section has its own function. The order of the sections does not matter. There must be at least one section. Sections may be empty.
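To make the structure more tangible, here is a minimal sketch of how such a file could be split into sections. The parseManifest function and the shape of its result are assumptions made for illustration; this is not the parser a browser uses, and it skips details such as the SETTINGS section.

```javascript
// Hypothetical helper: splits manifest text into its three sections.
function parseManifest(text) {
  var lines = text.split(/\r?\n/);
  if (lines[0].trim() !== 'CACHE MANIFEST') {
    throw new Error('The first line must be CACHE MANIFEST');
  }
  var sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  var current = 'CACHE'; // lines before any header are explicit entries
  for (var i = 1; i < lines.length; i++) {
    var line = lines[i].trim();
    if (line === '' || line.charAt(0) === '#') continue; // blank line or comment
    var header = /^(CACHE|NETWORK|FALLBACK):$/.exec(line);
    if (header) {
      current = header[1];
    } else if (current === 'FALLBACK') {
      // Two paths separated by spaces or tabs
      var parts = line.split(/\s+/);
      sections.FALLBACK.push({ prefix: parts[0], page: parts[1] });
    } else {
      sections[current].push(line);
    }
  }
  return sections;
}

// The example manifest from this article
var example = [
  'CACHE MANIFEST',
  '# Version 1',
  'images/logo.png',
  'CACHE:',
  'css/default.css',
  'NETWORK:',
  'index.php',
  'FALLBACK:',
  '/ /offline.html'
].join('\n');

console.log(parseManifest(example));
```

Note that the explicit entry images/logo.png ends up in the same list as css/default.css, which is exactly the "analogue of the CACHE section" behaviour described above.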
Let's break down each section.

CACHE section

This is the default section: if the .manifest file contains no section headers (only “explicit entries”), its entire content is treated as the CACHE section. The section lists paths or URIs of the resources to cache. It has the following features:

Each entry must name a specific resource (file); a line such as /images/* is not allowed in this section.
The page that declares the .manifest file is cached automatically. You can add the manifest attribute to every page to cache the entire site without listing the pages in the .manifest file. Note: Mozilla Firefox does not cache a dynamically generated page (.php) that has the manifest attribute.

NETWORK section

This section lists the paths to files that must be downloaded from the network. Patterns are allowed here, so you can write the following:

 CACHE MANIFEST

 NETWORK:
 *

This allows all files that have no saved copy to be downloaded. If you need to restrict downloads, you can list specific files (for example, index.php) and directories (for example, /images/*).

FALLBACK section

This section specifies what the browser displays in offline mode when the user requests pages that have not been cached. The rules are written as follows:
requested_resource fallback_resource
The two paths can be separated by spaces or tabs.
This section has the following features:

You can use patterns to specify the requested page. In the original example, the page offline.html is displayed for all non-cached files:
```
CACHE MANIFEST

FALLBACK:
/ /offline.html
```
You can also write rules for specific files or for files with a specific extension (for example, *.html /offline.html), which is very convenient.
Fallback pages are cached automatically, so you do not need to list them in the CACHE section.

Notes on NETWORK and FALLBACK sections

Three points must not be overlooked:

Files that are not listed in the NETWORK or FALLBACK sections and have no saved copy in the Application Cache storage will not be loaded, even if copies exist in the standard browser cache. So if you do not use the general patterns (* or /), remember to list every file the application needs, otherwise it will not be able to work with them. Using the general pattern in one of the sections is enough.
The rules of the NETWORK and FALLBACK sections do not override the rules of the CACHE section (or “explicit entries”): when working without a network, cached data is loaded first, and only then is the policy for the resources in these two sections applied. In practice, this complicates the structure of the application.
The rules in these sections do not affect the page that declares the .manifest file; it is always cached automatically.
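The precedence described in these notes can be modelled with a short function. This is a simplified sketch under stated assumptions: the resolveRequest name, the plain-object manifest and the result strings are all hypothetical, entries are matched as URL prefixes, and a real browser performs additional checks (for example, it serves the fallback page when an online request fails).

```javascript
// Simplified model of how a request is handled once a cache exists.
// `manifest` is a hypothetical plain object, not a browser API.
function resolveRequest(url, manifest, online) {
  // 1. Resources from the CACHE section always come from Application Cache.
  if (manifest.cache.indexOf(url) !== -1) return 'from-cache';

  // 2. Specific NETWORK entries, treated here as URL prefixes.
  var listed = manifest.network.some(function (prefix) {
    return prefix !== '*' && url.indexOf(prefix) === 0;
  });
  if (listed) return online ? 'from-network' : 'failed';

  // 3. FALLBACK rules: use the network if possible, else the fallback page.
  var rule = manifest.fallback.filter(function (r) {
    return url.indexOf(r.prefix) === 0;
  })[0];
  if (rule) return online ? 'from-network' : 'fallback:' + rule.page;

  // 4. The general pattern * in the NETWORK section.
  if (manifest.network.indexOf('*') !== -1) {
    return online ? 'from-network' : 'failed';
  }

  // 5. Not mentioned anywhere: not loaded, even with a working connection.
  return 'blocked';
}

var strict = { cache: ['/css/default.css'], network: ['/index.php'], fallback: [] };
console.log(resolveRequest('/css/default.css', strict, false));
console.log(resolveRequest('/js/app.js', strict, true));
```

The second call illustrates the first note: a file that is missing from every section is not loaded even while the user is online.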

Application Cache API

Once the manifest attribute is declared, we can work with the cache management object for the document, window.applicationCache. The properties, methods and events of this object form the Application Cache API.
Let's look at this object in detail.

Properties and methods of the applicationCache object

The object is accessed as follows:

 window.applicationCache

Now the properties and methods themselves.

 window.applicationCache.status

This property returns a numeric value corresponding to the cache status. The following statuses are possible:

UNCACHED - the cache has not been initialized yet (numeric value 0);
IDLE - no actions are being performed on the cache (numeric value 1);
CHECKING - the .manifest file is being checked (numeric value 2);
DOWNLOADING - resources are being downloaded into the cache (numeric value 3);
UPDATEREADY - the required resources have been downloaded and need to be activated with the swapCache() method (numeric value 4);
OBSOLETE - the current cache is obsolete (numeric value 5).

The state constants are also defined on the applicationCache object (for example, cache.IDLE is 1), so there is no need to remember the numeric values.
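For debugging it can be handy to turn the numeric status back into its name. The statusName helper below is an illustrative assumption, not part of the API; it relies only on the numeric values listed above.

```javascript
// Status names indexed by their numeric value (0..5)
var STATUS_NAMES = ['UNCACHED', 'IDLE', 'CHECKING', 'DOWNLOADING', 'UPDATEREADY', 'OBSOLETE'];

// Hypothetical helper: returns a readable name for applicationCache.status
function statusName(status) {
  return STATUS_NAMES[status] || 'UNKNOWN';
}

// In a browser that supports Application Cache:
if (typeof window !== 'undefined' && window.applicationCache) {
  console.log('Cache status: ' + statusName(window.applicationCache.status));
}
```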

 window.applicationCache.update()

The method initiates the process of checking the .manifest file and the subsequent download of the required resources.

 window.applicationCache.swapCache()

The method switches the browser to the new cached files instead of the old ones. The page is not redrawn; only subsequent requests for cached files are served from the updated cache.
A simple alternative to this method is to reload the page, for example with location.reload().

ApplicationCache Object Events

The following events are associated with the applicationCache object:

cached - occurs when the first cache has been created in the storage;
checking - occurs when a request for the .manifest file is sent;
downloading - occurs when the download of resources into the cache starts;
progress - occurs as each individual resource is loaded;
error - occurs when an error happens while accessing the resource files or the .manifest file;
noupdate - occurs when it is confirmed that the .manifest file has not changed;
obsolete - occurs when it is confirmed that the cache in the storage is outdated and will be deleted;
updateready - occurs when the updated cache has finished downloading.
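A convenient way to get familiar with these events is to subscribe to all of them and watch the order in which they arrive. The sketch below assumes a hypothetical logCacheEvents helper; only addEventListener and the event names come from the API itself.

```javascript
var CACHE_EVENTS = ['checking', 'noupdate', 'downloading', 'progress',
                    'cached', 'updateready', 'obsolete', 'error'];

// Hypothetical helper: calls `log` with the event name for every cache event.
function logCacheEvents(cache, log) {
  CACHE_EVENTS.forEach(function (name) {
    cache.addEventListener(name, function () {
      log(name);
    }, false);
  });
}

// In a browser that supports Application Cache:
if (typeof window !== 'undefined' && window.applicationCache) {
  logCacheEvents(window.applicationCache, function (name) {
    console.log('applicationCache event: ' + name);
  });
}
```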

Application Cache in detail

Everything looks neat and smooth, but how does it work in reality? Let's look into it.
The first thing to consider is the actual step-by-step scheme of how Application Cache works:

When the browser first loads a document with the manifest attribute, it makes an additional request and receives the .manifest file (the checking event). On the first visit, when no cache exists for the document yet, all files are loaded into the cache according to the rules in the .manifest file (the downloading event). Files may be taken from the standard browser cache, which is efficient when only the rules change. As a result, the first version of the cache is created in the storage (the cached event).
When the same document is requested again, it is loaded from the cache, because it was cached earlier (documents with the manifest attribute are always cached). No request for the document is sent to the server at this point. After the document has loaded, the cache management object window.applicationCache checks whether the .manifest file has changed by sending a request to the server (the checking event). The server's response tells whether the file has changed.
If the .manifest file has not changed (the noupdate event), loading is complete and the application continues to work as usual. No requests are sent to the server for cached files.
If the .manifest file has changed, the document loads as usual and the new cache starts downloading in the background (the downloading event). When the download finishes (the updateready event), the page is not redrawn. The updated cache data is used only the next time the document is loaded.
If an error occurs while loading the .manifest file, or the file contains errors such as a link to a nonexistent file (the error event), the page loads as usual from the saved copies of the resources. The existing cache is not affected, and the browser does not apply the failed update.
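The scheme above can be summarized as a small model that returns the sequence of events expected for one page load. The expectedEvents function and its options object are hypothetical, and the model follows only the description in this article: real browsers may fire a different number of progress events.

```javascript
// Hypothetical model of the event sequence for a single page load.
// visit.firstVisit      - true when no cache exists for the document yet
// visit.manifestChanged - true when the .manifest file differs from the cached one
// visit.manifestError   - true when the manifest cannot be loaded or is invalid
// visit.resourceCount   - number of resources to download into the cache
function expectedEvents(visit) {
  var events = ['checking'];
  if (visit.manifestError) return events.concat('error');
  if (!visit.firstVisit && !visit.manifestChanged) return events.concat('noupdate');
  events.push('downloading');
  for (var i = 0; i < visit.resourceCount; i++) {
    events.push('progress'); // one event per downloaded resource
  }
  events.push(visit.firstVisit ? 'cached' : 'updateready');
  return events;
}

console.log(expectedEvents({ firstVisit: true, resourceCount: 2 }).join(' -> '));
console.log(expectedEvents({ firstVisit: false, manifestChanged: false }).join(' -> '));
```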

Cache update

Updating (clearing) the cache deserves separate consideration. The detailed scheme above shows that changing data on the server does not automatically update that data in the cache. The cache can be updated in the following ways:

The user manually deletes the old cache data.
Keep in mind that Application Cache and the standard browser cache are not the same thing. As mentioned above, the two mechanisms differ in their algorithms, functionality and storage. Clearing the standard cache does not affect Application Cache. To clear Application Cache, use the “Storage” clearing functions, a new settings item in modern browsers.
How to clear the storage in Mozilla Firefox is described here: support.mozilla.com/ru/kb/okno-nastrojki-panel-dopolnitelnye#w_ahaalaklai-eaulikioi .
For Opera, see help.opera.com/Windows/11.50/ru/storage.html .
In Google Chrome there is no simple way to clear this cache; use the chrome://appcache-internals/ management page.
Safari clears this cache together with the standard browser cache. See its help.
Changing the .manifest file. Any change triggers a download of all cache resources and a subsequent cache update. The changes take effect after the next page reload. For example, it is enough to change a few characters in a comment (here we changed the version in our original example):
```
CACHE MANIFEST
# Version 2
```
Programmatic update using the Application Cache API. This method is covered in the example later in the article.

Only the last two methods are suitable for real projects, and they would be quite convenient if it were not for the “aggressive caching” of Mozilla Firefox.
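As a rough illustration of the programmatic route, the sketch below asks the browser to re-check the manifest and switches to the new cache once it has been downloaded. The checkForUpdate name and the reload callback are assumptions made for the example; update(), swapCache() and the updateready event are the API parts described earlier.

```javascript
// Hypothetical helper. `cache` is window.applicationCache in a browser;
// `reload` is called once the new cache has been swapped in.
function checkForUpdate(cache, reload) {
  cache.addEventListener('updateready', function () {
    cache.swapCache(); // switch to the freshly downloaded cache
    reload();          // redraw the page with the new resources
  }, false);
  try {
    cache.update();    // re-request the .manifest file and compare it
    return true;
  } catch (e) {
    // update() throws when no cache exists for this document yet
    return false;
  }
}

// In a browser that supports Application Cache:
if (typeof window !== 'undefined' && window.applicationCache) {
  checkForUpdate(window.applicationCache, function () {
    window.location.reload();
  });
}
```

Wrapping update() in try/catch matters on the very first visit, when the document has no cache yet and the call fails.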

Hidden update problem

Mozilla Firefox has a peculiar cache update mechanism known as “aggressive caching”. The problem shows up when standard caching and Application Cache are combined, and they have to be combined if we want caching in older browsers.
When Firefox updates Application Cache, it also takes data from the standard cache without checking for changes on the server. As a result, stale data from the standard cache can end up in the updated Application Cache: the update is performed, but the cache contains old versions of the files. This calls all the advantages of Application Cache into question.
The correct ways to update the cache in this situation are:

changing the file name;
a 2-step update.

In the first case, we simply rename the changed file everywhere it is mentioned, including the .manifest file. A file with the original name must no longer exist (!). As a result, the next change check fails to load the nonexistent file, and the cache resources are loaded from the server.
If renaming the file is not an option for some reason, use the 2-step cache update.
In the first step, we remove the modified file from the CACHE section of the .manifest file and set an expiration date in that file's headers:

 header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
 header('Cache-Control: no-store, no-cache, must-revalidate');
 header('Pragma: no-cache');

As a result, for active users the file is updated in the standard cache and removed from Application Cache.
In the second step, we put the modified file back into the CACHE section, and it is cached successfully. Unfortunately, this update reaches only active users, so prefer renaming files where possible.
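Renaming is easier to keep consistent when the new name is derived from a version number, for instance in a build script. The sketch below is one possible naming scheme, not a convention required by Application Cache; versionedName and renameInManifest are hypothetical helpers.

```javascript
// Hypothetical naming scheme: css/default.css -> css/default.v2.css
function versionedName(path, version) {
  var dot = path.lastIndexOf('.');
  var slash = path.lastIndexOf('/');
  // No extension in the file name itself: append the version at the end
  if (dot <= slash + 1) return path + '.v' + version;
  return path.slice(0, dot) + '.v' + version + path.slice(dot);
}

// Replaces one resource line in the manifest text with its new name.
function renameInManifest(manifestText, oldPath, newPath) {
  return manifestText.split('\n').map(function (line) {
    return line.trim() === oldPath ? newPath : line;
  }).join('\n');
}

var oldName = 'css/default.css';
var newName = versionedName(oldName, 2);
console.log(renameInManifest('CACHE MANIFEST\nCACHE:\n' + oldName, oldName, newName));
```

The same new name also has to be used in every page that references the file, and the file with the old name has to be removed from the server.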

Domain Binding

Another important point is that the cache is bound to the domain, not to the page that declares the .manifest file. Cached resources are used for all pages of the domain, even those that do not reference the .manifest file. This is quite convenient: you can define the cached resources on the start page and not worry about caching them on other pages.
If different pages of the same domain use different .manifest files, the files override each other. Only the resources listed in the most recently loaded .manifest file are kept in the cache.
Using multiple .manifest files can also create a situation where the saved pages cache each other cyclically, which leaves the application in a non-working state. Given the above, using several .manifest files is impractical and risky, although it is possible.
Domain binding also means that Application Cache works in frames, using the cache of the frame's domain. Game developers should appreciate this.

Work offline

Another undoubted advantage of Application Cache is the ability to work offline. Imagine that the client's network has temporarily gone down, yet they can still work with your resource.
But not everything is smooth here either. Recall one feature of Application Cache: the page that declares the .manifest file always gets into the cache. As a result, the page is loaded according to the CACHE section, not FALLBACK. In other words, the application behaves as if there were a connection. So how can we tell that the connection is missing?
For this, the HTML5 specification defines two events, online and offline, which fire when the connection is established and lost, respectively. As Habr user bolk pointed out, these events work in Safari 6 and Chrome 21. In other browsers they are most likely not implemented.
You can also use the onLine property of the navigator object:

 window.navigator.onLine

This property returns true if there is a connection and false if there is none. So during page initialization you need to check this property and build the subsequent logic on its value. Currently, the navigator.onLine property is supported by Mozilla Firefox 2, Internet Explorer 4 and later versions.
Usage example:

 <!DOCTYPE HTML>
 <html lang="ru" manifest="cache.manifest">
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
   <title>Application Cache Test</title>
 </head>
 <body>
 <style>
   html {font-family: "DejaVu Sans", "Geneva CY", "Verdana"; background: #FFFFFF; overflow: hidden; border: 0;}
   body {width: 880px; height: 600px; position: relative; background: #999999; margin: 20px auto; box-shadow: 0 0 15px 10px #999999; -webkit-box-shadow: 0 0 15px 10px #999999; -moz-box-shadow: 0 0 15px 10px #999999;}
   /* Progress bar */
   .progressbar {display:none; position:absolute; left:0px; top:0px; width:880px; height:600px; background-color:#333333; z-index:256; border:1px solid #333333;}
   .progressbar #progresstext {position:absolute; left:0px; top:200px; width:880px; color:#66FF00; text-align:center; font-size:36px; text-shadow:0 0 0.8em #AAFF00, 0 0 0.8em #AAFF00;}
   .progressbar #progress {position:absolute; left:100px; top:300px; width:600px; height:40px;}
   /* Main content area */
   .flash {position:absolute; left:0px; top:0px; width: 880px; height: 600px; background-color:#666666; z-index:51; border: 1px solid #333333;}
 </style>
 <!-- Main content area -->
 <div id="flash" class="flash"></div>
 <!-- Progress bar -->
 <div id="progressbar" class="progressbar"><div id="progresstext"></div><progress id="progress"></progress></div>
 <script type="text/javascript" src="js/jquery-1.5.min.js"></script>
 <script type="text/javascript">
   // Progress bar state
   var progress_value = 0;
   var progress_max = 1;
   $(function() {
     // Check the connection
     if (navigator.onLine) { alert('Connection is available'); } else { alert('No connection'); }
     // Application Cache management object
     var cache = window.applicationCache;
     if (cache) {
       // First cache created: hide the progress bar
       cache.addEventListener('cached', function(e) {ProgressHide();}, false);
       // Download started. progress_max is the number of files, set by hand
       cache.addEventListener('downloading', function(e) {ProgressShow(); progress_max = 3;}, false);
       // One more resource loaded: advance the progress bar
       cache.addEventListener('progress', function(e) {ProgressChange();}, false);
       // Update downloaded: hide the progress bar, swap the cache, reload the page
       cache.addEventListener('updateready', function(e) {ProgressHide(); window.applicationCache.swapCache(); location.reload();}, false);
     }
   });
   // Keyboard handler
   $(document).keyup(function(event){
     // Pressing shift+1 starts a cache update
     if (event.shiftKey && event.keyCode == 49) { window.applicationCache.update(); }
     return false;
   });
   //------------------- Progress bar functions ----------------//
   function ProgressShow() { $("#progressbar").show(300); progress_value = 0; }
   function ProgressChange() { progress_value++; $("#progress").attr({max: progress_max, value: progress_value}); }
   function ProgressHide() { $("#progressbar").hide(300); }
   //-------------------------------------------------------------------------//
 </script>
 </body>
 </html>
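The connection check from this example can also be isolated into a small helper that reports the initial state and then follows the online and offline events where the browser supports them. The watchConnection name and its callback are assumptions made for illustration.

```javascript
// Hypothetical helper. `win` is the browser's window object;
// `onChange` receives true when online and false when offline.
function watchConnection(win, onChange) {
  onChange(win.navigator.onLine); // initial state during page initialization
  win.addEventListener('online', function () { onChange(true); }, false);
  win.addEventListener('offline', function () { onChange(false); }, false);
}

// In a browser:
if (typeof window !== 'undefined') {
  watchConnection(window, function (isOnline) {
    console.log(isOnline ? 'Connection is available' : 'No connection');
  });
}
```

In browsers that do not fire the online and offline events, only the initial call happens, so the application would still need to re-check navigator.onLine before network operations.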

Possibilities and problems of using Application Cache

The information above is enough to understand how Application Cache works, so we can draw the first conclusions about its possibilities and difficulties.
The features and benefits of Application Cache include:

Faster page loading, because no time is spent downloading files that are already cached.
Lower server load, because instead of numerous requests that check whether client-cached resources have changed, there is only one request for the .manifest file.
Ability to cache files according to predefined rules.
Permanent file storage, with tight change control.
Ability to work with the application in offline mode.
The cache works in frames.

Problems with using Application Cache include:

On the first load, as soon as the landing page is displayed, the background download of the cached documents starts automatically. On a slow connection, actively working with the application right after the first load can turn into a “nightmare”, especially if the files are large and numerous.
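The cache is not updated automatically. To update it, the .manifest file has to be changed.
Files that are not listed in the NETWORK and FALLBACK sections and have no saved copy in the storage are not loaded.
Using several .manifest files on one domain is impractical and risky.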
The rules of the FALLBACK section for offline work do not override the rules of the CACHE section. In practice, this complicates the structure of an application that has to work offline.
The page that declares the .manifest file is cached automatically, which makes it hard to replace in offline mode.

You should also pay attention to the following nuances:

Resources received by a POST request are not cached and are not taken from the cache.
By default, the size of the cached data is limited to 50 MB. For older versions of Mozilla Firefox and Google Chrome, the limit is 5 MB. In Opera, the maximum cache size can be increased in the settings.
The .manifest file can use any encoding, but to avoid problems with Cyrillic characters you should use UTF-8.
Application Cache is not supported by every browser. It is available starting from Google Chrome 4.0, Mozilla Firefox 3.5, Internet Explorer 10, Opera 10.6, Opera Mobile 11 and Safari 4.
Application Cache stores the resources listed in the manifest even if they are served with the header Cache-Control: no-store.

What is the result?

Despite a whole bunch of usage problems, Application Cache is a powerful optimization mechanism for high-load systems. It will be especially useful for games and interactive applications that use FullAjax. Proper use of Application Cache not only speeds up page loading for users and reduces traffic, but also reduces the load on your server, because cached pages are loaded from local storage rather than from the server.

I hope that this article will allow you to discover the full power of Application Cache.

Source: https://habr.com/ru/post/151815/
