
The author did not come across a structured overview of the main stages of caching, so this article shares accumulated experience in the field, brings together the basic information on the subject, and weighs the pros and cons of each type of caching.
First of all, I would like to stress that caching is one of the most important components of any project. It is often the only way to do more, and do it faster, with limited resources. And, as you know, resources are always limited, on the server and on the user's side alike.
The main problem caching solves is responding quickly to requests that would otherwise go to the main storage systems, and speeding up the processing of incoming and outgoing structured data.
Imagine that you need to deliver information quickly, but access to the data is extremely slow. Or another situation: access is fast, but there is too little memory, the channel bandwidth is insufficient, or CPU and disk limits make the task difficult. In such cases caching is the way out.
Types of caching
A cache is an intermediate buffer in which data is stored. Thanks to caching, a site page does not have to be regenerated for every user. Caching lets you serve large amounts of data as quickly as possible with limited resources (server and user).
Keep in mind that data can be processed both on the client side and on the server. Server-side processing is centralized, which has a number of clear advantages (especially for the support team).
There are several types of caching. Let us consider each type, its features, and recommendations for use:
1. Browser caching or client caching
Browser caching instructs the browser to use the cached copy it already has. It relies on the fact that on a repeat visit the server can answer with the 304 Not Modified status, and the page or image itself is loaded from the user's local cache. This saves traffic between the visitor's browser and the site's hosting, so the pages of your site load faster.
1.1 Caching files and images
Browser caching is best suited for sites with a large number of images: an image is not downloaded every time the site is opened, but is simply taken from the browser's cache.
This is the first level of caching, which consists of returning the “Expires” header and the “304 Not Modified” status. Caching for 2 weeks is usually the most effective.
However, there is an important nuance: if an image on the site changes, the browser will not find out immediately, only after the cached copy expires or the user clears the browser cache. This works poorly if a file changes constantly and you always need to serve its current version.
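The headers described above can be sketched as follows. This is a minimal illustration in Python, not tied to any particular web server. The function name static_cache_headers is hypothetical, and the Cache-Control header is an addition that the text does not mention (it is the modern counterpart of Expires).

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict

TWO_WEEKS = timedelta(weeks=2)  # the lifetime recommended in the text


def static_cache_headers(now: datetime, lifetime: timedelta = TWO_WEEKS) -> Dict[str, str]:
    """Build response headers that let a browser reuse a static file.

    `now` is assumed to be a timezone-aware UTC datetime.
    """
    expires_at = now + lifetime
    return {
        # Relative lifetime in seconds, understood by HTTP/1.1 clients.
        "Cache-Control": f"public, max-age={int(lifetime.total_seconds())}",
        # Absolute expiry date in the HTTP date format.
        "Expires": format_datetime(expires_at, usegmt=True),
    }


headers = static_cache_headers(datetime(2024, 1, 1, tzinfo=timezone.utc))
```

Until the date in Expires passes, the browser does not ask the server about the file at all, which is exactly why a changed image is noticed late.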
1.2 HTTPS cache
This is done with special headers such as Strict-Transport-Security. The header tells the browser to always use https for the given domain. The browser holds on to this state stubbornly: if you cancel this kind of cache, the browser will keep trying to load the page via https for quite a long time, regardless of the headers the site currently sends over plain http.
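As a minimal sketch, assuming the header in question is Strict-Transport-Security (HSTS), its value could be built like this. The helper name hsts_header and the one-year lifetime are illustrative choices, not taken from the article.

```python
from typing import Tuple

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # illustrative lifetime, an assumption


def hsts_header(max_age_seconds: int, include_subdomains: bool = False) -> Tuple[str, str]:
    """Return the (name, value) pair of an HSTS response header."""
    if max_age_seconds < 0:
        raise ValueError("max-age must not be negative")
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return "Strict-Transport-Security", value


enable = hsts_header(ONE_YEAR_SECONDS, include_subdomains=True)
revoke = hsts_header(0)  # max-age=0 asks the browser to forget the rule
```

The browser remembers the rule for max-age seconds after it last saw the header. To cancel it, the site has to keep answering over https and send max-age=0 until returning visitors have picked it up.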
1.3 CA cache
The so-called certificate authority staple (OCSP stapling).
This type of caching is effectively mandatory if you do not want your site's users to wait while the certificate authority (a server responsible for the authenticity of your certificate) processes a request from the user's browser and confirms that your certificate is indeed valid.
1.4 Caching pages
Once a page has been generated, you need to keep track of whether it is still current. To do this, use a server-side cache that tracks the modification time of the individual parts of the page (if the page is assembled from dynamically generated blocks). With this approach, every server response carries a header stating when the page last changed (Last-Modified), and the user's browser sends that time back (If-Modified-Since) when it requests the page again. On receiving such a header, we can check the current state of the page (perhaps even render it), and instead of the page content return “304 Not Modified”, which tells the user's browser that it can display the page from its own cache.
Of course, you can send such headers without tracking changes in a server-side cache, but then most users will receive updated content rather late. In that case the browser only polls the server for updates from time to time, and the frequency and rules are set by each browser's developers, so you cannot count on your users getting updates on time.
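The exchange of modification times can be sketched like this. It is a simplified illustration in Python; the function respond and its parameters are hypothetical, and page_modified is assumed to be a timezone-aware UTC datetime supplied by the server-side change tracking.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Callable, Dict, Optional, Tuple


def respond(
    page_modified: datetime,
    if_modified_since: Optional[str],
    render: Callable[[], bytes],
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a page request."""
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = page_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            client_copy = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy = None  # unparsable header: treat as a first visit
        if (
            client_copy is not None
            and client_copy.tzinfo is not None
            and last_modified <= client_copy
        ):
            # The browser's copy is still current: send no body.
            return 304, headers, b""

    return 200, headers, render()
```

On a 304 the page is not rendered at all here, so the saving is both in traffic and in server work.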
As a rule, the cache is divided by user type:
- for authorized users;
- for unauthorized users.
This split exists because content is unique for each authorized user and shared among guest users. On most sites an unauthorized user cannot change the contents of the site, and so has no effect on what is shown.
The browser cache saves traffic and page load time. But the savings only begin after the user has visited the page at least once, which means the load on server resources will decrease, but not significantly.
2. Server caching
Server caching covers every type of caching in which data is stored on the server side. Client browsers have no direct access to this data. The cache is created and stored on a one-to-many basis (the many, in this case, being client devices).
2.1 Caching the entire page
This is the most efficient cache. What makes it interesting? Its biggest advantage is that the page is returned almost at the moment of the request. As a result, even the weakest server can process millions of requests at memory speed with little CPU use.
Probably everyone has dreamed of a site that responds at the speed of a "ping" or faster.
But this type of cache has its drawbacks: for example, you cannot cache pages for an authorized user, or for any user whose page content depends on variables of the current user.
Use this cache when the server knows every input that determines the page and those inputs are static, such as the URI, a GET request without additional parameters, and an unauthorized user. In effect, this is the ideal page state for guest users. Keep in mind that with such caching the site or application must always handle identical incoming requests the same way and return identical responses. Every application or site has such states; you only need to identify them and apply a cache to them.
Whole-page caching is most often used in emergencies. The page cache is stored for a predetermined time (2 minutes or more), during which the server's responses are identical (do not let the browser cache them).
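A whole-page cache for guests might look roughly like this. It is an in-memory sketch in Python: the class GuestPageCache, its parameters, and the injected clock are assumptions made for illustration, and only the two-minute lifetime comes from the text.

```python
import time
from typing import Callable, Dict, Optional, Tuple

PAGE_TTL_SECONDS = 120  # "from 2 minutes" in the text


class GuestPageCache:
    """Caches fully rendered pages, keyed by URI, for guest users only."""

    def __init__(self, ttl: float = PAGE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}  # uri -> (stored_at, html)

    def get_page(self, uri: str, user_id: Optional[int],
                 render: Callable[[str], str]) -> str:
        if user_id is not None:
            # Authorized users see personal content: always render.
            return render(uri)
        now = self._clock()
        entry = self._pages.get(uri)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh cached copy, no rendering at all
        html = render(uri)
        self._pages[uri] = (now, html)
        return html
```

A real deployment would usually put this in front of the application (for example in a reverse proxy or a shared memory store) so that a cache hit never reaches the application code.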
2.2 Caching the results of compiling php files
There is plain compilation of the code, and there is compilation with optimization (script substitution). The best-known examples:
- APC;
- XCache;
- HipHop Virtual Machine (compilation with script substitution).
Both types of caching can be used in a project, but each has its own nuances that must be considered when writing code.
2.3 Caching individual page blocks
This is perhaps the most interesting, but also the most complex type of caching. It can be very effective, and it is the easiest example for explaining the principles of caching in general.
You need to keep track of: the state of the tables, the state of the user's session, whether caching should be turned off for POST or GET requests (HTTP query), dependence on the current address, and whether the cache persists (when the previous conditions change) or is adjusted dynamically.
Caching individual page blocks suits better than other types if you need, for example, to reduce the number of database queries from real (authorized) users. With properly defined dependencies, it will work even more efficiently than all the types of caching described below.
Why is this kind of caching so important? Because expanding a pool of database servers is a much harder task than expanding the pool of servers for the php part of the site. Moreover, cache state conflicts in php are much easier to resolve than conflicts when working with multiple databases.
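One way to express the dependency on table state is a version counter per table. This is a sketch of that idea in Python, not the author's implementation; the names BlockCache, table_changed, and get_block are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


class BlockCache:
    """Caches rendered page blocks until a table they depend on changes."""

    def __init__(self) -> None:
        self._table_versions: Dict[str, int] = {}  # table -> write counter
        self._blocks: Dict[str, Tuple[Tuple[int, ...], str]] = {}

    def table_changed(self, table: str) -> None:
        """Call after every INSERT/UPDATE/DELETE on `table`."""
        self._table_versions[table] = self._table_versions.get(table, 0) + 1

    def get_block(self, key: str, depends_on: Sequence[str],
                  build: Callable[[], str]) -> str:
        # Snapshot of the versions of every table this block reads.
        current = tuple(self._table_versions.get(t, 0) for t in depends_on)
        cached = self._blocks.get(key)
        if cached is not None and cached[0] == current:
            return cached[1]  # nothing it depends on has changed
        html = build()  # the only place that queries the database
        self._blocks[key] = (current, html)
        return html
```

For per-user blocks the key would also include the session or user identifier, so that one user's block is never shown to another.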

2.4 Caching php based on unshared resources
It is best suited for normalizing queries, holding data fetched from shared resources, and keeping internal variables that php code accesses several times while a page is generated.
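In its simplest form this is memoization that lives only as long as one request. A minimal sketch in Python, where the class name RequestLocalCache is hypothetical and one instance is assumed to be created per page generation:

```python
from typing import Any, Callable, Dict, Hashable


class RequestLocalCache:
    """Holds values in process memory for the duration of one request."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}

    def get(self, key: Hashable, load: Callable[[], Any]) -> Any:
        # Load from the slow or shared source only on first access.
        if key not in self._values:
            self._values[key] = load()
        return self._values[key]
```

Nothing here is visible to other processes, so there are no consistency conflicts to resolve, but the benefit is also limited to repeated reads within the same page.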
2.5 Caching php based on shared resources
Use this caching to store serialized data, for example: a configuration file, table state, file system listings.
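As an illustration, a directory on disk can stand in for the shared resource. This is a sketch in Python with hypothetical names (SharedFileCache); a production setup would more likely use shared memory or a dedicated cache server, which the article does not name.

```python
import hashlib
import json
import os
import tempfile
from typing import Any


class SharedFileCache:
    """Stores JSON-serialized values where every worker process can read them."""

    def __init__(self, directory: str) -> None:
        self._dir = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key: str) -> str:
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self._dir, name + ".json")

    def set(self, key: str, value: Any) -> None:
        # Write to a temporary file, then rename, so readers never
        # see a half-written entry.
        fd, tmp_path = tempfile.mkstemp(dir=self._dir)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(value, handle)
        os.replace(tmp_path, self._path(key))

    def get(self, key: str, default: Any = None) -> Any:
        try:
            with open(self._path(key), encoding="utf-8") as handle:
                return json.load(handle)
        except FileNotFoundError:
            return default
```

Because the data crosses a process boundary, it has to be serialized, which is why this level suits configuration and listings rather than live objects.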
2.6 mysql caching based on query cache
This is a fairly well-known and widely covered topic. Nevertheless, I would like to look at the specifics of working with timestamps and at how to avoid constantly resetting the query cache.
You have surely run into the situation where you need to show new materials whose publication date has already passed according to the current timestamp. Simply put:
WHERE show_ts <= UNIX_TIMESTAMP()
If you use an ever-changing timestamp in such queries, the SQL cache will be not just useless but harmful, because cached queries pile up whose data is already outdated at the moment the cache entry is created.
We propose the following way out:
As a rule, materials are published at fixed points in time, for example at 00:00. All you need is a query that finds the latest publication timestamp in the table that is not later than the current time.
Something like:
SELECT SQL_NO_CACHE MAX(show_ts) ... WHERE show_ts <= UNIX_TIMESTAMP();
Yes, this query itself will not be cached, but every other query to this table that uses its result instead of the current timestamp will be, as long as there is more than one of them. This simple step significantly extends the life of the SQL cache.
It makes sense to cache these queries if reads from the table at least slightly outnumber writes.
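The effect of pinning the timestamp can be shown with a small experiment. This sketch uses Python with an in-memory SQLite database as a stand-in for MySQL; the table name articles and the function names are hypothetical, and only the show_ts column comes from the text. The point is that the text of the main query stays identical until a new item is actually published, which is what a cache keyed on query text needs.

```python
import sqlite3


def pinned_publication_ts(db: sqlite3.Connection, now_ts: int) -> int:
    """Latest show_ts already due; the only query that depends on 'now'."""
    row = db.execute(
        "SELECT MAX(show_ts) FROM articles WHERE show_ts <= ?", (now_ts,)
    ).fetchone()
    return row[0] if row[0] is not None else 0


def visible_articles_sql(pinned_ts: int) -> str:
    """Query text with the pinned value as a literal instead of 'now'."""
    return (
        "SELECT id, title FROM articles "
        f"WHERE show_ts <= {int(pinned_ts)} ORDER BY show_ts DESC"
    )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, show_ts INTEGER)"
)
db.executemany(
    "INSERT INTO articles (title, show_ts) VALUES (?, ?)",
    [("old", 1000), ("published at midnight", 2000), ("scheduled", 5000)],
)

sql_a = visible_articles_sql(pinned_publication_ts(db, 2500))
sql_b = visible_articles_sql(pinned_publication_ts(db, 4999))  # same text as sql_a
sql_c = visible_articles_sql(pinned_publication_ts(db, 5000))  # new item is due
```

Here sql_a and sql_b are the same string even though they were built at different moments, so the second one could be answered from the query cache.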
2.7 Caching mysql output, aggregating tables
There is a rule: the data should be updated significantly less often than it is read.
That is, it makes no sense to aggregate something that changes all the time when the freshness of the aggregated data matters.
What should you aggregate? Usually statistical information such as the number of records, the date of the last update, the author of the last update, and the like.
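An aggregate table of this kind could be maintained as follows. The sketch is in Python with SQLite standing in for MySQL, and the schema (comments, article_stats) and function names are invented for the example.

```python
import sqlite3
from typing import Optional, Tuple


def create_schema(db: sqlite3.Connection) -> None:
    db.execute(
        "CREATE TABLE comments (id INTEGER PRIMARY KEY, article_id INTEGER, "
        "author TEXT, created_ts INTEGER)"
    )
    # One pre-computed row per article instead of COUNT(*) on every read.
    db.execute(
        "CREATE TABLE article_stats (article_id INTEGER PRIMARY KEY, "
        "comment_count INTEGER NOT NULL, last_update_ts INTEGER NOT NULL, "
        "last_author TEXT NOT NULL)"
    )


def add_comment(db: sqlite3.Connection, article_id: int,
                author: str, created_ts: int) -> None:
    with db:  # write the row and its aggregate in one transaction
        db.execute(
            "INSERT INTO comments (article_id, author, created_ts) VALUES (?, ?, ?)",
            (article_id, author, created_ts),
        )
        updated = db.execute(
            "UPDATE article_stats SET comment_count = comment_count + 1, "
            "last_update_ts = ?, last_author = ? WHERE article_id = ?",
            (created_ts, author, article_id),
        )
        if updated.rowcount == 0:  # first comment for this article
            db.execute(
                "INSERT INTO article_stats "
                "(article_id, comment_count, last_update_ts, last_author) "
                "VALUES (?, 1, ?, ?)",
                (article_id, created_ts, author),
            )


def get_stats(db: sqlite3.Connection, article_id: int) -> Optional[Tuple[int, int, str]]:
    return db.execute(
        "SELECT comment_count, last_update_ts, last_author "
        "FROM article_stats WHERE article_id = ?",
        (article_id,),
    ).fetchone()
```

Every write now costs one extra statement, which is why the rule about rare updates and frequent reads matters.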
Conclusion
Given constant network load, hardly any project can do without caching. Caching makes it possible to deliver data to a large number of clients while using minimal resources. In this article we looked at many types of caching, and we are sure that a suitable solution for your project is among them.