Dklab_Cache: tags in memcached, namespaces, statistics

The memcached community has made many attempts to write native patches that add tag support to memcached itself. The best known of these is the memcached-tag project. Unfortunately, memcached-tag is still far from stable: it is easy to write a script that makes the patched memcached server hang. At the time of writing, there appears to be no reliable solution for tagging at the level of the memcached server itself.

Dklab_Cache library


Dklab_Cache is (mostly) a library that adds key tagging support to memcached through the Zend Framework interfaces. The library itself is written in pure PHP. Here is the complete list of library features:
Tag support itself is provided by the TagEmuWrapper class. It is a decorator (“wrapper”) for Zend Framework cache backend classes. In other words, you can use it to transparently add tag support to any Zend Framework caching subsystem. We will look at the memcached backend, Zend_Cache_Backend_Memcached, but if your project uses some other backend class, you can add tagging to it in exactly the same way, with no special handling.

TagEmuWrapper implements the standard backend interface, Zend_Cache_Backend_Interface, so from the point of view of the calling code it is itself a cache backend. Zend Framework is good in that it has supported tags at the interface level from the very beginning. For example, the save() method already has a parameter that lets you attach tags to a key. However, none of the backends in Zend Framework actually supports tags: an attempt to add a tag to a key causes an exception (in particular, for Zend_Cache_Backend_Memcached).
Technical details, documentation, and examples of use can be found here: dklab.ru/lib/Dklab_Cache

What are tags?


Working with a typical caching system (including memcached) consists of performing three basic operations:
Suppose we want to cache the result of a heavy SQL query so that part of a page can be displayed quickly. We first check whether the cache cell corresponding to this query holds an entry. If the cell is empty, the data is loaded from the DBMS and stored in the cache for later retrievals.

 if (false === ($data = $cache->load("key"))) {
     $data = executeHeavyQuery();
     $cache->save($data, "key");
 }
 display($data);

Unfortunately, in its pure form this approach is rarely applicable. The data in the database can change, and we must somehow clear the cache cell so that the user sees the results of those changes immediately. You can call the remove() method with a key, but in many cases, at the moment the data is updated, we simply do not know which cells it is cached in.

The problem is in fact much more complicated. In high-load systems, data is added to tables several (or even several hundred) times per second. The logic of tracking dependencies and deciding which cache cells need to be cleared and which do not therefore becomes extremely difficult, or even completely infeasible.

Tagging provides a solution to this problem. Every time data is written to a cache cell, we mark it with tags: labels that represent the dependencies of this data on other parts of the system. Tags let you combine cells into multiple overlapping groups. Later, we can issue the command "clear all cells marked with a specific tag."

Let's modify the previous example to use tags. Suppose that the SQL query depends on the current user's ID, $loggedUserId, so each such user gets a separate cell named “key_{$loggedUserId}”. However, the data also depends on the ID of another person, $ownerUserId, whose profile the current user is viewing. In this case, we can mark the cell with a tag associated with the user $ownerUserId:

 if (false === ($data = $cache->load("key_{$loggedUserId}"))) {
     $data = loadProfileFor($loggedUserId, $ownerUserId);
     $cache->save($data, "key_{$loggedUserId}", array("profile_{$ownerUserId}"));
 }
 display($data);

Now, if the data in the profile of user $ownerUserId changes (for example, the person has changed their name), we just need to issue the command to clear the tag associated with this profile:

 $cache->clean(Zend_Cache::CLEANING_MODE_MATCHING_TAG, array("profile_{$ownerUserId}"));

Note that the cache cells of all other users are not affected: only those that depend on $ownerUserId are cleared.

Actually, the phrase “mark a cell C with a tag T” means the same as the statement “cell C depends on the data described as T”. Tags are dependencies, nothing more.
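
To make the idea of "tags as dependencies" concrete, here is a minimal sketch of one common way to emulate tags on top of a plain key-value store: each tag has a version number, each cell remembers the versions of its tags at save time, and clearing a tag just bumps its version. This is only an illustration under that assumption, not the code of TagEmuWrapper; the class TaggedArrayCache and its methods are hypothetical, and an in-memory array stands in for memcached.

```php
<?php
// Hypothetical sketch (PHP 7.1+): tag emulation over a plain key-value store.
// This is NOT the Dklab_Cache implementation; all names are illustrative.
class TaggedArrayCache
{
    /** @var array in-memory storage standing in for memcached */
    private $store = [];

    // Current version of a tag; a tag that was never cleaned has version 0.
    private function tagVersion(string $tag): int
    {
        return $this->store["tag_{$tag}"] ?? 0;
    }

    // Save data together with a snapshot of the current versions of its tags.
    public function save($data, string $key, array $tags = []): void
    {
        $versions = [];
        foreach ($tags as $tag) {
            $versions[$tag] = $this->tagVersion($tag);
        }
        $this->store["data_{$key}"] = ['data' => $data, 'tags' => $versions];
    }

    // Return the data, or false if the cell is missing or one of its tags
    // has been cleaned since the cell was saved.
    public function load(string $key)
    {
        if (!isset($this->store["data_{$key}"])) {
            return false;
        }
        $cell = $this->store["data_{$key}"];
        foreach ($cell['tags'] as $tag => $version) {
            if ($this->tagVersion($tag) !== $version) {
                unset($this->store["data_{$key}"]); // stale cell: drop it lazily
                return false;
            }
        }
        return $cell['data'];
    }

    // Cleaning a tag only bumps its version; dependent cells expire on next load.
    public function cleanTag(string $tag): void
    {
        $this->store["tag_{$tag}"] = $this->tagVersion($tag) + 1;
    }
}

$cache = new TaggedArrayCache();
$cache->save('profile 10 as seen by user 1', 'key_1', ['profile_10']);
$cache->save('profile 20 as seen by user 2', 'key_2', ['profile_20']);

$cache->cleanTag('profile_10');

$first = $cache->load('key_1');  // invalidated through its tag
$second = $cache->load('key_2'); // depends on another tag, still cached
```

The appeal of this scheme is that clearing a tag costs a single write, regardless of how many cells carry the tag; the price is an extra lookup per tag on every load.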

A small digression: about dependencies in the code


Before continuing to talk about tags, let's go back a bit and talk about a more general concept - dependencies. What are these dependencies? In the typical case (even without using tags), we have to refer several times to the caching key in order to work effectively with the data:

 if (false === ($data = $cache->load("profile_{$userId}"))) {
     $data = loadProfileOf($userId);
     $cache->save($data, "profile_{$userId}", array(), 3600 * 24);  // cache for 24 hours
 }
 display($data);

and then in a completely different part of the program:

 $cache->remove("profile_{$userId}");

As you can see, the string “profile_{$userId}” has to be repeated as many as three times. In the first case we can remove the repetition at the cost of introducing a new variable:

 $cacheKey = "profile_{$userId}";
 $cacheTime = Config::getInstance()->cacheTime->profile;
 if (false === ($data = $cache->load($cacheKey))) {
     $data = loadProfileOf($userId);
     $cache->save($data, $cacheKey, array(), $cacheTime);
 }
 display($data);

... but in the second part of the program there is no way around knowing exactly how the cache key is built and which parameters it depends on.

Important note
The string “profile_{$userId}” is knowledge, and one should not underestimate the harm of spreading this knowledge across too many independent places. In our example the knowledge is very simple, but in practice a cache key may depend on dozens of different parameters, some of which may even have to be loaded from the database on demand.

The situation is in fact even worse than it might seem.

How it works in Dklab_Cache


Instead of a long explanation, here is an example that uses a Slot class built according to the Dklab_Cache_Frontend approach.

 $slot = new Cache_Slot_UserProfile($user);
 if (false === ($data = $slot->load())) {
     $data = $user->loadProfile();
     $slot->save($data);
 }
 display($data);

To clear the cache:

 $slot = new Cache_Slot_UserProfile($user);
 $slot->remove();

What makes this better?
You have to write as many slot classes of your own as there are kinds of cached data in your program. This imposes discipline: a glance at the Cache/Slot directory shows exactly how many different caches the program uses and what each of them depends on.
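
As a rough illustration of what such a slot class might look like, here is a self-contained sketch. It is not the actual Dklab_Cache code: the classes ArrayBackend, Slot, User and UserProfileSlot are hypothetical, the backend is a plain in-memory array, and the backend is passed to the constructor explicitly to keep the example short. The point is only that the key-building rule lives in exactly one place.

```php
<?php
// Hypothetical sketch (PHP 7.1+) of the "slot" idea; not the Dklab_Cache classes.
// A plain in-memory array stands in for a real cache backend.
class ArrayBackend
{
    private $cells = [];

    public function load(string $id)
    {
        return array_key_exists($id, $this->cells) ? $this->cells[$id] : false;
    }

    public function save($data, string $id): void
    {
        $this->cells[$id] = $data;
    }

    public function remove(string $id): void
    {
        unset($this->cells[$id]);
    }
}

// Base slot: binds one backend to one cache key.
abstract class Slot
{
    private $backend;
    private $id;

    public function __construct(ArrayBackend $backend, string $id)
    {
        $this->backend = $backend;
        $this->id = $id;
    }

    public function load() { return $this->backend->load($this->id); }
    public function save($data): void { $this->backend->save($data, $this->id); }
    public function remove(): void { $this->backend->remove($this->id); }
}

class User
{
    public $id;
    public function __construct(int $id) { $this->id = $id; }
}

// The only place in the program that knows how the profile key is built.
class UserProfileSlot extends Slot
{
    public function __construct(ArrayBackend $backend, User $user)
    {
        parent::__construct($backend, "profile_{$user->id}");
    }
}

$backend = new ArrayBackend();
$user = new User(42);

$slot = new UserProfileSlot($backend, $user);
$slot->save(['name' => 'Alice']);

// Elsewhere in the program: no key string is needed to read or invalidate.
$other = new UserProfileSlot($backend, $user);
$loaded = $other->load();
$other->remove();
$afterRemove = $other->load();
```

If the key format ever changes (say, the language is added to it), only the UserProfileSlot constructor has to be edited; every caller keeps working unchanged.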

Well, now, actually, about tags


Slots, among other things, support tagging. Here is an example of using tags with pass-through caching (of course, tags work with ordinary, non-pass-through caching as well).

 $slot = new Cache_Slot_UserProfile($user);
 $slot->addTag(new Cache_Tag_User($loggedUser));
 $slot->addTag(new Cache_Tag_Language($currentLanguage));
 $data = $slot->thru($user)->loadProfile();
 display($data);
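
The thru() call above reads as "call loadProfile() on $user, but go through the cache". One plausible way to build such a proxy in PHP is the __call magic method; the sketch below shows that technique in isolation. It is an assumption about the general mechanism, not the library's code: ThruProxy and ProfileLoader are hypothetical names, and an ArrayObject stands in for the cache.

```php
<?php
// Hypothetical sketch of pass-through caching via __call; names are illustrative.
class ThruProxy
{
    private $cache;  // ArrayObject used as a stand-in cache store
    private $key;    // base cache key for this proxy
    private $target; // the real object whose methods are being cached

    public function __construct(ArrayObject $cache, string $key, $target)
    {
        $this->cache = $cache;
        $this->key = $key;
        $this->target = $target;
    }

    // Any method call is answered from the cache when possible; otherwise it is
    // forwarded to the real object and the result is stored for next time.
    public function __call($method, $args)
    {
        $id = $this->key . ':' . $method . ':' . md5(serialize($args));
        if (!$this->cache->offsetExists($id)) {
            $this->cache[$id] = call_user_func_array([$this->target, $method], $args);
        }
        return $this->cache[$id];
    }
}

class ProfileLoader
{
    public $calls = 0;

    public function loadProfile(): array
    {
        $this->calls++; // counts how often the heavy work really runs
        return ['name' => 'Alice'];
    }
}

$cache = new ArrayObject();
$loader = new ProfileLoader();
$proxy = new ThruProxy($cache, 'profile_42', $loader);

$a = $proxy->loadProfile(); // miss: calls the real method
$b = $proxy->loadProfile(); // hit: served from the cache
```

The method name and a hash of the arguments are folded into the cache key so that different calls on the same object do not overwrite each other.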

You have to create as many tag classes as there are different kinds of dependencies in your system. Tag classes are especially convenient when the time comes to clear some tags:

 $tag = new Cache_Tag_Language($currentLanguage);
 $tag->clean();

As you can see, knowledge of tag dependencies is again stored in a single place. Now you simply can’t accidentally “miss” and clear the wrong tag: the system will generate an error about either a non-existent class or an incorrect type of the constructor parameter.
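
The "incorrect type of the constructor parameter" error can be illustrated with a small standalone sketch. The classes Language and LanguageTag here are hypothetical and are not part of Dklab_Cache; the example assumes PHP 7 or later, where passing a wrong argument to a class-typed parameter raises a TypeError.

```php
<?php
// Hypothetical sketch (PHP 7+); not the Dklab_Cache tag classes.
class Language
{
    public $code;
    public function __construct(string $code) { $this->code = $code; }
}

class LanguageTag
{
    private $name;

    // The type hint turns a wrong argument into an immediate error
    // instead of a silently mis-built tag name.
    public function __construct(Language $language)
    {
        $this->name = "language_{$language->code}";
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// Correct use: the tag name is derived in one place from a Language object.
$tag = new LanguageTag(new Language('en'));
$name = $tag->getName();

// Incorrect use: a plain string instead of a Language object.
$error = null;
try {
    new LanguageTag('en');
} catch (TypeError $e) {
    $error = $e;
}
```

With plain string tags, a typo such as "langauge_en" would quietly clear nothing; with a tag class, the mistake surfaces as soon as the code runs.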

Conclusion


This article covers several things at once: cache tagging, cache dependencies in code, and the Slot and Tag classes that the library provides as a way of abstracting away from the cache storage.

Download the library sources and examples here: dklab.ru/lib/Dklab_Cache

Source: https://habr.com/ru/post/57142/

