📜 ⬆️ ⬇️

Atomicity of operations and counters in memcached

A series of posts about “Web, caching and memcached” continues. In the first and second posts we talked about memcached, its architecture, possible applications, the choice of caching key and memcached clustering.

Today we will talk about:

The next post will be devoted to the problem of simultaneous rebuilding caches.


Atomic operations in memcached


As such, all single memcached requests are atomic (due to its single-threadedness and correct internal locks in the multithreaded case). This means that if we execute a get request, we will get key values ​​such as someone wrote it in the cache, but not exactly a mixture of two records. However, each operation is independent, and we cannot guarantee, for example, the correctness of such a procedure in a situation of competitive access from several parallel processes:
  1. Get the key value “x” ( $x = get x ).
  2. Increasing the value of a variable by one ( $x = $x + 1 ).
  3. Writing a new variable value to memcached ( set x = $x ).

If this code is executed several frontend'ov simultaneously, it may turn out that the value of the key x will increase not n times, as we intended, but on a smaller value (the classic race condition , race condition ). Of course, such an approach is unacceptable for us. The classic answer to the current situation is the use of synchronization primitives (semaphores, mutexes, etc.), but they are missing from memcached. Another solution to the problem is to implement more complex operations that replace the non-atomic get / set sequence.
')
In memcached, there are a couple of operations to solve this problem: incr / decr (increment and decrement). They provide an atomic increase (or, accordingly, a decrease) of the integer value of an existing key in memcached. Additional operations are also atomic: append / prepend , which allow you to add data to the key value at the beginning or end, and in some replace , add and replace operations can be considered atomic, which allow you to set the key value only if it did not previously exist, or, conversely, replace the value of an existing key. One more variant of atomic operations will be discussed in the section on blocking implementation with memcached tools.
It should be additionally noted that any lock in memcached should be fine-grained, that is, it should affect as few objects as possible, since the main task of the server in any case is to provide efficient access to the cache of as many parallel processes as possible.

Memcached counters


Memcached can be used not only for storing caches of backend samples, not only for storing user sessions (which was mentioned at the beginning of the article), but also for a task that is difficult enough without memcached, implementation of real-time counters . That is, we are faced with the task of showing the current value of the counter at a given time, if we recline the “real time” requirement, this can be done through logging and subsequent analysis of the accumulated logs.
Consider a few examples of such counters, how they can be implemented, what problems are possible.

View count


Let there be some objects in our project (for example, photos, videos, articles, etc.) for which we have to show the number of views in real time. The counter should increase with each scan. The easiest option is to update the field in the database with each view, it will not work, because There are many views and the database will not sustain such a load. We can implement accurate and accurate collection of statistics on views, their accumulation, and periodic analysis, which ends with updating the counter in the database (for example, once an hour). However, the task remains to show the current number of views.

Consider the following possible solution. Frontend at the time of viewing the object generates the name of the counter key in memcached, and tries to perform an incr operation (increment) on this key. If the execution was successful, it means that the corresponding key is in memcached, we viewed the view, we also received a new counter value (as a result of the incr operation), which we can show to the user. If the incr operation returned an error, then the counter key is currently not in memcached, we can choose the number of views from the database as the initial value, increase it by one, and perform the set operation, setting the new counter value. In subsequent views, the key will already be in memcached, and we will simply increase its value using incr.

It should be noted that the above scheme is not quite correct: it contains a race condition (race condition). If two frontend simultaneously access the counter, simultaneously detect its absence, and do two set operations, we will lose one view. This can be considered not very critical, since the process of accumulating statistics will restore the correct value. If necessary, you can use locks in memcached, which will be discussed below. Or implement the initialization of the counter through the operation add , processing its result.

Online count


There is one more kind of counters, which without memcached or a solution similar to it can hardly be implemented: it is an “online” counter. We see such counters on a large number of sites, but first of all it is necessary to determine what exactly we mean by “online”. Suppose we want to calculate how many unique sessions (users) turned to our site in the last 5 minutes. The uniqueness of the user's appeal with this session for 5 minutes can be tracked by saving the last counted call time in the session, if more than 5 minutes have passed, this means a new (unique) call.



So, let's select six keys in memcached with names, for example, c_0 , c_1 , c_2 , ..., c_5 . The current variable key, we will consider the counter with the number equal to the remainder of dividing the current minute by 6 (in the figure this is the key c_4 ). We will increase it with the help of the incr operation for the treatment of each unique session within 5 minutes. If incr returns an error (there is no counter yet), set its value to 1 using set , be sure to specify a lifetime of 6 minutes. The value of the online count will be the sum of all keys except the current one (in the figure these are the keys c_0 , c_1 , c_2 , c_3 and c_5 ).

When the next minute comes, the current variable key will be the key c_5 , while its previous value will disappear (because it was created 6 minutes ago with a lifetime of the same 6 minutes). The value of the counter will be the sum of the keys from _0 to c_4 , i.e. the newly calculated key value _4 will already begin to be taken into account in the displayed value of the counter.

Such a counter can be built on a smaller number of keys. The minimum possible for this scheme are two keys: one is updated, the value of the other is shown, then after 5 minutes the counters are swapped, while the one that has just been updated is reset. In the multi-key scheme, some “smoothing” is provided, which provides a smoother counter change in case of a sharp influx or outflow of visitors.

Source: https://habr.com/ru/post/43282/


All Articles