In this article I will try to describe a common mistake made by the authors of caching systems.
It all started long ago, when I ran sites on a FreeBSD jail hosting plan that was very limited in resources. Why a jail? Because I used the pdflib extension to generate reports and printable forms, and it was not among the extensions offered on standard hosting. I compiled my own Apache and PHP there, uploaded the files, and the site started working.
Everything was fine until I needed to show a top 10 of the items sold in the store on the site's pages. The SQL query that produced this data set took about 10 seconds to run. Indexes, EXPLAIN, and other shamanism did not help. The data had to be cached. So, after looking at how others did it, I wrote code that cached the query results.
What did my code do?
1. It checked whether the cache held an object with the required data. If it did, the code took the object from the cache and used the data. If it did not, the code ran the procedure that generated the data and saved the result to the cache.
2. After the data had been used, it ran a "garbage collection" procedure that deleted objects whose lifetime had expired.
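The two steps above can be sketched roughly as follows. This is an illustration in Python, not the original PHP code; the names `NaiveCache`, `get_or_build`, and `collect_garbage` are hypothetical, and the injectable `clock` parameter is an assumption added only to make the behavior easy to check.

```python
import time


class NaiveCache:
    """Illustrative reconstruction of the flawed scheme (hypothetical names)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.items = {}  # key -> (expires_at, value)

    def get_or_build(self, key, build):
        # Step 1: use the cached object if present, otherwise build and store it.
        entry = self.items.get(key)
        if entry is None:
            value = build()  # the expensive query
            self.items[key] = (self.clock() + self.ttl, value)
        else:
            value = entry[1]
        # Step 2: garbage collection after the data has been used.
        self.collect_garbage()
        return value

    def collect_garbage(self):
        # Delete every object whose lifetime has expired.
        now = self.clock()
        expired = [k for k, (expires_at, _) in self.items.items() if expires_at <= now]
        for k in expired:
            del self.items[k]
```

Note that once the lifetime expires, the object simply disappears, and every request that arrives before a rebuild finishes sees an empty cache.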
Everything seemed to be in order. I tested the cache on a test machine with the ab utility and got encouraging results. Then I uploaded the code to the jail and went to bed.
However, the next day I received an email from the hosting administration saying that my site had been blocked because it was putting too much load on the SQL server.
The server load graphs gave me the clue. They showed sharp load spikes with a period roughly equal to the lifetime of the cache objects. What was actually happening? It is very simple.
When the cache object's lifetime expired, the object was deleted while a request was being processed. The next HTTP request started rebuilding the cache object, which took a while and ran the expensive query on the SQL server. While that was running, another HTTP request arrived and also started rebuilding the cache object. The load on the server doubled, which roughly doubled the execution time of the SQL queries. During that longer wait, yet another HTTP request arrived. And so on.
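The pile-up can be reproduced in miniature. The sketch below is a toy model, not a measurement: threads stand in for HTTP requests, `slow_query` is a hypothetical stand-in for the 10-second SQL query, and a barrier is used (as an assumption, purely for determinism) to model requests arriving while the first rebuild is still running.

```python
import threading

REQUESTS = 5
cache = {}
query_runs = 0
counter_lock = threading.Lock()
# Released only when all requests are inside the query at the same time.
all_inside_query = threading.Barrier(REQUESTS)


def slow_query():
    global query_runs
    with counter_lock:
        query_runs += 1
    # Stand-in for the slow SQL query: nobody finishes until everyone has started.
    all_inside_query.wait()
    return ["item"] * 10


def handle_request():
    # Same logic as the naive cache: a miss triggers a rebuild, with no coordination.
    if "top10" not in cache:
        cache["top10"] = slow_query()
    return cache["top10"]


threads = [threading.Thread(target=handle_request) for _ in range(REQUESTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(query_runs)  # prints 5: every request ran the expensive query
```

Nothing is stored in the cache until a query finishes, so every request that arrives in the meantime sees a miss and starts its own copy of the query.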

How can this be avoided?
1. A process that finds a cache object to be outdated must not delete it.
2. A process that runs the query to build the new cache object must set a flag so that other processes do not start the update as well.
3. Once the fresh data has been received, the cache object must be replaced in a single atomic (fast) operation, and only after that is the flag cleared.
For homework, please check how your own caching system is built.
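The three rules above could look roughly like this. It is a minimal single-process sketch in Python under stated assumptions: threads stand in for the separate processes of the story, so the flag lives in memory, whereas a real multi-process setup would need to keep it in shared storage. The names `StaleServingCache` and `refreshing` are hypothetical, and the handling of a completely cold cache (callers wait for the first build) is my own addition, since the rules do not cover it.

```python
import threading
import time


class StaleServingCache:
    """Serve stale data while exactly one caller rebuilds (hypothetical names)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.items = {}          # key -> (expires_at, value); stale entries are kept
        self.refreshing = set()  # the "flag": keys that are being rebuilt right now
        self.cond = threading.Condition()

    def get(self, key, build):
        with self.cond:
            while True:
                entry = self.items.get(key)
                if entry is not None and entry[0] > self.clock():
                    return entry[1]           # fresh hit
                if key not in self.refreshing:
                    self.refreshing.add(key)  # rule 2: set the flag
                    break
                if entry is not None:
                    return entry[1]           # rule 1: outdated, but not deleted
                self.cond.wait()              # cold cache: nothing to serve yet
        try:
            value = build()  # the expensive query runs outside the lock
            with self.cond:
                # Rule 3: swap in the new object in one quick step.
                self.items[key] = (self.clock() + self.ttl, value)
            return value
        finally:
            with self.cond:
                self.refreshing.discard(key)  # the flag is cleared last
                self.cond.notify_all()
```

With this scheme an expired object triggers one rebuild, and requests that arrive during the rebuild get the old data immediately instead of queuing up behind the SQL server.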