
Comparing MemCache and MongoDB for Network Caching

We had a rather unorthodox idea: instead of MemCache, take MongoDB as the network cache tool and compare their performance. To put the numbers for these two "caching mechanisms" in context, we also measured other tools commonly used to speed up an application (APC, RamFS, TmpFS, XCache).
The article presents data and graphs comparing these mechanisms, together with a description and interpretation of the results.

Any large Internet resource sooner or later runs into the problem of server load. Things are fine while you have a single server, or, say, a Web server + DB server pair. In general, everything is relatively easy as long as you have only one front-end. The problems begin as your server "zoo" grows: not only does the hierarchy of connections between your servers get more complicated, you also face the problem of centralized data storage, and that applies to the database, to static files and, of course, to the cache itself.

At work we are developing a fairly large online resource that outgrew a single server long ago. Fortunately, so far we get by with a web server and a DB server. However, management has now launched a new large project whose scale is already apparent at the design stage. A small group of programmers now faces a new and interesting task: to think through the structure and hierarchy of a project of a size that none of us has dealt with personally. But this post is not about the project itself; it is about one of the problems that emerged while designing it, and the main one is centralized cache storage.

This article does not claim to be an exhaustive guide to solving this problem, but I can share the experience I gained while comparing and choosing tools and caching schemes.
Let's move on to the main part of the article, the analysis itself.
To compare caching performance, the following tools were selected: APC, MemCache, RamFS, TmpFS, XCache, and, as the unorthodox choice, the MongoDB DBMS, which is rather fast.

I think we should start with how we installed all these programs and extensions, and what difficulties we ran into.
All experiments were performed on a CentOS system with 2 GB of memory and PHP [5.3.6] installed, and, for fairness, nothing was tuned: everything was taken out of the box.

Installing APC [3.1.9]:
Simple and easy. Install the package through pecl and that's it; APC works.
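
For reference, exercising the APC write/read path needs nothing more than two calls; a minimal sketch (the key name, payload size and TTL here are illustrative, not our actual test harness):

<?php
// Store a value in APC shared memory for 60 seconds, then read it back.
$payload = str_repeat('x', 4096);           // ~4KB of test data
apc_store('cache:test', $payload, 60);      // write
$copy = apc_fetch('cache:test', $success);  // read; $success is set to false on a miss
if (!$success) {
    echo "cache miss\n";
}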

Installing memcache [2.2.6]:
Also simple enough. Install the PHP extension through pecl, and then install the memcache server itself separately, since it is not an extension but a standalone program; that is not difficult either. After installation everything worked wonderfully.
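
The test pattern for memcache is analogous; a minimal sketch (host, port, key and TTL are illustrative):

<?php
// Write to and read from a local memcached daemon over the network protocol.
$payload = str_repeat('x', 4096);
$mc = new Memcache();
$mc->connect('127.0.0.1', 11211);         // connect to the memcached daemon
$mc->set('cache:test', $payload, 0, 60);  // write with a 60-second TTL (flag 0 = no compression)
$copy = $mc->get('cache:test');           // read; returns false on a miss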

Installing RamFS:
It is more a matter of creation than installation. Everything is very simple:
mount -t ramfs -o size=1024m ramfs /tmp/ramfs
Here we allocate 1 GB of memory and mount it as a file system at /tmp/ramfs.

Installing TmpFS:
As in the previous case, this is creation rather than installation. It is created with the command:
mount -t tmpfs -o size=1024m tmpfs /tmp/tmpfs
Again we allocate 1 GB of memory and mount it as a file system at /tmp/tmpfs.

Note:
The differences between RamFS and TmpFS that matter to us are the following. The two behave almost identically until the allocated limit is reached. At that point RamFS will not tell us that the volume is full: the new data you write simply vanishes into nowhere. TmpFS, on reaching its limit, will report the lack of space, and older data will be pushed out to swap, which means disk operations are performed, and that does not suit us at all. Both RamFS and TmpFS are wiped clean when the system restarts, so if you want to keep using them after a reboot, put the partition-creation command into a startup script.
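
Since both mounts are just file systems, caching through them comes down to ordinary file I/O. A minimal sketch of that path (the mount path and key scheme are illustrative):

<?php
// Write and read a cache entry as a plain file on the RAM-backed mount.
$dir = '/tmp/ramfs';                       // or /tmp/tmpfs
$key = 'cache:test';
$payload = str_repeat('x', 4096);
file_put_contents($dir . '/' . md5($key), serialize($payload));  // write
$raw  = file_get_contents($dir . '/' . md5($key));               // read
$copy = ($raw === false) ? null : unserialize($raw);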


Installing XCache [1.3.1]:
And here the interesting part begins. Installing this package turned out to be quite problematic, because PHP 5.2 was also installed on the machine (and is the main PHP there at the moment) and XCache has a long list of dependencies, so we had to feed it the right libraries by hand. Even then the problems with this extension did not end: after the installation, which seemed to go fine, running a test script to verify the extension produced new errors. When trying to write a variable to memory we got:

Warning: xcache_set(): xcache.var_size is either 0 or too small to enable var data caching in /usr/local/php5.3/xcache.php on line 6

and right after it:

Warning: xcache_get(): xcache.var_size is either 0 or too small to enable var data caching in /usr/local/php5.3/xcache.php on line 10

After two days of fighting this error with nothing but our bare hands, a keyboard and a tambourine, the result was zero. We suspected that the *.so had been compiled incorrectly, but replacing our build with the same version downloaded from the Internet changed nothing. Google was called in, and exactly the same problem turned up on StackOverflow, but with no solution, so we were forced to run the remaining tests without XCache. There was no particular disappointment: earlier tests had shown that the difference between APC and XCache is almost negligible, while XCache causes far more trouble.

Installing MongoDB [1.8.1]:
Installing Mongo is quite simple and I will not describe the process; instead, here is the startup line with the configuration we needed:
/usr/local/mongodb/bin/mongod --dbpath /usr/local/mongodb/data --profile=0 --maxConns=1500 --fork --logpath /dev/null --noauth --diaglog=1 --syncdelay=0
The parameter that interests us most is the last one, "--syncdelay=0", which controls how often the data held in memory is synchronized to the HDD. A value of 0 means we explicitly forbid Mongo to sync to the HDD. This option is documented by the creators of the DBMS, but it is not recommended: if the power fails, or any other system failure takes out the Mongo daemon, all your data will be lost. That risk suits us fine, because we want to try this DBMS purely as a caching mechanism.
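
With the daemon running, using Mongo as a key-value cache from PHP is straightforward. A minimal sketch with the legacy pecl "mongo" driver that was current at the time (database, collection and key names are illustrative):

<?php
// Upsert a cache entry keyed by _id, then read it back.
$m   = new Mongo('mongodb://127.0.0.1:27017');
$col = $m->selectDB('cache')->selectCollection('items');
$payload = str_repeat('x', 4096);
$col->save(array('_id' => 'cache:test', 'value' => $payload));  // write (upsert by _id)
$doc  = $col->findOne(array('_id' => 'cache:test'));            // read; null on a miss
$copy = ($doc === null) ? null : $doc['value'];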

Now that everything is installed and "configured", let's proceed directly to the tests themselves.

For fairness, the tests were run with 4 payload sizes: 4KB, 72KB, 144KB and 2.12MB.
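
The measurement itself needs nothing more than a timed loop around the write and read calls. A simplified sketch for APC, not our full harness (the iteration count and payload size below are just examples):

<?php
$payload    = str_repeat('x', 4096);  // 4KB test payload
$iterations = 500;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    apc_store('bench:' . $i, $payload, 0);  // timed write pass
}
$writeTime = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    apc_fetch('bench:' . $i);               // timed read pass
}
$readTime = microtime(true) - $start;

printf("write: %.4fs, read: %.4fs\n", $writeTime, $readTime);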

Consider the data obtained in the graphs:

4KB:
[Graph 1: time to write 4KB of data]
As the 1st graph (time to write 4KB of data) shows, MemCache produced the worst result, followed by RamFS and TmpFS, then Mongo and APC.

[Graph 2: time to read 4KB of data]
The 2nd graph (time to read 4KB of data) clearly shows Mongo falling behind, while MemCache is almost on par with RamFS and TmpFS, whose lines nearly coincide; at this scale the differences between them are barely visible.

72KB:
[Graph 3: time to write 72KB of data]
The 3rd graph (time to write 72KB of data) shows MemCache behaving erratically, while Mongo is quite predictable. It is very surprising that APC, RamFS and TmpFS almost coincided.

[Graph 4: time to read 72KB of data]
Reading, shown on the 4th graph (time to read 72KB of data), is another story: Mongo gave up its position and let its competitor MemCache pass, with APC, as before, leading the whole race.

144KB:
[Graph 5: time to write 144KB of data]
Starting with the 5th graph (time to write 144KB of data) we see strange behavior from almost all the systems: MemCache is very unstable and sometimes takes longer at a smaller number of iterations than at a larger one, and, even more surprising, closer to 500 iterations Mongo starts to overtake APC! RamFS and TmpFS hold a surprisingly high speed.

[Graph 6: time to read 144KB of data]
On the 6th graph (time to read 144KB of data) Mongo loses to everyone, and APC is the fastest.

2.12MB:
[Graph 7: time to write 2.12MB of data]
On the 7th graph (time to write 2.12MB of data) Mongo turned out to be very stable; the final run of 500 iterations knocked it out of its rut, but it stayed in the saddle. The absolute leader is, of course, APC, but something interesting happens with MemCache, TmpFS and RamFS, which is explained below.

[Graph 8: time to read 2.12MB of data]
The last graph (time to read 2.12MB of data) also paints a curious picture: TmpFS is the slowest, RamFS the fastest, and, another surprise, Mongo beat APC!

If you look at the last iterations of the final 2 graphs, where 2.12MB payloads were cached 500 times, you will notice an interesting thing: 500 (iterations) * 2.12MB (single payload size) = 1060MB, which means that with RamFS and TmpFS we went past the maximum volume limit! Hence the rather interesting and unpredictable numbers.
In fact, everything turned out to be perfectly predictable. When the upper limit is reached (ours is 1024MB), RamFS simply stops storing data and silently ignores write commands; reading that data does not physically read anything, it just returns an empty string (which PHP ends up interpreting as null rather than an empty string). TmpFS behaves in exactly the opposite way: it writes all our data, making room for it by pushing older data out to swap. This explains why, at these volumes and iteration counts, RamFS needs rather little time to process the writes and even less to read the non-existent data, while TmpFS, on the contrary, slows down dramatically because of the disk operations it has to perform.
As I noted above, all the tools were taken out of the box, with no tuning, which is why memcache's time is 0: a single 2.12MB record simply exceeds memcache's maximum size for one stored item (1MB by default). To make memcache store items larger than that, memcache itself would need to be rebuilt, but then everything else would have to be tuned accordingly, and different products simply cannot be tuned identically, so I allowed myself to leave this nuance as it is. Perhaps that is not entirely fair, but we were more interested in how all these tools behave under equal, out-of-the-box conditions.

Looking at all this data, every developer can draw their own conclusions about what to use and how. We drew ours, and we will share them:
Since we are interested in the tools that can cache over the network, the discussion comes down to MemCache and Mongo. Yes, in many cases, as the graphs show, MemCache beats Mongo at reading, which matters more than writing, where Mongo beats MemCache. However, if you take into account everything MongoDB gives you (it is, after all, a full-fledged DBMS, with all its capabilities and its ease of use), the speed deficit Mongo shows can easily be offset by its ability to run complex queries and fetch multiple cache entries in a single request. One more advantage of MongoDB over MemCache: with a proper, well-thought-out application architecture you can fetch all the cached elements of a page in a single request, analyze the response from Mongo, and then perform the necessary actions and re-cache the corresponding blocks.

You can also design the system around a multi-level cache: a first-level (network) cache in MemCache or Mongo, and a second-level (local) cache in APC or XCache (though that pairing needs its own comparison).
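
To make the idea concrete, here is a rough sketch of such a two-level lookup, with APC as the local level and MemCache as the network level (the key scheme, TTL and fallback behavior are illustrative assumptions, not a design we tested):

<?php
// Two-level cache lookup: local APC first, network memcache second.
function cache_get($key, Memcache $mc) {
    $value = apc_fetch($key, $hit);   // level 1: local shared memory
    if ($hit) {
        return $value;
    }
    $value = $mc->get($key);          // level 2: network cache
    if ($value !== false) {
        apc_store($key, $value, 60);  // repopulate the local level
        return $value;
    }
    return null;                      // full miss: caller rebuilds and re-caches
}

$mc = new Memcache();
$mc->connect('127.0.0.1', 11211);
$block = cache_get('block:header', $mc);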

In the near future we plan a deeper analysis and comparison of MemCache and Mongo: comparing not only elapsed time, but also memory consumption, along with measurements of CPU load and the load on the server as a whole.

Finally, I would like to note that the tests were run repeatedly, and the figures behind the graphs are averages over 10–20 repetitions of each test.
I will not include the tabular test data here because of its volume.

Source: https://habr.com/ru/post/124212/

