
As you already know from the previous article, there are many ways to store information in App Engine, but most of them are quite specialized; only three are suitable for universal use: instance memory, memcache, and the datastore.
Below you will find the results in figures and screenshots, brief recommendations on caching, and the source code of a simple cacher and of the test application.
Testing methodology.
For the tests, a separate application was created that can fill the database with assorted random data and read it back in various ways (in the source code you will also find tests for images, but I decided to leave them out of this article). Requests were replayed at roughly the same time, several seconds apart; the results were recorded with AppStats, and then typical averages were picked out of the statistics.
Data.
The tests use the output of a random generator: in each subsection, results are shown first for a single record with about 60KB of text, and then for a list of 1000 small records (roughly the size of an average comment).
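The actual test models live in the source archive linked at the end; for the snippets below, hypothetical stand-ins along these lines will do (the names Page and Comment, and their fields, are assumptions, not the real test code):

    from google.appengine.ext import db

    # Hypothetical stand-ins for the test models (the real ones are in
    # the source archive linked at the end of the article).
    class Page(db.Model):
        text = db.TextProperty()        # ~60KB of random text

    class Comment(db.Model):
        author = db.StringProperty()    # small, comment-sized fields
        body = db.StringProperty()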
Datastore.
Let's start, of course, with the most reliable and the slowest storage. We fetch a single 60KB record from the datastore with the get_by_key_name() method.
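Using the hypothetical Page model from above, the call itself is a one-liner ('page_1' is a made-up key name):

    # Fetch a single ~60KB entity by its key name.
    page = Page.get_by_key_name('page_1')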
[AppStats screenshot: a single 60KB record fetched from the datastore]
It looks good, both in execution time and in resources consumed. Now let's try to fetch a sample of a thousand "comments".
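With the hypothetical Comment model, that test is simply:

    # Fetch 1000 comment-sized entities in one query.
    comments = Comment.all().fetch(1000)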
[AppStats screenshot: fetching 1000 small records from the datastore]
This is the main source of non-scalable applications on the scalable App Engine: receiving the data takes 440ms, and the amount of computing resources consumed is simply indecent. At this rate the application can burn through its free quota in just a couple of thousand requests, so the data has to be cached.
Note: the 10 seconds of api_cpu come from calling all().fetch(1000) on a group of exactly 1000 entities; a GqlQuery selecting from a larger set of entities will consume even more. The most I have seen is 561,000 api_cpu_ms when adding 1000 records to the HRD (High Replication Datastore). So large samples in App Engine must be handled extremely carefully.
Memcache.
The most universal tool is memcache; first and foremost, it is recommended for storing the results of datastore queries.
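The usual read-through pattern looks roughly like this (the key name and the 60-second lifetime are arbitrary choices, not values from the test application):

    from google.appengine.api import memcache

    def get_comments():
        # Try the cache first; fall back to the datastore on a miss.
        # Comment is the hypothetical model defined earlier.
        comments = memcache.get('comments_page_1')
        if comments is None:
            comments = Comment.all().fetch(1000)
            memcache.set('comments_page_1', comments, time=60)
        return comments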
[AppStats screenshot: a single 60KB record from memcache]
Reading a single record is almost twice as fast and needs half the CPU time. True, in absolute terms a saving of 10ms does not look serious. Now let's see how memcache digests the list:
[AppStats screenshot: 1000 small records from memcache]
Here the savings are much more substantial: 1.5 times faster and 15 times cheaper.
Memcache + protobuf.
Memcache is free and almost unrestricted, but still not infinitely elastic, so we will try to shrink the data stored in it by serializing db.Model objects into protocol buffers (as recommended in this article). This will let the data live in memcache longer, but will the application actually work better?
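The recipe from the linked article boils down to a pair of helpers along these lines (a sketch, not the test application's exact cacher):

    from google.appengine.datastore import entity_pb
    from google.appengine.ext import db

    def serialize(models):
        # Encode a single entity, or a list of them, to protobuf strings.
        if isinstance(models, db.Model):
            return db.model_to_protobuf(models).Encode()
        return [db.model_to_protobuf(m).Encode() for m in models]

    def deserialize(data):
        # The reverse: rebuild model instances from encoded protobufs.
        if data is None:
            return None
        if isinstance(data, str):
            return db.model_from_protobuf(entity_pb.EntityProto(data))
        return [db.model_from_protobuf(entity_pb.EntityProto(d)) for d in data]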
[AppStats screenshot: a single protobuf-serialized record from memcache]
[AppStats screenshot: 1000 protobuf-serialized records from memcache]
While in the first case the difference is barely visible, in the second there is a clear 20-25% increase in execution time and consumed resources.
Memcache + protobuf + zlib.
Since we are shrinking the data anyway, compression is also worth a try: we will compress the already serialized strings with zlib.compress.
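On top of the helpers above, that is one extra call in each direction (single-record case shown; the key name is again made up):

    import zlib

    # Compress the serialized protobuf string before caching it...
    memcache.set('page_1_pbz', zlib.compress(serialize(page)))

    # ...and decompress it again before deserializing (assuming a hit).
    page = deserialize(zlib.decompress(memcache.get('page_1_pbz')))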
[AppStats screenshot: a single compressed record from memcache]
The execution time of memcache.get dropped by 1ms, because compressed text is transmitted much faster; however, the entire accumulated advantage is lost again when the data is unpacked.
[AppStats screenshot: 1000 compressed records from memcache]
On a large number of small records there is not even that advantage, only an increase of about 5% in time and resources.
Local memory.
And so we reach the ultimate means of improving performance: instance memory. Alongside its many shortcomings (each instance has its own memory, that memory is only 50MB, and instances rarely live to the age limit of 9000 requests), it has one key advantage: speed.
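In Python terms, instance memory is simply module-level state that survives between requests served by the same instance; a minimal sketch:

    # A module-level dict lives as long as the instance does and is
    # shared by every request this instance handles.
    _local_cache = {}

    def get_comments_local():
        # Comment is the hypothetical model defined earlier.
        comments = _local_cache.get('comments_page_1')
        if comments is None:
            comments = Comment.all().fetch(1000)
            _local_cache['comments_page_1'] = comments
        return comments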
[AppStats screenshot: a single 60KB record from instance memory]
As usual, the saving on a single record is insignificant, but the list is another matter entirely:
[AppStats screenshot: 1000 small records from instance memory]
Execution time drops a hundredfold, and computing resources are no longer required at all: a strong argument for caching absolutely everything in instance memory.
But let's not forget the disadvantages:
- Each instance has its own memory and, accordingly, its own cache.
- There is only 50MB of memory.
To solve the first, you could invent some tricky cache-syncing mechanism, for example: store a hash of the cache in memcache and refresh the local copy whenever it does not match (a sketch of the idea follows below). But even the description sounds convoluted, so let's keep it as a last resort and instead try to reduce memory consumption with the already familiar protobuf and zlib.
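Purely for illustration (this is not part of the test application), such a syncing mechanism might look like this; whoever updates the data is expected to bump the '<key>_version' stamp in memcache:

    from google.appengine.api import memcache

    _local = {}           # instance-local copies of the data
    _local_versions = {}  # version stamp seen when each copy was made

    def get_synced(key, fetch):
        # The authoritative version stamp lives in memcache; if it is
        # missing or differs from ours, the local copy is refilled.
        version = memcache.get(key + '_version')
        if version is not None and key in _local \
                and _local_versions.get(key) == version:
            return _local[key]
        data = fetch()
        _local[key] = data
        _local_versions[key] = version
        return data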
Local memory + protobuf.
[AppStats screenshot: a single protobuf-serialized record from instance memory]
[AppStats screenshot: 1000 protobuf-serialized records from instance memory]
For a single record the time fluctuations are almost within the margin of error, but the list shows a serious symptom: the time and resources for instance memory and memcache are nearly identical. That is how easily serialization can destroy a performance gain.
Local memory + protobuf + zlib.
And for completeness, retrieving data from instance memory with zlib:
[AppStats screenshot: a single compressed record from instance memory]
[AppStats screenshot: 1000 compressed records from instance memory]
No miracle happened: fetching the list still takes a long time and consumes a lot of resources.
Findings.
- Small single objects are often not worth caching: it is better to fetch an object by key or key name straight from the database than to complicate the code over 10ms; a framework or a suboptimal algorithm can easily eat more than that.
- Serialization and compression should only be used with memcache; data in instance memory should be kept in ready-to-use form.
- Instance memory is a great place for frequently requested, rarely changing data. If your application is a small site whose data fits within 50MB, you can safely use instance memory for all "heavy" queries.
Application code.
Download the test application code here. It is better not to use the cacher from that archive in your projects; instead, take from here the version optimized in light of the data above.