
Frontend caching: Flask, Nginx + Memcached + SSI

For a long time, articles on this subject had been catching my eye:

I'm on good terms with PHP, so I tried the examples and made sure they worked. But all of it had one “fatal flaw” :) PHP. I am a Python fan and mostly do backend work, so, seriously speaking, I never managed to put any of it into practice.

However, at the beginning of the year I was invited to take part in an ambitious project that promised high load and all the other perks that come with it. While business plans were being drawn up and investors were being sought, I decided to study topics that I thought would be useful for the job, caching among them.

First, I implemented a draft solution for my favorite framework, Flask, using the Varnish + ESI stack for caching. It worked and even showed good results. Later I came to realize that Varnish was probably an “extra player” here, and that the Nginx + Memcached + SSI bundle could be even more flexible. I built that variant as well; there was no noticeable difference in performance, but the latter felt more flexible and manageable.
That project never even taxied to the runway (or taxied, but without me). After some thought, I decided to tidy up the code and release it as open source for public review.

I will not describe the principle of caching page fragments in detail: the articles mentioned above cover it well enough, and Google and Yandex will help you find even more. I will instead focus on a specific implementation: Nginx + Memcached + SSI and Flask, using an extension I wrote.

In short, the principle fits into a few sentences. The result of the function that generates a page fragment is placed in memcached under a key, usually a URI uniquely corresponding to that fragment, while the page itself outputs a line of the form <!--# include virtual="<URI>" -->, where <URI> is the key under which the real content is cached. Then a suitably configured Nginx, encountering this instruction while proxying, replaces it with the real content fetched directly from the memcached server.
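A minimal Nginx configuration for this scheme might look like the sketch below. The server addresses, ports, and the /_inc/ location are assumptions for illustration; the key prefix must match whatever prefix the application uses when writing fragment bodies to memcached.

```nginx
# Hypothetical nginx.conf fragment: process SSI directives on proxied pages
# and serve /_inc/ fragments straight from memcached.
location / {
    ssi on;                              # handle <!--# include virtual="..." -->
    proxy_pass http://127.0.0.1:5000;    # the Flask application
}

location /_inc/ {
    set $memcached_key "fragment:$uri";  # must match the key the app writes
    memcached_pass 127.0.0.1:11211;
    error_page 404 502 504 = @app;       # fall back to the app on a cache miss
}

location @app {
    proxy_pass http://127.0.0.1:5000;
}
```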

Consider a typical site where every page has a block showing a greeting to the user along with the number of posts and comments they have made. Counting a user's messages is a fairly costly operation, and if we also draw their friends graph there, this fragment alone will put significant load on the database and, consequently, slow down overall page loading. But there is a solution! You can cache the content of this block as described above, so that the database is not queried every time the user opens another photo in an album. Nginx will serve this block without bothering the backend at all. The application only has to update the content in the cache when the user creates a new post or writes a comment.

This approach differs from the typical one, where the application itself fetches data from the cache and renders it into the page, in that Nginx is now responsible for that step. And Nginx is quite a thing: none of the frameworks I know can match its speed of content delivery.

Practical part


The extension, which I unpretentiously named Flask-Fragment, is published on GitHub under the MIT license. There are no tests and no documentation yet, but there is a fairly functional demo application: a “lite” version of a blog. If anyone besides me finds it interesting, I plan to extend the API somewhat, add support for the Varnish + ESI variant, and of course write the tests and documentation.

Enable caching

To extract a fragment for caching, you need to create a function that generates only the required part of the page and mark it with the fragment decorator as responsible for fragment generation. The Flask-Fragment extension provides this functionality and must be initialized first. Such functions, which I will call fragment views from here on, can take whatever parameters they need and should return content suitable for insertion into a web page.
    from flask import Flask
    from flask.ext.fragment import Fragment

    app = Flask(__name__)
    fragment = Fragment(app)

    @fragment(app, cache=300)
    def posts_list(page):
        page = int(page)
        page_size = POSTS_ON_PAGE
        pagination = Post.query.filter_by().paginate(page, page_size)
        posts = Post.query.filter_by().offset((page-1)*page_size).limit(page_size).all()
        return render_template('fragments/posts_list.html', pagination=pagination, posts=posts)

In the template of the main page the fragment call is made in the following form:
    <div class="content">
    {% block content %}
        {{ fragment('posts_list', page) }}
    {% endblock %}
    </div>

Now, on the first call of the fragment with the parameter page=2, the result of the posts_list function will be placed in memcached under the key fragment:/_inc/posts_list/2, and an instruction for Nginx will be inserted into the page. It will look like this:
    <div class="content">
        <!--# include virtual="/_inc/posts_list/2" -->
    </div>

In addition, the key fragment:fresh:/_inc/posts_list/2 with a value of 1 will also be placed in memcached. The extension intercepts calls to the posts_list function and will not run it to generate content while this key is in the cache and has a value > 0.

The TTL for the fragment:/_inc/posts_list/2 key will be set to 300 (which we defined in the fragment decorator's cache parameter) plus the FRAGMENT_LOCK_TIMEOUT value set in the configuration, 180 by default. The TTL for the fragment:fresh:/_inc/posts_list/2 key is set only to the specified value, 300. After that, having encountered the <!--# include virtual="/_inc/posts_list/2" --> instruction, Nginx will take this fragment's content from memcached without hitting the application for up to 480 seconds. In practice Nginx will not run into TTL expiry: the application will refresh the content after 300 seconds, once the fragment:fresh:/_inc/posts_list/2 key no longer exists.
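The two-key scheme can be sketched with a fake in-memory cache. The FakeMemcache class and the literal fragment body below are illustrative assumptions, not the extension's actual internals; only the key names and TTL arithmetic follow the description above.

```python
import time

class FakeMemcache:
    """Minimal stand-in for a memcached client: get/set with a TTL."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._store[key]
            return None
        return value

CACHE_TTL = 300       # the cache=300 decorator argument
LOCK_TIMEOUT = 180    # FRAGMENT_LOCK_TIMEOUT default

mc = FakeMemcache()
uri = "/_inc/posts_list/2"

# The body lives for TTL + lock timeout; the freshness marker for TTL only.
mc.set("fragment:" + uri, "<ul>...rendered posts...</ul>", CACHE_TTL + LOCK_TIMEOUT)
mc.set("fragment:fresh:" + uri, 1, CACHE_TTL)

# While the fresh key exists, the fragment view is not re-run:
needs_refresh = mc.get("fragment:fresh:" + uri) is None
```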

Cache reset

So the fragment is cached. By the way, the example above is taken from the demo application that ships with the Flask-Fragment package; it generates the list of posts with the number of comments on each of them. Accordingly, when a user adds a post or a comment, the list content in the cache becomes stale and needs to be updated. Below is an example of the Flask view that is called when a post is added.
    @app.route('/new/post', methods=['GET', 'POST'])
    @login_required
    def new_post():
        form = PostForm()
        if form.validate_on_submit():
            form.post.author_id = current_user.id
            db.session.add(form.post)
            db.session.commit()
            fragment.reset(posts_list)
            fragment.reset(user_info, current_user.id)
            flash('Your post has saved successfully.', 'info')
            return redirect(url_for('index'))
        return render_template('newpost.html', form=form)

There are two calls to the fragment.reset method here. The first, fragment.reset(posts_list), resets the cache for the posts_list fragment view; the second, fragment.reset(user_info, current_user.id), resets the cache for the user-greeting block I gave as an example at the beginning of the article, since it displays the user's total number of posts and comments. That fragment is uniquely addressed by the URI /_inc/user_info/21, where the trailing number is the user's id. The extension resets the key itself, forming it from the parameters passed to fragment.reset.
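The mapping from a fragment view and its parameters to the key being reset can be sketched as follows. The fragment_key helper, its prefix, and base arguments are hypothetical names introduced for illustration; they merely mirror the URI layout described above.

```python
def fragment_key(view_name, *args, prefix="fragment:fresh:", base="/_inc"):
    """Hypothetical sketch: build the freshness key for a fragment view
    from its name and positional parameters, mirroring the URI layout."""
    parts = "/".join(str(a) for a in args)
    uri = f"{base}/{view_name}" + (f"/{parts}" if parts else "")
    return prefix + uri

# Resetting user_info for user 21 would delete this key:
key = fragment_key("user_info", 21)  # fragment:fresh:/_inc/user_info/21
```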

The first case is trickier: it uses pagination, so there are as many keys to reset as there are currently generated pages in the list of posts. fragment:fresh:/_inc/posts_list/2, for example, is the key for the second page only. This cannot be done without some outside intelligence. Below is the code of a function that performs the custom cache reset for the posts_list fragment view.
    @fragment.resethandler(posts_list)
    def reset_posts_list():
        page_size = POSTS_ON_PAGE
        pagination = Post.query.filter_by().paginate(1, page_size)
        for N in range(pagination.pages):
            fragment.reset_url(url_for('posts_list', page=N+1))

Here the fragment.resethandler decorator defines a “custom” handler, in which the cache is reset for every page of the post list using the fragment.reset_url method.

In conclusion, here is one more block of code: the extension's own methods, which illustrate the key part of the functionality, forming and writing fragment contents into the cache.
    def _render(self, url, timeout, deferred_view):
        if self.memcache and timeout:
            if not self._cache_valid(url):
                self._cache_prepare(url, timeout, deferred_view)
            return jinja2.Markup('<!--# include virtual="{0}" -->'.format(url))
        else:
            return jinja2.Markup(deferred_view())

    def _cache_valid(self, url):
        return bool(self.memcache.get(self.fresh_prefix+url) or False)

    def _cache_prepare(self, url, timeout, deferred_view):
        successed_lock = self.memcache.add(self.lock_prefix+url, 1, self.lock_timeout)
        if successed_lock:
            result = Compressor.unless_prefix+(deferred_view()).encode('utf-8')
            self.memcache.set(self.body_prefix+url, result, timeout+self.lock_timeout)
            self.memcache.set(self.fresh_prefix+url, 1, timeout)
            self.memcache.delete(self.lock_prefix+url)

As you can see, an attempt is made to create a lock key. This prevents a race condition: the cache is updated only by the one thread that managed to set the lock, while the rest quietly skip regeneration, and the old data is returned to the client in the meantime.
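The lock relies on memcached's add operation, which succeeds only if the key does not already exist. A sketch with a fake client (the class and key names are illustrative, not the extension's internals):

```python
class FakeMemcache:
    """Stand-in for a memcached client; add() is an atomic create-if-absent."""
    def __init__(self):
        self._store = {}

    def add(self, key, value, ttl=0):
        if key in self._store:
            return False          # lock already held by another thread
        self._store[key] = value  # ttl ignored in this toy version
        return True

mc = FakeMemcache()

# Two concurrent requests race to regenerate the same fragment:
first = mc.add("fragment:lock:/_inc/posts_list/2", 1, ttl=180)
second = mc.add("fragment:lock:/_inc/posts_list/2", 1, ttl=180)
# Only the first caller wins the lock and rebuilds the content;
# the second serves the stale cached copy instead.
```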

Conclusion

What did we get? A serious offloading of the frontend and the database, clearly visible in the DebugToolbar panel when running the demo application. Later I plan to add a load test to the repository, built on the assumption that a blog user generates only 5% of requests adding posts or comments, the rest being reads. Even so, if you fill in two or three dozen posts with two or three dozen comments each, the difference is already noticeable on a weak virtual machine.

Caching can be turned off by setting the FRAGMENT_CACHING parameter to False in the config. In this case the application can work without being proxied through Nginx: the extension will insert the real fragment content by itself.

Thank you for your attention. I hope the article was interesting not only to web programmers interested in Python, but to anyone concerned with improving the performance of web applications. I also hope I have contributed to the popularization of the wonderful Flask framework.

Source: https://habr.com/ru/post/191788/
