Optimizing a flatpages project on Django for minimum system requirements. A joke article

There is a lot of text under the cut, but don't worry: you already know all the letters.

Backstory

Once upon a time, my beloved and I tried to make our first small project. Back then I only did design, so we had to hire a programmer: I gave him the mockups, he gave us the markup and the application itself. I remember hosting cost us about 130-150 rubles; of course it was LAMP.

As usually happens, the first pancake came out lumpy. Time passed and we were ripe for a second project. Now I could do everything myself. Having sketched the design, I started on the markup and then the application itself. Nothing substantial this time: a simple website with a handful of static pages, not even interesting.

For hosting, I reasoned the usual way: open the VPS tariffs page and pick the first line from the bottom, somewhere around 800-1000 rubles. "That much? We used to pay 200 rubles..." my spouse was puzzled. Indeed, why pay more? The answer, of course, is obvious: it is a VPS, whichever way you turn it, with Django here and there, "this is not some Joomla for you!" But what about Joomla? There it is: 100 rubles a month and live happily. But that answer is too simple, so it is not interesting.

Task

So we have a task: a minimal VPS and a finished Django project with static pages, a couple of forms, an admin panel, and so on. Synthetic minimum: 200 MB of RAM, a 500 MHz CPU, and on the hard drive, say, two gigabytes for everything including the system.

DB

The bottleneck in this problem is RAM. Take SQLite. Wait. What?! I'm serious, it is not so bad. We compensate for its shortcomings with a cache.

Cache

We take the most powerful combination, nginx + memcached, and teach it to work with Django.

Web server

It will be uWSGI: fast and fashionable. Fans of spherical-cow-in-a-vacuum benchmarks can go and look at them.

Sessions

Sessions need to be stored somewhere. I usually use Redis, but not here. Memcached is not suitable either: given the caching architecture, we will often flush the entire cache, and the sessions would be deleted along with it. The new Django 1.4 comes to the rescue with cookie-based sessions. The principle is simple: all session data is stored in the user's cookies. The data itself is not encrypted, but its integrity is ensured by a cryptographic signature. More can be read in the docs.
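To make the "not encrypted, but signed" idea concrete, here is a minimal sketch in plain Python 3. It is not Django's actual implementation: the cookie layout, the `sign`/`unsign` names, and the `SECRET_KEY` value are assumptions made up for the illustration.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"not-a-real-secret"  # hypothetical; Django would use settings.SECRET_KEY


def sign(session: dict) -> str:
    """Serialize the session and append an HMAC of the payload."""
    raw = json.dumps(session, sort_keys=True).encode()
    payload = base64.urlsafe_b64encode(raw).decode()
    mac = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + ":" + mac


def unsign(cookie: str) -> dict:
    """Return the session if the signature matches, else raise ValueError."""
    payload, _, mac = cookie.rpartition(":")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("cookie was tampered with")
    return json.loads(base64.urlsafe_b64decode(payload.encode()))
```

Anyone can base64-decode the payload and read it, so nothing secret belongs in such a session; what the signature prevents is changing the payload without knowing the key.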

Content

It's simple here: take Markdown. "Why Markdown?" It's cool. It's awesome. I am even writing this article in it. It helps keep the database small, since it contains no tags. Text in MD is sometimes half the size of what an fckeditor-like editor produces, and it is 100 times more pleasant to the eye. In addition, the text should be formatted by the site's styles, not the editor's: one set of styles to rule them all, just like Sauron.

Software

Take Ubuntu Server minimal to get the newest and fastest software: nginx, memcached, Python 2.7. In general it would be better to take something lighter such as SliTaz, Puppy Linux, or Slax, if only because the minimum for Ubuntu is 128 MB of RAM, and the tuned "base system" plus some software took up more than a gigabyte on my disk. A server distribution, sir. But this article is for fun, so we will know when to stop.

Development

Django

First we need to set up our Django config:

    # Cookie-based sessions: nothing is stored on the server.
    # HTTPONLY keeps the session cookie out of reach of js.
    SESSION_ENGINE = 'django.contrib.sessions.backends.signed_cookies'
    SESSION_COOKIE_HTTPONLY = True

    # https://docs.djangoproject.com/en/1.4/ref/settings/#std:setting-USE_I18N
    # The docs say: "This provides an easy way to turn it off, for performance."
    USE_I18N = False

    # https://docs.djangoproject.com/en/1.4/ref/settings/#use-l10n
    # Localized formatting is off too; pytils can format dates where needed.
    USE_L10N = False

    # https://docs.djangoproject.com/en/1.4/ref/settings/#use-tz
    # Time zone support is off.
    USE_TZ = False

    # Context processors we do not use are commented out.
    TEMPLATE_CONTEXT_PROCESSORS = (
        # default template context processors
        "django.contrib.auth.context_processors.auth",
        # "django.core.context_processors.debug",
        # "django.core.context_processors.i18n",
        # "django.core.context_processors.media",
        "django.core.context_processors.static",
        # "django.core.context_processors.tz",
        "django.contrib.messages.context_processors.messages",
        # required by django-admin-tools
        'django.core.context_processors.request',
        'orangetrans.utils.context_processors.common',
    )

    # No Sentry here: errors are mailed to the admins instead.
    # include_html attaches the full debug page to the email.
    # https://docs.djangoproject.com/en/1.4/topics/logging/#django.utils.log.AdminEmailHandler
    LOGGING = {
        …
        'handlers': {
            'mail_admins': {
                'level': 'ERROR',
                'filters': ['require_debug_false'],
                'class': 'django.utils.log.AdminEmailHandler',
                'include_html': True,
            }
        },
        …
    }

    # Also mail the admins about broken links (404s)
    SEND_BROKEN_LINK_EMAILS = True

Cache

This is the strongest part of the project. The cache will be our shield in front of a slow backend. Many have, of course, already guessed what this is about.

How does it work? Simple: the user requests a page and nginx checks whether it is in the cache; if not, Django generates it, puts it in the cache, and hands it to nginx, and next time the frontend serves it directly. In other words, it is enough to walk through the site once, and from then on the site runs at the speed of memcached + nginx.

Really cool? Well, almost: this type of cache suits our case well, but for a complex project it has many drawbacks. For example, we cannot re-render a form the user has filled in, because every request for the URL returns the same form from the cache, without the entered data or the resulting errors. We are not chasing fashion here :)
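The request flow described above can be imitated in a few lines of plain Python. This is only a toy model, assuming a dict in place of memcached and two hypothetical functions, `frontend` and `backend`, in place of nginx and Django.

```python
cache = {}          # stands in for memcached
backend_calls = []  # records which URLs actually reached the backend


def backend(url: str) -> str:
    """Stand-in for Django: render the page and store it in the cache."""
    backend_calls.append(url)
    page = "<html>page for %s</html>" % url
    cache[url] = page
    return page


def frontend(url: str) -> str:
    """Stand-in for nginx: serve from the cache, fall back to the backend on a miss."""
    page = cache.get(url)
    if page is None:
        page = backend(url)
    return page
```

Calling `frontend("/about/")` twice reaches `backend` only once; the second answer comes straight from the dict. It also shows the drawback: whatever the backend rendered first is what every later visitor gets.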

Django

The form will be submitted via Ajax, so we only need to render it once; the data and errors are handled in js. The fallback without Ajax (you always make one, right? :) will not be cached. One more point: our form works through POST. Starting with some Django version (1.2?), CSRF protection became mandatory for POST requests. In our case it is a callback request: one number field and no personal data. To disable the protection for a specific view, simply wrap it in the csrf_exempt decorator.

To implement our plan, we write a middleware. There is an article on the Internet about making such a combination work. It was written for earlier Django versions, and I could not make sense of it without reading the source code. I will quietly fix and update it. So, the middleware:

    import re

    from django.core.cache import cache
    from django.conf import settings


    class NginxMemCacheMiddleWare:
        def process_response(self, request, response):
            url = request.get_full_path()
            cache_it = not settings.DEBUG \
                and request.method == 'GET' \
                and response.status_code == 200
            if cache_it:
                stoplist = [
                    x for x in settings.CACHE_IGNORE_REGEXPS
                    if re.match(x, url)
                ]
                if not stoplist:
                    cache.set(url, response.content)
            return response

We cache only GET requests, and we do not cache during development or for 404 responses. Why? Because otherwise bots hunting for the admin panel, or an attacker, could fill the memory with garbage.
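The caching rule can be pulled out into a standalone function that is easy to check without Django. This is a sketch under assumptions: `should_cache` and the `IGNORE` tuple are hypothetical names, and the patterns are just examples of a stoplist.

```python
import re

# Hypothetical stoplist of URL patterns that must never be cached.
IGNORE = (re.compile(r"/admin.*"), re.compile(r"/some_url.*"))


def should_cache(method: str, status: int, url: str, debug: bool = False) -> bool:
    """Cache only successful GET responses outside of development."""
    if debug or method != "GET" or status != 200:
        return False
    # re.match anchors at the start of the URL, as in the middleware.
    return not any(pattern.match(url) for pattern in IGNORE)
```

Because only 200 responses pass, a bot probing random URLs produces 404s that never reach the cache.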

Let's go back to the config:

    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
            'LOCATION': 'unix:/tmp/memcached.sock',
            'KEY_PREFIX': 'YOURSITE',
        }
    }

Everything is simple: the memcached backend, a socket, and a prefix. By the way, I could not get pylibmc to work over a socket: it did not like something, whatever I tried. I took python-memcached.

    import re  # needed at the top of settings.py

    CACHE_IGNORE_REGEXPS = (
        re.compile(r'/admin.*'),
        re.compile(r'/some_url.*'),
    )

This is the list of URLs excluded from caching, mentioned just above. Do not forget to add our new middleware to MIDDLEWARE_CLASSES in settings.

By the way, you can use the django-memcache-status app to monitor memcached. It draws a nice gauge of used versus allocated memcached memory in the admin panel and shows a lot of information. To be honest, I did not find any particular use for it :) it just livens up the rather dull flatpages admin. Unfortunately it does not get along with django-admin-tools.

Nginx

Config for the dev server; we will update it later:

    upstream dev {
        server 127.0.0.1:8000;
    }

    server {
        listen 80;
        server_name dev.local;
        charset utf-8;

        location ~^/(media|static) {
            root /home/user/project;
            access_log off;
            break;
        }

        location / {
            if ($request_method = POST) {
                proxy_pass http://dev;
                break;
            }
            default_type "text/html; charset=utf-8";
            set $memcached_key "YOURSITE:1:$request_uri";
            memcached_pass unix:/tmp/memcached.sock;
            error_page 404 502 = @fallback;
        }

        location @fallback {
            proxy_pass http://dev;
        }
    }

See the line 'set $memcached_key "YOURSITE:1:$request_uri";'? This is where all the magic happens. Each page is stored in the cache under a key made of the prefix + the cache version + the full URL. It is enough to open a page once: Django puts it in the cache under that key, and next time nginx fetches it by the same key. The site stays up even if you shut Django down. POST requests go straight to the backend without touching memcached. Another point: I use GET parameters for the fallback mode when the user has js disabled (I run NoScript :P), for all sorts of tabs; that is why the key uses $request_uri rather than $uri, the latter being the URL without the GET parameters.

Starting with Django 1.3, we no longer need to build the whole key ourselves. It is enough to pass the unique part to the set(), get(), and delete() methods, and Django does the rest on its own. The unique part in our case is the full path of the page, query string included (request.get_full_path()).

There is a small problem with our usual gzip: nginx does not compress what it takes from memcached. I found a 4-year-old patch online for nginx 0.6, while 1.* is already current. If you do not serve IE6, you can turn on compression in Django itself and store already-compressed content. In our case the ancient browser matters. For everything else, gzip will work.
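For reference, Django's default key function joins the prefix, the version, and the key with colons, which is exactly the shape nginx has to reproduce in $memcached_key. A minimal stand-in for it, with the hypothetical name `make_key`, looks like this:

```python
def make_key(key: str, key_prefix: str = "YOURSITE", version: int = 1) -> str:
    """Build the final cache key the same way Django's default key function does."""
    return "%s:%s:%s" % (key_prefix, version, key)
```

So `cache.set('/about/?tab=2', ...)` with KEY_PREFIX 'YOURSITE' ends up in memcached under `YOURSITE:1:/about/?tab=2`. If the prefix or the version ever changes in settings, the nginx config has to change with it, otherwise every lookup misses.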

Memcached

Let's make memcached work through a socket, in /etc/memcached.conf:

    # -s unix socket path to listen on (disables network support)
    -s /tmp/memcached.sock
    # -a access mask for unix socket, in octal (default 0700)
    -a 0777

Statics

I use jquery and html5shim for IE. Praise Google, it will ease our lot:

    <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
    <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>

The rest of the static files are compressed with django-compressor. I use LESS, and django-compressor can also compile it to CSS. If you use LESS too, do not forget to install node.js and the compiler itself on the server. For all static files we enable compression and concatenation: js with js, css with css. Beauty.

The app also gives the files unique names, so when the sources change, new files with new names are compiled; the user downloads them afresh and never wonders "why does everything look so strange?". By the way, Django 1.4 can do this too.
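The idea behind such unique names is usually a hash of the file content. The sketch below shows the principle only; it is not the naming scheme django-compressor or Django actually uses, and `hashed_name` is a hypothetical helper.

```python
import hashlib


def hashed_name(filename: str, content: bytes, length: int = 12) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: just append the hash
        return "%s.%s" % (filename, digest)
    return "%s.%s.%s" % (stem, digest, ext)
```

The same content always yields the same name, so browsers can cache the file indefinitely; any change in the content yields a new name and therefore a fresh download.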

Local test

OK, we are almost done.
Take your favorite virtual machine, install the system, and deploy our project. While we are at it, let's prepare everything for production: update the pip requirements, which for some reason are always out of date, and configure the uWSGI server. There is one catch with the processor: VirtualBox and VMware Player cannot allot a specific number of hertz, so we limit ourselves to one core.

For setting up the uWSGI server, I recommend reading the docs and another article on welinux. There is no point explaining it here: take the example config from the docs and run it. The only thing I recommend is to specify virtualenv, because most problems come from the pythonpath and the environment. Oh, how much time I wasted before I thought to look at the docs. Judging by Google, I am not the only one; people contort themselves as best they can, adding paths to the environment from scripts and so on. And all it takes is the good old RTFM.

So, everything is ready. Let's open the browser in private mode so that everything is fair, and measure how long the page takes to load:

 18 requests ❘ 284.93KB transferred ❘ 1.11s (onload: 1.12s, DOMContentLoaded: 979ms)

Turn on the cache, warm it up, and open a new private window:

 18 requests ❘ 298.09KB transferred ❘ 291ms (onload: 293ms, DOMContentLoaded: 150ms)

Perform a POST request (I will explain why later). It runs get_or_create a couple of times:

 21 ms

As you can see, there is a gain, as well as a little extra weight without gzip.
Cool? Cool.

It can be cooler

Here is what we do:

since version 1.3 Django collects the static files of all applications into one static folder (manage.py collectstatic), and you could not ask for better. We put all of it in memory;
our database is a file and the application accesses it as a file, so we put it in memory too;
we set up copying to disk once an hour so that we can sleep easy.

Static files take 2 megabytes and the database will not grow beyond a couple of megabytes, so we take 5 MB. Create a directory in memory:

    # add to /etc/fstab:
    tmpfs /mnt/project/ tmpfs size=5M,mode=0777 0 0

    # create the mount point
    $ sudo mkdir /mnt/project/

    # mount it right away, without rebooting:
    $ sudo mount /mnt/project/

    # mode=0777 lets us create the directories without sudo
    $ mkdir /mnt/project/static
    $ mkdir /mnt/project/db

If you have laid out your project correctly, all static files live in the application directories. To do this, put static files in each application's static directory; more on this in the docs. That way Django can find them and collect them in the right place. We do:

    # settings: collect and compress static into the in-memory directory
    STATIC_ROOT = COMPRESS_ROOT = '/mnt/project/static/'

    # collect static (--noinput skips the prompt, -c clears old files first):
    $ ./manage.py collectstatic --noinput -c

    # settings: point the database at the in-memory directory
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': '/mnt/project/db/db',
        }
    }

    # copy the database
    $ cp -r project/db /mnt/project/db/db

    # reload uWSGI
    $ killall -HUP uwsgi

Automating all this at startup and the backup via cron are left as homework.
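One possible take on the hourly copy to disk, as a sketch for a modern Python 3 (3.7+) rather than the Python 2.7 used elsewhere in the article. A plain cp of a SQLite file that is being written to can capture an inconsistent state, so this version uses SQLite's online backup API instead. The function name and both paths are assumptions.

```python
import os
import sqlite3


def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a live SQLite database to dst_path via the online backup API."""
    tmp_path = dst_path + ".tmp"
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(tmp_path)
    try:
        src.backup(dst)  # consistent snapshot even while the app is running
    finally:
        dst.close()
        src.close()
    # Swap the finished copy into place; atomic within one filesystem.
    os.replace(tmp_path, dst_path)
```

A cron entry that runs a small script calling `backup_sqlite('/mnt/project/db/db', ...)` once an hour would cover the "sleep easy" item; restoring at boot is the same call with the arguments swapped.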

Update the nginx config:

    upstream project {
        server unix:/tmp/project.sock;
    }

    server {
        listen 80;
        server_name project.ru www.project.ru;

        if ($host = www.project.ru){
            rewrite ^(.*)$ http://project.ru$1 permanent;
        }

        location /static {
            root /mnt/project/;
            access_log off;
            expires max;
        }

        location /media {
            root /project/;
            access_log off;
            expires 5d;
        }

        location / {
            include /etc/nginx/uwsgi_params;
            if ($request_method = POST) {
                uwsgi_pass project;
            }
            default_type "text/html; charset=utf-8";
            set $memcached_key "project:1:$request_uri";
            memcached_pass unix:/tmp/memcached.sock;
            error_page 404 502 = @fallback;
        }

        location @fallback {
            include /etc/nginx/uwsgi_params;
            uwsgi_pass project;
        }
    }

Media physically stayed where it was. Static moved; the rest we have already seen.
Note the 'expires max' line for static: isn't that uncool? What if we fix an image tomorrow, hm? If you use django-compressor, you will be pleasantly surprised to see this in the CSS: /static/i/button-small.png?e59177bafc62.

Without cache:

 14 requests ❘ 284.93KB transferred ❘ 743ms (onload: 775ms, DOMContentLoaded: 708ms)

With cache:

 14 requests ❘ 298.09KB transferred ❘ 174ms (onload: 175ms, DOMContentLoaded: 140ms)

POST:

 17ms

There is a gain. Small, but it is there. And with more traffic it will become noticeable.
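To put numbers on the gain, the page load times quoted above can be turned into ratios. The helper name `speedup` is made up for the example; the figures are the ones from the measurements in this article.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the 'after' measurement is."""
    return before_ms / after_ms


disk_cache_gain = speedup(1110, 291)   # first setup: no cache vs cache
tmpfs_cache_gain = speedup(743, 174)   # tmpfs setup: no cache vs cache
tmpfs_vs_disk = speedup(291, 174)      # cached pages: first setup vs tmpfs setup
```

That works out to roughly 3.8x from the cache in the first setup, about 4.3x in the tmpfs setup, and about 1.7x between the two cached variants. These are single measurements on a virtual machine, so treat them as an illustration, not a benchmark.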

Backups

Usually I am wasteful enough to use Dropbox, but WebDAV is more appropriate here. You will find simple and fairly detailed instructions on MyDrive. Luckily there are plenty of such services now, so take your pick. This lets us save disk space and send backups to the cloud at the same time.

So what have we done?

we saved resources by replacing MySQL with SQLite (and spent a few of them by putting the database in memory :), though in return we gained speed;
we used cookie-based sessions: they live on the user's side, so we need not worry about storing them, cleaning them up, and other nonsense;
we optimized Django, mainly the admin panel, by turning off translation; this is the non-cacheable part, so it matters;
nginx, Django, and memcached talk through sockets: I don't know how much this adds, but it should be faster than through ports;
we used what is probably the fastest cache available and, used properly, it lets us outrun much fancier sites;
part of the static content is hosted externally, so the site loads even faster: from different sources or from the local cache (for something popular);
we compressed our static files, put them in memory, and serve them through nginx;
backups? Check.

So what's the joke?

It was interesting for me to solve this problem just like that, out of curiosity and because I am not used to working in tight conditions. Mostly the solutions are harmless and even useful, but some are an extreme sport: the chosen cache is impractical, and SQLite will not let the project grow much (a back office for call processing, for example). And in my right mind I would not skimp on the server, so all this is nothing more than fun.

Looks like that's it. Thanks for your attention.

Source: https://habr.com/ru/post/142241/
