
Django ORM is slow? Optimize it (hardcore)

Autumn came, and despite the good weather our viewers reached for fresh video content. The backend servers serving video embeds began to run out of CPU. Armed with system metrics, the sysadmins came running and started confiscating laptops and desktops from the development department, threatening to haul them off to the data centers "as reinforcements". The developers liked none of this, and it was decided to do something about the whole mess.


In general, the highload world offers plenty of solutions, from the wild "let's rewrite the hardest parts in Go" to the banal "cache everything". This time, however, we will talk about speeding up something that can be neither rewritten nor cached: the business logic that resolves how a video should be displayed.


Over five years of implementing business requirements, the video display logic has absorbed the targeting of sponsored links, video delivery protocols, statistics counters, player appearance, and much more, all depending on geographic location, browser, device type, the site hosting the embed, and the video's content metadata: about a dozen parameters in total.


As a result, each individual view requires about a dozen MySQL queries, whose formatted text does not even fit on one screen. The queries hit small tables and execute quickly; the bottleneck is building those queries with Django ORM. After half an hour with a profiler, the author learned two things. First, using django-rest-framework in such a complex API pushes programmers toward copy-paste and re-calling the same functions within a single request; the record was parsing the referrer URL into subdomains and path segments six times. Second, query construction inside Django ORM is genuinely slow and eats up half of the request processing time.
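The repeated-parsing problem, at least, has a cheap fix independent of the ORM story: memoize the pure parsing function so the six calls per request cost one actual parse. This is a hedged sketch, not the project's actual code; `parse_referrer` is an assumed name:

```python
from functools import lru_cache
from urllib.parse import urlsplit

@lru_cache(maxsize=1024)
def parse_referrer(url):
    """Split a referrer URL into subdomains and path segments once;
    repeated calls with the same URL hit the in-process cache."""
    parts = urlsplit(url)
    subdomains = parts.hostname.split('.') if parts.hostname else []
    path_segments = [s for s in parts.path.split('/') if s]
    return subdomains, path_segments

subs, segs = parse_referrer('https://video.example.com/embed/42')
# subs == ['video', 'example', 'com'], segs == ['embed', '42']
```

A per-request cache (clearing between requests) would be safer against unbounded referrer diversity; `lru_cache(maxsize=...)` merely bounds the memory.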


The conclusion that the ORM had to be sped up was reached quite quickly; a sane implementation occurred to us literally just the other day. Before that, several options were analyzed.


str.format()


Building queries with string formatting would have solved our performance problems immediately and fundamentally, but it would have required rewriting dozens of pages of code, depriving the developers of sleep for several months and the managers of any hope that it would ever reach production. The option was dropped right away.


SQLAlchemy + Baked Queries


Switching to another ORM, faster and with built-in caching support, could have been an excellent option had it been chosen at the start of the project. As it is, even with tests in place, the question of whether the rewritten code is equivalent remains. Baked Queries in SQLAlchemy let you reuse a once-built SQL query on repeated calls with different parameters, keeping it in process memory instead of constructing the query from scratch every time. This idea was among the most tempting, until the variant that prompted this article came along.


Cache for Django ORM


Clearly, wedging caching into the internals of Django ORM is a non-starter, but what if we cache the already formatted SQL query? In theory it looks very simple:


    queryset = get_some_complex_sql(flag1=True, flag2=True)
    sql, params = queryset.query.sql_with_params()
    raw_queryset = Model.objects.raw(sql, params=params)

We take the memoization technique and modify it so that the cache key accounts not for the specific parameter values but for their types and presence or absence (fine, the special values True, False, None, 0 and 1 can be accounted for too). We remember the SQL; on a cache hit we substitute the new values into a RawQuerySet, and our cache is ready.
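The key-derivation idea fits in a few lines: instead of hashing parameter values, hash their types, keeping only the handful of SQL-shaping singletons verbatim. A minimal sketch with assumed names, not the project's actual API:

```python
# Sentinel values that change the generated SQL (IS NULL, boolean
# branches, etc.) and therefore must enter the cache key as-is.
_SPECIAL = (True, False, None, 0, 1)

def cache_key(func_name, **params):
    """Build a memoization key from parameter *shapes*, not values:
    two calls differing only in concrete values share one SQL text."""
    parts = []
    for name in sorted(params):
        value = params[name]
        if any(value is s for s in _SPECIAL):
            parts.append((name, repr(value)))  # keep the sentinel itself
        else:
            parts.append((name, type(value).__name__))  # keep only the type
    return (func_name,) + tuple(parts)

# Same shape -> same key -> one cached SQL text:
assert cache_key('q', user_id=7, flag=True) == cache_key('q', user_id=9, flag=True)
# Different shape (None vs int) -> different key, different SQL:
assert cache_key('q', user_id=None, flag=True) != cache_key('q', user_id=9, flag=True)
```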


In practice, problems begin almost immediately. params is always just a flat tuple of built-in Python types, and it is impossible to tell which element came from flag1 and which from flag2. Moreover, any filter of the form filter(value__in=[1, 2, 3]) changes the SQL text depending on how many values are passed to it, which leads to a combinatorial explosion of cache keys.


After an insane amount of coffee and cookies, the problem of matching actual parameters to formal ones was solved by walking the internal Query structures and finding where each parameter gets inserted into them. The result was a working but terrifying construction.


I am ashamed of this code ...
    @cached(
        'play_qualityrule',
        licensed=not_null_and_negate('licensed'),
        protected=not_null_and_negate('protected', default=False),
        is_active={'exact': True},
        # QualityRule.rightholders.filter(rightholder_id=...)
        rightholders__rightholder_id=null_or_equal('rightholder'),
        # QualityRule.user_agents.filter(useragent_id=...)
        user_agents__useragent_id=null_or_equal('user_agent'),
        # QualityRule.groups.filter(group=...)
        groups__group_id=null_or_in('group'),
        # QualityRule.alternative_sales_rule.filter(
        #     alternativesalesrule=...)
        alternative_sales_rule__alternativesalesrule_id=null_or_equal(
            'alternative_sales_rule'),
        # QualityRule.users.filter(user_id=...)
        users__user_id=null_or_equal('user'))
    def get_filtered_query(self, **kwargs):
        ...

In effect, every parameter involved in building the SQL query had to be described in the decorator, together with the rules for deriving it from the function's arguments. This pile of code was unacceptable both in the complexity of the initial implementation and in that of subsequent maintenance.


Promises and lazy evaluation


The next approach was inspired by django.utils.functional.lazy. It works as follows.


    def compute(param):
        return param ** 2

    compute_lazy = lazy(compute, int)
    lazy_value = compute_lazy(43)

If lazy_value is passed to Django ORM, the call to compute() is postponed until the last possible moment; the function is invoked every time the Promise object is explicitly cast to a real value.
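The mechanics can be mimicked outside of Django in a few lines. This is a simplified stand-in for django.utils.functional.lazy, written to demonstrate re-evaluation on every cast; it is not the real implementation:

```python
class Promise:
    """Defers a computation; re-runs it on every cast to a real value."""
    def __init__(self, func, *args):
        self._func, self._args = func, args
        self.calls = 0  # counter, just to demonstrate repeated evaluation

    def _resolve(self):
        self.calls += 1
        return self._func(*self._args)

    def __int__(self):   # cast to int -> evaluate
        return int(self._resolve())

    def __str__(self):   # cast to str -> evaluate again
        return str(self._resolve())

def lazy(func):
    def wrapper(*args):
        return Promise(func, *args)
    return wrapper

compute_lazy = lazy(lambda param: param ** 2)
lazy_value = compute_lazy(43)
assert int(lazy_value) == 1849    # evaluated here...
assert str(lazy_value) == '1849'  # ...and evaluated again here
assert lazy_value.calls == 2
```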


This property was used to fetch the actual parameter values from a context manager.


    def lazy_param(name):
        return ContextManager.instance.params[name]

The following idea was naively simple:



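The listing for this idea did not survive extraction; presumably it amounted to building the cached queryset once with such promises standing in for values, so that each execution would read fresh parameters from the live context. A speculative reconstruction, with all names (LazyContext in particular) assumed:

```python
class LazyContext:
    """Context manager publishing the current request's parameters,
    for lazy_param() to read at evaluation time (assumed design)."""
    instance = None

    def __init__(self, **params):
        self.params = params

    def __enter__(self):
        LazyContext.instance = self
        return self

    def __exit__(self, *exc):
        LazyContext.instance = None

def lazy_param(name):
    # Resolved only when the (cached) query is actually executed,
    # so the cached SQL picks up per-request values.
    return LazyContext.instance.params[name]

with LazyContext(user_id=42):
    assert lazy_param('user_id') == 42
with LazyContext(user_id=7):
    assert lazy_param('user_id') == 7  # same code, fresh value
```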
Unfortunately, it did not work out. It turned out that Django ORM, for its own murky purposes, performs type casting and list length checks already at QuerySet construction time: for example, in a call like filter(a__in=[1, 2, 3]) the value is inspected, and when the list is empty the corresponding node, instead of adding a new condition to WHERE, raises EmptyResultSet, ostensibly to optimize the query. Thanks a lot.


Accordingly, no room is left for any lazy evaluation.


Lazy!


It was a pity to throw away such a great idea, so it was decided to write a class that stays lazy no matter what happens to it.



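The original listing is missing here; the gist, though, is a wrapper that survives whatever the ORM does to it at QuerySet construction time (length checks, iteration) and hands over its real value only when the SQL parameters are finally bound. A toy sketch under those assumptions, with all names invented for illustration:

```python
class Lazy:
    """Stays a Lazy through the ORM's eager checks; the real value is
    substituted only at parameter-binding time via reveal()."""
    def __init__(self, resolve):
        self._resolve = resolve  # callable producing the actual value

    def __len__(self):
        # Report a non-zero length so an __in check never collapses
        # the query node into EmptyResultSet at construction time.
        return 1

    def __iter__(self):
        # Iterating yields the wrapper itself, keeping it lazy.
        return iter([self])

    def reveal(self):
        return self._resolve()

def reveal(value):
    """Force a Lazy (or pass a plain value through)."""
    return value.reveal() if isinstance(value, Lazy) else value

value = Lazy(lambda: [1, 2, 3])
assert len(value) == 1               # the eager length check is fooled
assert reveal(value) == [1, 2, 3]    # the real value appears only here
```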
Thanks to all this "laziness", the Lazy wrappers could be carried all the way through SQL query formation, along the way solving the problem of the variable number of placeholders in the IN-list check.


    sql, params = "SELECT * FROM bla WHERE a IN (%s) AND b = %s", ([1, 2, 3], 4)
    placeholders = get_placeholders(params)
    sql = sql % placeholders
    params = flatten(params)
    # "SELECT * FROM bla WHERE a IN (%s, %s, %s) AND b = %s", (1, 2, 3, 4)

When the actual parameters are substituted, the expression IN (%s) is replaced with IN (%s, %s, %s), with the number of placeholders matching the actual number of values in the list, and the params tuple is flattened.
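The expansion and flattening from the snippet above fit in a few lines of plain Python; get_placeholders and flatten are the names used in the snippet, sketched here rather than taken from the project:

```python
def get_placeholders(params):
    """For each parameter produce '%s', or '%s, %s, ...' for a list,
    so IN (%s) expands to the actual number of values."""
    return tuple(
        ', '.join(['%s'] * len(p)) if isinstance(p, (list, tuple)) else '%s'
        for p in params
    )

def flatten(params):
    """Unnest list parameters into one flat tuple of scalars,
    matching the expanded placeholders positionally."""
    flat = []
    for p in params:
        if isinstance(p, (list, tuple)):
            flat.extend(p)
        else:
            flat.append(p)
    return tuple(flat)

sql, params = "SELECT * FROM bla WHERE a IN (%s) AND b = %s", ([1, 2, 3], 4)
sql = sql % get_placeholders(params)
params = flatten(params)
assert sql == "SELECT * FROM bla WHERE a IN (%s, %s, %s) AND b = %s"
assert params == (1, 2, 3, 4)
```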


Now the code using caching looks much more elegant.


    @cached
    def get_filtered_query(self, **params):
        ...
        return queryset

    def useful_method(self, **params):
        with LazyContext(**params):
            qs = self.get_filtered_query(**params)
            # query the database
            return list(qs)

As a result, we defeated the slow ORM by caching SQL query texts, coped with the combinatorial explosion of cache keys, kept the business logic almost intact, and retained all of Django ORM's facilities for building database queries.


What's the price?


The claim that the business logic remained intact is marketing. In practice, serious refactoring was required, above all to hoist every possible branch out of the cached code. Why that matters is best shown by example.


    @cached
    def get_queryset(user):
        if user and user.is_authenticated():
            return Model.objects.exclude(author=user)
        return Model.objects.filter(public=True)

It can be seen that the SQL query depends on two facts: whether a user is passed to the function at all, and whether that user is authenticated.



The decorator can track in its state the fact that a user is passed to the function; checking authentication, however, is outside its competence, as is, say, the fact that this user's ID is a prime number. Such branches, based on method calls on the objects passed in, on extra database queries, or on global context state (yes, even datetime.now()), all have to be factored out. Fortunately, the rule is quite simple: compute every such condition in the caller and pass its result into the cached function as an explicit parameter.



The example above changes only slightly.


    @cached
    def get_queryset(user_id, is_authenticated):
        # branching needs the real value, so the Lazy flag is revealed
        if reveal(is_authenticated):
            return Model.objects.exclude(author_id=user_id)
        return Model.objects.filter(public=True)

    def caller(user):
        if user and user.is_authenticated():
            user_id = user.pk
            is_authenticated = True
        else:
            user_id = None
            is_authenticated = False
        with LazyContext(user_id=user_id, is_authenticated=is_authenticated):
            qs = get_queryset(user_id, is_authenticated)
            return list(qs)

In lieu of a conclusion


In our case, the backend optimization story started suddenly, and we were lucky to find the other bottleneck quickly, which allowed us to tackle this problem unhurriedly instead of bolting on the first solution at hand. Unit tests helped a lot wherever they existed. In production, the changes are rolling out with a gradually decreasing level of paranoia.



As for efficiency: in synthetic ab runs of the same query, we managed to accelerate from 31 to 44 requests per second. The result was obtained, first, for one specific piece of business logic; second, in a wildly synthetic situation; and third, on the first machine at hand. Still, what matters is that, rounding included, we got a 42% speedup.


The query cache implementation is available on GitHub.



Source: https://habr.com/ru/post/339420/
