Easy Queries with Django ORM Annotate and Query Expressions

There was once a time when ORM Django was considered very cute, but absolutely stupid. Although, the ability to produce Annotate and Aggregate have been there since time immemorial. And in version 1.8, the ability to apply database functions inside Query Expressions was added. And, of course, if a beginner jungist was not scared and read the introduction to these lines, he can safely read further: the article is aimed specifically at beginners.

Some time ago I had a task: to select values from a table by users. Moreover, these values must correspond to a certain regular expression. But this is not the end of the condition: from the selected expressions you need to pull out the substring. Again, on regular season. I did it pretty quickly, and I wanted to share the experience with those who still can not use Annotate and Query Expressions in practice.

I will try to describe the situation more precisely:

We have an almost standard Users model. Some users have different usernames. For example, manager, vasyaTheDirector, vovaProg, etc. But commercial users have names in the format {CountryCode} {RandomUniqueNumber}. For example, RU2525 or ES1672. Here we need to pull out from the database of all commercial users, but not all the information, but only the unique numbers without country codes.

The task is certainly interesting for beginners dzhangistov. Although, for mid-level developers, it may not be quite typical.

We begin with a simple one: for all users whose names begin with a two-letter country code, you can use the simple filter operation with the __irgex key on the field name.

from django.contrib.auth import get_user_model User = get_user_model() queryset = User.objects.filter(username__iregex=r'^[AZ]{2}\d+$')

We’ll get the following list:

[<User: RU123>, <User: RU124>, <User: RU125>, <User: EN123>, <User: EN124>, <User: EN125>, <User: EN126>, <User: UK123>, < User: UK124>, <User: UK1234>, <User: UK12345>]

Further more interesting. Django allows you to create annotations for the resulting values. For example, we need to count the number of Books that are associated with User by means of a ForeignKey. We can execute User.books.all () count (), or get the value immediately in the Queryset, using Annotate. We will declare the books_count field, which will be available to us, as a property of the resulting User instance, or as a dictionary key. Let's see how it will look not on an abstract example with books, but in the context of our task.

 from django.db.models import Func queryset = User.objects.annotate(username_index=Func()).filter(username__iregex=r'^[AZ]{2}\d+$')

Django has various functions for annotating values. For example, Max, Min, Avg, Count. They form part of the Query Expressions mechanism. These special expressions can be used both to describe the query and to change the values when they are received. Since version 1.8, we have the opportunity to use the built-in database functions. For example, we need to modify the received strings. So, we will use the functions associated with regular expressions.

I use PostgreSQL version 9.5, therefore, I need to find a function that will pull me a substring from a string. We find this function in the official documentation . The function is called: substring .

 from django.db.models import Func, F, Value queryset = User.objects.annotate(username_index=Func(F('username'), Value('(\d+)'), function='substring'))).filter(username__iregex=r'^[AZ]{2}\d+$')

As you can see, Func takes three arguments:

The field name wrapped in F () that we are modifying (in fact, the value of this field will be passed to substring)
The pattern by which the search for the substring
The name of the PostgreSQL function to which the previous arguments will be passed.

Well, we need to get the values in the form of a list:

 from django.db.models import Func, F, Value queryset = User.objects.annotate(username_index=Func(F('username'), Value('(\d+)'), function='substring'))).filter(username__iregex=r'^[AZ]{2}\d+$').values_list('username_index', flat=True)

We get this conclusion:

['123', '124', '125', '123', '124', '125', '126', '123', '124', '1234', '12345']

Accordingly, if we need to get unique user numbers for a particular country, change

 username__iregex=r'^[AZ]{2}\d+$'

 username__iregex=r'^RU\d+$'.

Well, now the fun part. What do you think, what SQL query does our code execute?

 SELECT substring("my_users_user"."username", (\d+)) AS "username_index" FROM "my_users_user" WHERE "my_users_user"."username"::text ~* ^[AZ]{2}\d+$

As you can see, the request is beautiful and does not need urgent ~~resuscitation~~ optimization.

Returning to the topic of the DJango ORM problems, indicated at the beginning of the article, I would like to emphasize that Annotate and Aggregate have existed in Django for a very long time. And, it turns out, just not everyone knew how to cook them. Although, the ability to execute Database functions without writing SQL queries is relatively recent. And we can do even more beautiful things.

PS
If you want to get data in a specific format, you can modify the code as follows:

 from django.db.models import IntegerField, ExpressionWrapper from django.db.models import Func, F, Value queryset = User.objects.annotate(username_index=ExpressionWrapper(Func(F('username'), Value('(\d+)'), function='substring'), output_field=IntegerField()))).filter(username__iregex=r'^[AZ]{2}\d+$').values_list('username_index', flat=True)

The output will be:

[123, 124, 125, 123, 124, 125, 126, 123, 124, 1234, 12345]

We wrapped Func () in ExpressionWrapper and specified the expected data type in output_field = IntegerField (). As a result, we got a list of integers, not strings.

Source: https://habr.com/ru/post/326278/

All Articles

Easy Queries with Django ORM Annotate and Query Expressions

More articles: