It's no secret that SQLAlchemy is the most popular ORM in Python. It allows you to write much more advanced things than most fellows Active Record. But the charge for this is more complex code, and in simple tasks like CRUD it is annoying.
How I have made Alchemy convenient, using the experience of the best Active Record ORM, read under the cut.
I started using SQLAlchemy, having previously worked with Active Record in Ruby on Rails, with Eloquent ORM in PHP and with Django ORM and Peewee in Python. All these Active Record ORM have a concise code, and I really like them.
Alchemy is very good: it can build quite complex queries, and the Data Mapper pattern also rules. But, alas, after Active Record, the Alchemy code seems too complicated, and this is a price to pay for flexibility.
During the year I worked with Alchemy on a complex project, and realized that for complete happiness I want to have:
create
in Alchemy, you need to create an object, add it to the session , and make it flushPost.objects.filter(user__name__startswith='John')
__repr__
: print(post)
gives something like <myapp.models.Post at 0x04287A50>
, it is not suitable for debuggingI sadly recalled the Active Record ORM, where all this was (of course, in each ORM there were not all features, but most). As time went on, the project became more complicated, and in the end could not stand it and decided the tasks described.
The solution is packaged in a well-tested and documented package . Works on a combat project and feels very good.
Yes, there is, but they are either hard to implement, or sharpened for a specific framework, or poorly documented.
I wanted to have a universal, easily connectable solution, for example, to write
from _ import ActiveRecordMixin class User(Base, ActiveRecordMixin): pass
and have ready Active Record.
The options "initialize Alchemy only through me" and additions to flask-sqlalchemy are no good.
What is not satisfied with the specific packages, see here .
More details about each task - below in the article.
About examples in the article
I will give examples for a simple blog with typical entities User
, Post
, Comment
.
class User(Base): __tablename__ = 'user' id = Column(Integer, primary_key=True) name = Column(String) posts = relationship('Post') class Post(Base): __tablename__ = 'post' id = Column(Integer, primary_key=True) body = Column(String) user_id = Column(Integer, ForeignKey('user.id')) user = relationship('User') comments = relationship('Comment') class Comment(Base): __tablename__ = 'comment' id = Column(Integer, primary_key=True) body = Column(String) user_id = Column(Integer, ForeignKey('user.id')) post_id = Column(Integer, ForeignKey('post.id')) rating = Column(Integer) user = relationship('User') post = relationship('Post')
You can play around with the code here .
There are ORM classes, initialization of pure Alchemy (without my package) and filling with initial data.
Install Alchemy:
pip install sqlalchemy
Then save the test application to a file and run:
python .py
After Active Record ORM, I did not understand why I should write 3 lines to create an object
bob = User(name='Bobby', age=1) session.add(bob) session.flush()
instead of one?
bob = User.create(name='Bobby', age=1)
I understand that a manual flush session is needed so that requests to the database go in one bundle, and indeed the unit of work pattern gives many advantages in terms of performance.
But in real web applications, most of the tasks are trivial CRUD , and because not 3 requests will be made in the database, but one, the gain is not great. In any case, it is not worth the complexity of the code. And in general, not for nothing that the creators of Django, Ruby on Rails, Laravel, Yii chose Active Record ORM.
Well, nothing prevents implement Active Record over Data Mapper ! For this, all that’s necessary is that when the application is initialized, the session is passed to the model
BaseModel.set_session(session) # # BaseModel.session
Now ORM has access to the session, and you can implement the methods save, create, update, delete , etc.
bob = User.create(name='Bobby', age=1) bob.update(name='Bob', age=21) bob.delete()
In fact, on the lower layer of SQLAlchemy there is an update of fields from the array, but I want to have it at the ORM level.
Well, I also want to quickly create a request for a model
User.query # session.query(User)
and quickly get the first or all records
User.first() # session.query(User).first() User.all() # session.query(User).all()
or find the record by id, collapsing with an error if necessary
User.find(1) # session.query(User).get(1) User.find_or_fail(123987) # ,
As a result, we get a full-fledged Active Record as in my favorite Django, Laravel and Ruby on Rails, but under the hood we have a powerful Data Mapper. Thus, we have the best of both worlds .
I am not against the standard approach of Alchemy, but for simple tasks I want to have a simpler code (see the spoiler at the end of the article )
A detailed description and examples, see here .
To solve the problem of N + 1 requests, each ORM has its own solutions.
Suppose we display on page 10 users and all posts of each user. To avoid 11 queries (1 for users and 10 for posts), in SQLAlchemy these posts can be added
session.query(User).options(joinedload('posts'))
or download a separate request
session.query(User).options(subqueryload('posts'))
Well, great! Only now if you need to display comments with posts, and with the comments of their authors? Alchemy allows it , but in practice it turned out to be inconvenient.
I would like to declaratively define the relationships that we want to load:
User.with_({ 'posts': { 'comments': { 'user': JOINED } } }).all()
it is possible and without magic lines:
User.with_({ User.posts: { Post.comments: { Comment.user: JOINED } } }).all()
In addition, you can set a different download strategy: joinedload or subqueryload .
The following code will lead to 2 requests: for posts (with posts by the authors authors) and for comments (with posts by the authors authors comments):
Post.with_({ 'user': JOINED, # 'comments': (SUBQUERY, { # 'user': JOINED # }) }).all()
Well, for simple cases, when you need a joinload or subqueryload of a couple of relations, there is a simple syntax (the with_subquery
method with_subquery
out just like in Eloquent ):
Comment.with_joined('user', 'post', 'post.comments').all() # : # session.query(Comment).options(joinedload('user'), joinedload('post'), # joinedload('post.comments')).all() User.with_subquery('posts', 'posts.comments').all() # : # session.query(User).options(subqueryload('posts'), subqueryload('posts.comments')).all()
Note that you can load nested relationships like posts.comments
. This is not my feature, but Alchemy (unfortunately, I could not find the dock on this feature).
A detailed description with examples can be found here .
PS Special thanks to my colleagues for the code.
The first thing that struck me when studying Django was the magic operators in the filters :
Entry.objects.filter(headline__startswith="What")
and absolutely struck by the filtering links :
Entry.objects.filter(blog__name='Beatles Blog')
it is simpler than a more "correct" decision in Alchemy:
session.query(Entry).join(Entry.blog).filter(Blog.name=='Beatles Blog')
blog__name
write blogg__name
. Such strings, unlike class properties like Entry.blog
, IDE will not inspect.In addition to aesthetics, magic lines allow you to build queries dynamically (for example, passing filters with UI):
filters = {'entry__headline__contains': 'Lennon', 'entry__pub_date__year': 2008} # Blog.objects.filter(**filters)
This is especially useful in applications where the user can build arbitrary filters.
Alas, in Alchemy there is no possibility to build queries so dynamically. The maximum that it allows is a simple filtering of the type "column = value":
session.query(MyClass).filter_by(name = 'some name')
Taking a ready-made solution as a model (which was still not enough ), I made my own analogue of Jange and now I can filter declaratively:
Post.where(rating__in=[2, 3, 4], user___name__like='%Bi%').all()
The string user___name__like
parsed and we understand that you need to posttify the Post.user relationship and apply the filter User.name.like('...')
.
I.e
Post.where(user___name__like='%Bi%').all()
turns into
session.query(Post).join(Post.user).filter(User.name.like('%Bi%')).all()
In fact, it may happen that in the query some table will appear 2 times.
Suppose I want to get users who commented on posts by Vasya
User.where(posts___comments___user___name='Vasya').all()
It turns out that there is a user whom I request, and there is a comment author.
The problem is solved through alias , i. The final query will contain 2 tables: user
and user_1
.
Of course, we cannot know in advance whether the tables will be repeated, so we make each relation that the join has its own alias:
post_alias = User.posts.property.argument() # relationship session.query(User).outerjoin(post_alias) # ..
Here is a simplified analogue of the real code :
from sqlalchemy.orm import aliased from sqlalchemy.sql import operators # {'posts___comments___user___name__like': 'Vasya'}. : relations = ['posts', 'comments', 'user'] # 1. , ___ attr_name = 'name' # 2. , ___ op_name = 'like' # 3. , __ # op_name. # OPERATORS = {'like': operators.like_op}, # OPERATORS[op_name] operator = operators.like_op value = 'Vasya' cls = User # cls query = session.query(cls) # # last_alias = cls for relation in relations: relation = getattr(last_alias, relation) # relation, User.posts next_alias = aliased(relation.property.argument()) # (Post User.posts) # alias query = query.outerjoin(next_alias) # / last_alias = next_alias # SQL- / ( User) attr = getattr(last_alias, attr_name) # User.name query = query.filter(operator(attr, value)) # SQL-, User.name print(query.all())
Here is the ready-to-run code .
and sort:
Post.sort('-rating', 'user___name').all() # sort by rating DESC, user name ASC
Moreover, since we automatically join join relations, it is logical to indicate to SQLAlchemy that the specified relations are already fuzzy with the help of contains_eager . Now, if the Post.user
relationship Post.user
used in Post.user
/ sorting, then we can get the user without an additional query :
post = Post.sort('user___name').first() print(post.user) # , ..
A detailed description with examples, see here .
Here is an example of magic filtering and sorting in a real project .
Question to all, especially to those who dislike magic lines:
In the real world, you have to simultaneously filter, sort, and even eager to load connections.
Suppose we filter and sort posts by the same Post.user
relation. It may be that filtering and sorting 2 times pridoyunyat the same attitude, which will affect the speed.
If you just write
session.query(Post).join(Post.user).join(Post.user)
then, indeed, Alchemy will make only one join.
The thing is that we make our own alias for each relationship (see the spoiler " how it's done " above), and therefore Alchemy does not know that 2 alias on Post.user
is essentially the same thing, and follow up on your own.
Therefore, filtering, sorting and eager load (yes, it too) had to be done in one function in order to have information about all required joins (or rather, to have a single list of alias, see the spoiler " how it's done ") and make them only one time:
Comment.smart_query( filters={ 'post___public': True, 'user__isnull': False }, sort_attrs=['user___name', '-created_at'], schema={ 'post': { 'user': JOINED } }).all()
A detailed description with examples can be found here .
__repr__
As a developer, it's very important for me to play with the application in the console. And it was very unpleasant after the beauty in Laravel to see this:
>>> session.query(Post).all() [<myapp.models.Post object at 0x04287A50>, <myapp.models.Post object at 0x04287A90>]
This is terribly uninformative. and very furious . Therefore, I decided to display at least ID
>>> session.query(Post).all() [<Post #11>, <Post #12>]
from sqlalchemy import inspect ids = inspect(post).identity # primary
In general, each model has 1-2 attributes that reflect its essence. For example, the user is the name, and the post is the user and the body of the post. Therefore, I made it possible to set these attributes declaratively:
class User(BaseModel): __repr_attrs__ = ['name'] # ... class Post(BaseModel): __repr_attrs__ = ['user', 'body'] # body is just column, user is relationship # ...
Now we have a very convenient __repr__
:
>>> session.query(Post).all() [<Post #11 user:<User #1 'Bill'> body:'post 11'>, <Post #12 user:<User #2 'Bob'> body:'post 12'>]
Detailed description with examples here .
Implemented features, without which I personally worked very hard with Alchemy.
Now life has become easier, because:
__repr__
.On a combat project, this helped greatly simplify support and improve readability of the code.
All features, again, implemented here .
I have been working very closely with Alchemy for a year on a complex combat project. All the tasks that I have brought are not a whim, but a real necessity.
I love Alchemy and its standard approach. I like its flexibility, control over SQL, explicitness, etc. I consider Alchemy the best ORM that I have met. No Active Record can compare with it.
But the flexibility fee is a longer code. And in simple tasks like CRUD, the code of Alchemy is much more complex than that in Active Record ORM.
That's why I wrote my package, which can sometimes be used as an addition to pure Alchemy.
My package is just an add- on for Alchemy. I still love Alchemy for what it is and do not propose to abandon its original use.
About Active Record:
I understand the power of the unit of work pattern to optimize application performance. But in simple tasks, when you need to simply create an object in the controller, writing the same 3 lines for this (create an object, add to a session, flush sessions) does not inspire. That's why I did Active Record over, rather than instead of a Data Mapper for simple tasks.
About filtering / sorting in Django style:
I know that magic strings like user___name__like
are implicit and fraught with runtime errors. But sometimes it can be convenient.
I also prefer to write explicitly : session.query(Post).join(Post.user).filter(User.name.like(...))
, but it happens that you don’t know in advance which filter field to use, or there are filters so complex that it is obvious to write - there will be a lot of code and figs in it then you will understand. Try to write here such a filter and sorting on pure Alchemy. You write something, but the code will be a lot.
And most importantly, you have to write a bunch of complex code every time. And so you can declaratively set the necessary filters and sorting
['product___price', 'product___user___name']
and cut off everything elseThanks for attention!
Source: https://habr.com/ru/post/324876/
All Articles