📜 ⬆️ ⬇️

Again about ORM performance, or a new promising project - Pony ORM

In my first article on Habrahabr, I wrote about one of the main problems of existing ORM (Object-Relational-Mapping, object-relational mappings) - their performance. Considering and testing two of the most popular and well-known implementations of ORM in python, Django and SQLAlchemy, I came to the conclusion: The use of powerful universal ORMs results in very noticeable performance losses. In the case of using fast DBMS engines such as MySQL, the performance of data access is reduced by more than 3-5 times .

Recently, I was contacted by one of the developers of the new ORM engine called pony and asked me to share my thoughts on this engine. I thought that these considerations may be of interest to the Habrahabr community.


Brief summary


')
I again conducted some performance tests, similar to those described in the previous article, and compared their results with the results shown by the pony ORM. To measure performance in the conditions of a cached parameterized request, I had to modify the test of receiving an object so that each new request received an object with a new key.

Result: pony ORM exceeds the best results of django and SQLAlchemy by 1.5-3 times, even without caching objects.

Why pony was better



I confess right away: I did not manage to use standard tools for equalizing pony ORM with django and SQLAlchemy. This happened because if django can be used to cache only specific queries, and in SQLAlchemy, prepared parameterized queries (with some nontrivial efforts), then pony ORM caches all that is possible . View text pony ORM diagonally revealed: cached

- ready SQL query text of a specific DBMS
- structure compiled from the query text
- broadcast relationship
- connections
- created objects
- read and modified objects
- requests for deferred reading
- requests for creating, updating and deleting objects
- queries to search for objects
- blocking requests
- requests to navigate through relationships and their modifications
- maybe something else that I missed

This caching allows you to execute code that uses it as quickly as possible, without worrying about cunning tricks about improving performance like the one I invented and described in desperation in one of my previous articles.

Of course, sometimes caching brings some inconvenience . For example, caching objects does not allow one to easily compare the state of an object in memory with its image in a table — this is sometimes necessary for correct processing of data that are competitively processed in different processes. In one of the pony releases, I personally would like to see the parameters that allow you to disable certain types of caching for a piece of code at will .

Wishes



What do I still lack in pony ORM to fully compare it with other ORM?

- data migration is an absolutely necessary procedure for large projects using ORM
- adapters to some popular DBMS, for example MS SQL
- full abstraction from a variety of DBMS in code
- access to full object metadata
- customization of field types
- complete documentation

What do I lack in modern ORM, which could be implemented in the pony ORM, while this project has not yet grown to a state of stagnation?

- use of mixed filters (access to the fields and methods of the object simultaneously in the filter)
- computable fields and indices on them
- composite fields (stored in several fields of the table)
- field of the nested object (field, which is a regular python object)
- linking objects from different databases

And of course, I would like to see a complete framework for creating applications that uses the pony ORM as the basis for effective access to the database.

update 2013-08-03



If you want to get answers from the Pony ORM authors to your questions, you can contact them at the following addresses: alexander.kozlovsky@gmail.com and m.alexey@gmail.com. Invites are welcome.

Applications



Test results


>>> import test_native >>> test_native.test_native() get row by key: native req/seq: 3050.80815908 req time (ms): 0.327782 get value by key: native req/seq: 4956.05711955 req time (ms): 0.2017733 


 >>> import test_django >>> test_django.test_django() get object by key: django req/seq: 587.58369836 req time (ms): 1.7018852 get value by key: django req/seq: 779.4622303 req time (ms): 1.2829358 


 >>> import test_alchemy >>> test_alchemy.test_alchemy() get object by key: alchemy req/seq: 317.002465265 req time (ms): 3.1545496 get value by key: alchemy req/seq: 1827.75593609 req time (ms): 0.547119 


 >>> import test_pony >>> test_pony.test_pony() get object by key: pony req/seq: 1571.18299553 req time (ms): 0.6364631 get value by key: pony req/seq: 2916.85249448 req time (ms): 0.3428353 


Test code


test_native.py

 import datetime def test_native(): from django.db import connection, transaction cursor = connection.cursor() t1 = datetime.datetime.now() for i in range(10000): cursor.execute("select username,first_name,last_name,email,password,is_staff,is_active,is_superuser,last_login,date_joined from auth_user where id=%s limit 1" % (i+1)) f = cursor.fetchone() u = f[0] t2 = datetime.datetime.now() print "get row by key: native req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. t1 = datetime.datetime.now() for i in range(10000): cursor.execute("select username from auth_user where id=%s limit 1" % (i+1)) f = cursor.fetchone() u = f[0][0] t2 = datetime.datetime.now() print "get value by key: native req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. 


test_django.py

 import datetime from django.contrib.auth.models import User def test_django(): t1 = datetime.datetime.now() q = User.objects.all() for i in range(10000): u = q.get(id=i+1) t2 = datetime.datetime.now() print "get object by key: django req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. t1 = datetime.datetime.now() q = User.objects.all().values('username') for i in range(10000): u = q.get(id=i+1)['username'] t2 = datetime.datetime.now() print "get value by key: django req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. 


test_alchemy.py

 import datetime from sqlalchemy import * from sqlalchemy.orm.session import Session as ASession from sqlalchemy.ext.declarative import declarative_base query_cache = {} engine = create_engine('mysql://testorm:testorm@127.0.0.1/testorm', execution_options={'compiled_cache':query_cache}) session = ASession(bind=engine) Base = declarative_base(engine) class AUser(Base): __tablename__ = 'auth_user' id = Column(Integer, primary_key=True) username = Column(String(50)) password = Column(String(128)) last_login = Column(DateTime()) first_name = Column(String(30)) last_name = Column(String(30)) email = Column(String(30)) is_staff = Column(Boolean()) is_active = Column(Boolean()) date_joined = Column(DateTime()) def test_alchemy(): t1 = datetime.datetime.now() for i in range(10000): u = session.query(AUser).filter(AUser.id==i+1)[0] t2 = datetime.datetime.now() print "get object by key: alchemy req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. table = AUser.__table__ sel = select(['username'],from_obj=table,limit=1,whereclause=table.c.id==bindparam('ident')) t1 = datetime.datetime.now() for i in range(10000): u = sel.execute(ident=i+1).first()['username'] t2 = datetime.datetime.now() print "get value by key: alchemy req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. 


test_pony.py

 import datetime from datetime import date, time from pony import * from pony.orm import * db = Database('mysql', db='testorm', user='testorm', passwd='testorm') class PUser(db.Entity): _table_ = 'auth_user' id = PrimaryKey(int, auto=True) username = Required(str) password = Optional(str) last_login = Required(date) first_name = Optional(str) last_name = Optional(str) email = Optional(str) is_staff = Optional(bool) is_active = Optional(bool) date_joined = Optional(date) db.generate_mapping(create_tables=False) def test_pony(): t1 = datetime.datetime.now() with db_session: for i in range(10000): u = select(u for u in PUser if u.id==i+1)[:1][0] t2 = datetime.datetime.now() print "get object by key: pony req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. t1 = datetime.datetime.now() with db_session: for i in range(10000): u = select(u.username for u in PUser if u.id==i+1)[:1][0] t2 = datetime.datetime.now() print "get value by key: pony req/seq:",10000/(t2-t1).total_seconds(),'req time (ms):',(t2-t1).total_seconds()/10. 

Source: https://habr.com/ru/post/188842/


All Articles