📜 ⬆️ ⬇️

Entity Framework and performance, second attempt

In my first attempt to close the hole in the performance of the Entity Framework, I considered only materialization. But further in the process of work, as one would expect, I came across another, more weighty limitation. The insertion, modification and deletion of records also occur slowly. For 100 inserts, EF sends 100 insert requests to the database without attempting to group them.

In addition, one unpleasant error was discovered in one of the projects: EF version 5.0.0, when working with Oracle, in Clob / Xml the fields do not allow inserting lines of more than 2000 characters.

A component was created for the solution, which I called Context Items, with the following capabilities:

1) Bulk Insert (MS Sql): in tables that do not have Identity as the primary key, it is possible to insert using the Bulk Insert method, which is supported by the Ms Sql Server database. In the case of Identity, there is no way to reliably get back the keys generated by the base when inserting with the Bulk Insert, therefore, for tables with such keys, several usual Insert-queries are used in one query. This works significantly slower than the Bulk Insert, but still faster than through EF.
')
2) Sequenced Bulk Insert (MS Sql): An alternative to Identity is usually the Guid, this solves the insertion problem, but creates another problem - due to the longer key of the Join operation, they start to work slower, besides this the Guid is inconsistent and therefore Clustered indexes do not bring their advantages . As a solution to this problem, starting with MS Sql Server 2012, it is possible to use Sequence to create primary keys. This allows the use of integer sequential keys, which allows the use of Clustered indexes, similar to Identity, and at the same time allows the use of Bulk Insert for insertion. The component only supports acyclic Sequence with increment 1.

3) Bulk Update (MS Sql): such an operation does not exist in the database itself, the component implements it sequentially by performing the following 4 operations:
    a) A temporary table is created that has the same set of fields as the target table.
    b) Bulk insert data into a temporary table
    c) A Join-Update operation is performed that transfers data from the records of the temporary table to the records of the target table 
       having matching primary keys.
    d) The temporary table is deleted. 

Due to the fact that the operation is not atomic, it is desirable to execute it in a transaction.

4) Bulk Delete (MS Sql): as well as Bulk Update, this operation occurs in 4 steps:
    a) A temporary table is created that has a set of fields that matches the primary key of the target table
    b) Bulk Insert is made to the temporary key table for the records to be deleted.
    c) A Join-Delete operation is performed.
    d) The temporary table is deleted.

5) Materialization: The function migrated from the previous version; in addition, I added a test project to the repository, which includes a comparison of the materialization performance with the micro-ORM Dapper. Context Items performs this operation about 3-5% faster than Dapper, and 40% faster than EF, when using AsNoTracking, without this, EF is still several times slower.

6) Array-bound Insert, Update and Delete (Oracle): The Bulk Insert component implemented in ODP.Net does not take into account neither triggers nor constraints, nor even primary keys. Therefore, it is not suitable for inserting directly into a table. In addition, it does not support transactions. Of course, you could try using a temporary table as in the case of MSSql, but I decided to use the method that is recommended by Oracle itself. The method is called array-binding. In short, a request sent to the database looks like we insert one record, but we pass not a set of fields as parameters, but a set of field arrays, and thus insert, update or delete an array of records, rather than one record. You can read about the method here.

7) Equality members: For all entities, methods GetHashCode and Equals are generated that work with the primary key.

The Context Items component supports EF version 5.0.0 - 6.1.3, as well as Oracle databases and MS Sql Server.
Only the database-first approach is supported, there is a limitation - the names of the entities must match the names of the tables. Hands did not reach to fix this restriction.

The component is published on nuget.org, with contextitemsmssql and contextitemsoracle identifiers. You can read more about installation and use in the repository on GitHub.com: github.com/repinvv/ContextItems

Source: https://habr.com/ru/post/258723/


All Articles