
One of the mistakes developers make (and I used to be among them) is insisting that you must use exactly one ORM strategy for the application you are building. In general, this is not true. You can (and should) tie the choice of strategy to a specific scenario, and make sure you pick the right tool for each particular case.
Let me note right away that 99.9% of the time you should not use ADO.NET directly. If you are still writing dataReader.Read() loops by hand, stop!
A lot of people dislike ORMs as such. Having listened to their arguments, I agree with what Martin Fowler wrote in OrmHate: much of the frustration with ORMs comes from inflated expectations.
We have to come to terms with the idea that ORMs are clumsy, ugly, and bloated. ORMs are designed to solve a problem, and there are many different approaches to it. But before we look at those approaches, let's examine what problem we are actually trying to solve.
Bridging the gap
If you have to load data from SQL or insert data into it, you must map your .NET data types to SQL. For .NET, this means using ADO.NET to send SQL commands to the SQL server, and then mapping the SQL types back to .NET types. And there are nuances here: for example, dates in SQL differ from dates in .NET.
ADO.NET helps us with this, but leaves us the job of handling raw result sets and building .NET objects from them. In the end we get what we want: we work with .NET objects and types, and some piece of code translates that into SQL queries and back.
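To make the gap concrete, here is roughly what that hand-rolled plumbing looks like in raw ADO.NET. The Customers table and the Customer class are hypothetical; the point is how much mapping code falls on us:

```csharp
using System;
using System.Data.SqlClient;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime CreatedOn { get; set; } // SQL datetime -> .NET DateTime
}

public static class CustomerLoader
{
    public static Customer Load(string connectionString, int id)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT Id, Name, CreatedOn FROM Customers WHERE Id = @id", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                if (!reader.Read())
                    return null;

                // Every column has to be mapped to a .NET type by hand.
                return new Customer
                {
                    Id = reader.GetInt32(0),
                    Name = reader.GetString(1),
                    CreatedOn = reader.GetDateTime(2)
                };
            }
        }
    }
}
```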
ORMs are designed to solve this problem by adding layers of abstraction on top of ADO.NET. But there are many strategies for doing so. Let's take a look at each of them and see which scenarios they suit best.
Entity-based relational mapping
In this kind of mapping, database tables correspond almost 1:1 to the entities in your system. When you add a property to an object, you add a column to the table. Usage is built around loading an entity (or aggregate) by its identifier, manipulating it and possibly its related objects, and then saving it back to the database through the ORM.
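As a minimal sketch of that 1:1 correspondence (the Orders table and its columns are hypothetical), each property lines up with a column; the properties are virtual so that an ORM like NHibernate can generate lazy-loading proxies:

```csharp
using System;

public class Order
{
    public virtual int Id { get; set; }              // Orders.Id (PK)
    public virtual string CustomerName { get; set; } // Orders.CustomerName
    public virtual DateTime OrderedOn { get; set; }  // Orders.OrderedOn
    public virtual OrderStatus Status { get; set; }  // Orders.Status
}

public enum OrderStatus { New, Shipped, Cancelled }
```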
An ORM in this case provides a lot of functionality, for example:
- Change tracking
- Lazy loading
- Eager fetching
- Cascading operations
- Identity map (each entity is loaded only once)
- Unit of work
If we work with only one entity or aggregate at a time, ORMs like NHibernate fit very well. They use the supplied configuration to track loaded entities and automatically save changes when the transaction is committed. And this is nice, because we don't have to drag along our own change-tracking layer. NHibernate does all the dirty work for us.
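Here is a minimal sketch of that workflow with NHibernate, assuming a configured ISessionFactory and the hypothetical Order entity from above:

```csharp
using NHibernate;

public static class OrderShipper
{
    public static void Ship(ISessionFactory sessionFactory, int orderId)
    {
        using (ISession session = sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            // Loading attaches the entity to the session's change tracker.
            var order = session.Get<Order>(orderId);
            order.Status = OrderStatus.Shipped;

            // No explicit Save/Update call: on commit, NHibernate flushes
            // the tracked change to the database as an UPDATE statement.
            tx.Commit();
        }
    }
}
```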
As long as we load an object by its Id for the sole purpose of changing it, this all works fine. It eliminates a large amount of code I would otherwise need to write to track which objects were added or modified, save them, and so on.
The downside of this approach is that the ORM does not know whether you just want to read objects or whether you are loading an entity in order to change it. We often see people stumble because they don't realize that change tracking is enabled by default, or don't understand how it works.
If you want to load an entity to change it and save the changes (or to create a new entity), this approach provides the most flexibility: it pushes data access down into your infrastructure layer and lets your entity types remain relatively ignorant of how they are persisted. That ignorance does not mean my C# model and database schema are free to diverge. Rather, it means the data access layer does not leak into my object model, which I would rather fill with business rules instead.
Result-set-based relational mapping
In most applications, reads significantly outnumber writes. We saw a 100:1 ratio between SELECT and INSERT/UPDATE/DELETE statements in a recent application. When we look at what SQL is really good at, it is working with data in sets. When selecting a set of data from SQL Server, it often makes no sense to try to map that data directly onto entities.
But we still prefer not to work directly with IDataReader or DataTable. These are weakly typed objects that travel poorly into the upper layers of the application. Instead, we build objects shaped to fit the data. Such objects are often called DTOs (Data Transfer Objects), or read models. We create these DTOs for individual SQL queries, and rarely reuse them across other queries.
Many ORMs have functionality optimized for exactly these scenarios. In NHibernate you can use projections to turn off change tracking and map results straight into DTOs, or issue raw SQL queries with no mapping configuration required. Or you can use a micro-ORM such as PetaPoco.
These reads can also stream DTO objects as they are produced. Both NHibernate and several micro-ORMs let you receive DTOs one at a time while the rows of the result set are read, minimizing the number of objects held in memory.
In our applications, we still often use NHibernate for reads, but without entity objects; instead we use raw SQL. We rely on NHibernate's optimized mappers: we simply supply the DTO type, and the results are mapped automatically.
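A minimal sketch of such a read, assuming a hypothetical Orders table and an OrderSummary DTO; NHibernate's AliasToBean transformer matches each aliased column to a property of the DTO:

```csharp
using System;
using System.Collections.Generic;
using NHibernate;
using NHibernate.Transform;

// A read model shaped to fit one specific query, not a mapped entity.
public class OrderSummary
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

public static class OrderReports
{
    public static IList<OrderSummary> RecentOrders(ISession session)
    {
        // Raw SQL plus AliasToBean: no entity mapping, no change tracking.
        return session
            .CreateSQLQuery(@"SELECT o.Id AS Id,
                                     o.CustomerName AS CustomerName,
                                     o.Total AS Total
                              FROM Orders o
                              WHERE o.OrderedOn > :cutoff")
            .SetDateTime("cutoff", DateTime.Today.AddDays(-30))
            .SetResultTransformer(Transformers.AliasToBean<OrderSummary>())
            .List<OrderSummary>();
    }
}
```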
This approach does not work well if we need to apply business rules and persist the information back, because these models are usually mapped to individual result sets rather than to database tables.
Active Record is another example of entity-based mapping, in which the data access functionality is built into the object model itself.
DML-based relational mapping
If you know exactly what SQL you need for your CRUD operations and would prefer to write it by hand, then what you are really looking for is something that efficiently abstracts DML commands at a level just above ADO.NET.
And this is the arena of micro-ORMs. Frameworks such as PetaPoco, Dapper, Massive, and others are designed to smooth away the drudgery of raw ADO.NET. They generally still let us work with ADO.NET objects, but the interaction is greatly simplified: all we need is a connection, and these frameworks handle all the CRUD operations with far simpler code than ADO.NET itself.
When you don't have entities that need to be mapped to tables and back, micro-ORMs offer a much lighter-weight approach. Because micro-ORMs require no up-front configuration, they rely on deferred execution and optimized caching techniques to map SQL parameters and query results on the fly. Many applications can start with DML-based mapping and move up to a full ORM once a relationship or entity demands it.
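As an illustration, here is a minimal Dapper sketch (the Products table and Product class are hypothetical): Execute runs hand-written DML, and Query<T> maps result rows onto a type on the fly:

```csharp
using System.Collections.Generic;
using System.Data;
using Dapper;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class ProductStore
{
    public static void Insert(IDbConnection connection, string name, decimal price)
    {
        // Hand-written DML, with parameters bound from an anonymous object.
        connection.Execute(
            "INSERT INTO Products (Name, Price) VALUES (@Name, @Price)",
            new { Name = name, Price = price });
    }

    public static IEnumerable<Product> CheaperThan(IDbConnection connection, decimal price)
    {
        // Each result row is mapped onto Product with no up-front configuration.
        return connection.Query<Product>(
            "SELECT Id, Name, Price FROM Products WHERE Price < @Price",
            new { Price = price });
    }
}
```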
Bulk loading tools
Bulk loading tools occupy a special place: sometimes you do not want to insert or load data one object at a time. Instead, you would rather work with entire sets at once. Tools such as SqlBulkCopy allow you to extract and load data in CSV or tabular form.
These tools work rather like a bazooka, shoving all the data back and forth in one go, but offering nothing beyond that. You cannot update or delete data; but when you need to move large volumes of data in and out of SQL Server, these utilities are exactly what you need.
In many integration scenarios, where you supply data files to external partners or receive them in return, these loaders let you treat the files as tables and load them directly into the database.
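A minimal sketch with SqlBulkCopy, assuming the partner file has already been parsed into a DataTable and a hypothetical staging table dbo.PartnerImport exists:

```csharp
using System.Data;
using System.Data.SqlClient;

public static class PartnerFileImporter
{
    public static void Import(string connectionString, DataTable rows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.PartnerImport";
                bulkCopy.BatchSize = 10000; // stream rows in large batches
                bulkCopy.WriteToServer(rows); // one bulk INSERT, no per-row DML
            }
        }
    }
}
```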
These utilities are much faster than traditional parse-and-load approaches. In some of our tests we saw differences of orders of magnitude compared with row-by-row loading; in one case, the difference was between several hours and a single minute. The flip side is that the functionality is limited to INSERT and SELECT. Everything else requires other approaches.
The right tool for the task
On a recent project, I used every one of the approaches above against a single database and a single codebase: NHibernate for entity/aggregate mapping, result-set mapping for reads (building messages, exports, and views from the results), DML-based mapping for simple tables, and bulk-load tools for importing partner files with many millions of rows.
The key point is that you do not have to chain yourself to a single tool or approach. No ORM strategy works in every scenario, nor should it. NHibernate can handle many scenarios beyond plain entity mapping, but not all of them. Complexity often arises from trying to use the same approach everywhere.
Every application that lives outside SQL Server and talks to it uses some form of ORM. Whether it is hand-written ADO.NET code or NHibernate, you have to bridge the gap between .NET and SQL. Bridging that gap is a hard problem, and nothing solves it perfectly. Nor should it!
Choose the approach that solves the specific problem at hand. Do not worry about having several ORM strategies in one project. That does not mean haphazard solutions are acceptable; on the contrary, applying proven solutions with full knowledge of the available options is always a good idea.