A library that helps bridge the conceptual gap between OOP and DB during testing using ORM - LinqTestable

As you know, there is a conceptual gap between the object-oriented and relational model, which even ORM cannot overcome. Basically, this gap affects the fact that when using a relational database, we are forced to work on sets, rather than on specific objects. But there is another factor: the behavior of NULL in a database is different from the behavior of NULL in object-oriented languages. This can be a problem when you use the same query in two situations: 1) when requesting a database; 2) during unit testing, when an in-memory array is used instead of a database table. Moreover, this can be a problem if you only apply to the database, but think about NULL in terms of OOP, not a relational database!

Example 1

There are three tables related by foreign key: car, door, door handle. All foreign keys are not nullable, i.e. at each door and at the handle it has to be specified on what exactly they fasten (the specific car or a door).

The source code for creating tables

(Oracle was used as DB, ORM - EntityFramework, language - C #.)
')

create table CAR ( CAR_ID NUMBER(10) not null ); alter table CAR add constraint CAR_PK primary key (CAR_ID); create table DOOR ( DOOR_ID NUMBER(10) not null, CAR_ID NUMBER(10) not null ); alter table DOOR add constraint DOOR_PK primary key (DOOR_ID); alter table DOOR add constraint DOOR_CAR_FK foreign key (CAR_ID) references CAR (CAR_ID); create index DOOR_CAR_ID_I on DOOR (CAR_ID) tablespace INDX_S; create table DOOR_HANDLE ( DOOR_HANDLE_ID NUMBER(10) not null, DOOR_ID NUMBER(10) not null, COLOR NVARCHAR2(15) null ); alter table DOOR_HANDLE add constraint DOOR_HANDLE_PK primary key (DOOR_HANDLE_ID); alter table DOOR_HANDLE add constraint DOOR_HANDLE_DOOR_FK foreign key (DOOR_ID) references DOOR (DOOR_ID); create index DOOR_HANDLE_DOOR_ID_I on DOOR_HANDLE (DOOR_ID) tablespace INDX_S;

Create one machine in the database, the rest of the tables remain empty. Then just make the left join between the machine and the doors using the ORM:

 var cars = (from car in dataModel.CAR join door in dataModel.DOOR on car.CAR_ID equals door.CAR_ID into joinedDoor from door in joinedDoor.DefaultIfEmpty() //left join select new { car.CAR_ID, door.DOOR_ID }).ToList();

What do you think will return this request?

That's right, ORM will throw an exception in you and send it to the forest. Why? Bd will return the string
CAR_ID = 1, DOOR_ID = NULL, and ORM cannot process it, because both in the database and in the mapping it is indicated that door.DOOR_ID cannot be NULL. NULL also appeared solely because of the left join. Maybe the “curve” of the ORM is to blame? No, the ORM behavior is quite correct: to replace null with 0 or return an empty string means to deceive the user. Changing the mapping is also not an option: the code will say that the field can not be filled, and the business logic will require the opposite. The solution is to change the request so that ORM can understand that the field can be null:

 var cars = (from car in dataModel.CAR join door in dataModel.DOOR on car.CAR_ID equals door.CAR_ID into joinedDoor from door in joinedDoor.DefaultIfEmpty() select new { car.CAR_ID, DOOR_ID = door != null ? door.DOOR_ID : (int?) null }).ToList();

You can change the request manually, or you can write code that will change all such requests automatically in runtime (but more on that later).

Example 2

There is a request with two left join.

 var carsWithoutRedHandle = (from car in dataModel.CAR join door in dataModel.DOOR on car.CAR_ID equals door.CAR_ID into joinedDoor from door in joinedDoor.DefaultIfEmpty() join doorHandle in dataModel.DOOR_HANDLE on door.DOOR_ID equals doorHandle.DOOR_ID into joinedDoorHandle from doorHandle in joinedDoorHandle.DefaultIfEmpty() where doorHandle.Color != “RED” || doorHandle == null select car).ToList();

This request will be perfectly processed when you access the database. But it is worth using it in the unit test, and you will get a NullReferenceException when you try to access the door.DOOR_ID in the second join if any of the machines do not need the doors because of the open top. Well, it's time to change the query:

 var carsWithoutRedHandle = (from car in dataModel.CAR join door in dataModel.DOOR on car.CAR_ID equals door.CAR_ID into joinedDoor from door in joinedDoor.DefaultIfEmpty() join doorHandle in dataModel.DOOR_HANDLE on (door != null ? door.DOOR_ID : (int?)null) equals doorHandle.DOOR_ID into joinedDoorHandle from doorHandle in joinedDoorHandle.DefaultIfEmpty() where doorHandle.Color != “RED” || doorHandle == null select car).ToList();

However, there is one "but." By changing the linq query, you can get a sql query with a much slower execution plan. Let's look at such a case with an example.

 using System.Linq; using System.Linq.Expressions; using LinqKit; IEnumerable<CAR> GetCars(IDataModel dataModel, Expression<Func<DOOR, bool>> doorSpecification = null, Expression<Func<DOOR_HANDLE, bool>> doorHandleSpecification = null) { if (doorSpecification == null) doorSpecification = door => true; if (doorHandleSpecification == null) doorHandleSpecification = handle => true; var cars = (from car in dataModel.CAR.AsExpandable() join door in dataModel.DOOR on car.CAR_ID equals door.CAR_ID into joinedDoor from door in joinedDoor.DefaultIfEmpty() join doorHandle in dataModel.DOOR_HANDLE on /*(door != null ? door.DOOR_ID : (int?)null)*/door.DOOR_ID equals doorHandle.DOOR_ID into joinedDoorHandle from doorHandle in joinedDoorHandle.DefaultIfEmpty() where doorSpecification.Invoke(door) && doorHandleSpecification.Invoke(doorHandle) select car); return cars; } var carsWithRedHandle = GetCars(dataModel, doorHandleSpecification: doorHandle => doorHandle.COLOR == "RED").ToList();

Here is the sql query and its execution plan, when the join happens like this: door.DOOR_ID equals doorHandle.DOOR_ID

And here is the execution plan when (door! = Null? Door.DOOR_ID: (int?) Null) equals doorHandle.DOOR_ID

As you can see, the execution plan is completely different, and its Cost is one and a half times larger.
To solve this problem, you can use #if DEBUG and run tests in debug, but, believe me, the readability and reliability of the code will not increase at all from this. It is much better to deal with a problem in the bud - to make it so that when writing unit tests you don’t have to worry about this feature of left join. To this end, I have written a library laid out on https://github.com/FiresShadow/LinqTestable .
In order to use the library, you need to download and connect the project, and change the MockObjectSet in your project, namely, to replace this piece:

  public System.Linq.Expressions.Expression Expression { get { return _collection.AsQueryable<T>().Expression; } } public IQueryProvider Provider { get { return _collection.AsQueryable<T>().Provider; } }

on:

 public System.Linq.Expressions.Expression Expression { get { return _collection.AsQueryable<T>().ToTestable().Expression; } } public IQueryProvider Provider { get { return _collection.AsQueryable<T>().ToTestable().Provider; } }

After that, the above problem in unit tests will disappear by itself.

By the way, read how to write unit tests for the Entity Framework here .

The library is a bit damp and solves only one problem: NullReferenceException with two left join. Solving this problem alone does not eliminate the conceptual gap, there are many other problems, for example: comparing null with null for equality gives different results in relational and object-oriented models. But this problem is also solved.

Source: https://habr.com/ru/post/269917/

All Articles

A library that helps bridge the conceptual gap between OOP and DB during testing using ORM - LinqTestable

Example 1

Example 2

More articles: