Query translation to SQL using LinqToSql in tests

For several years now we have been doing our marketing automation product, and CI, or rather, a large number of automated tests, helps us to cut features with high speed.

There are approximately 700,000 lines of code with all customizations in the product, and we all have about 7,000 tests for this, and their number is constantly growing. Due to them, we are not afraid to perform large refactorings affecting many parts of the system. But, unfortunately, tests are not a panacea. Some tests may not be, some tests may be too expensive, and some situations are not reproduced in the test environment.

Virtually every transaction in our system is associated with working with MS SQL using LinqToSql. Yes, the technology is old, but it is quite difficult for us to migrate from it, and in business it suits us perfectly. Moreover, as I wrote earlier , we even have our own LinqToSql fork , where we fix its bugs and add some functionality.
')
In order to make queries to the database using LinqToSql, you need to use the IQueryable interface. At the time of receiving Enumerator or Execute from QueryProvider, the constructed expression tree is translated using SQL-methods to IQueryable to SQL, which is executed on SQL Server.

Since our business logic is strongly tied to entities in the database, our tests work a lot with the database. However, in 95% of tests we do not use a real database, since it is very time-consuming, and we are content with InMemoryDatabase. It is part of our test infrastructure, about which you can write a separate article, and in fact is just a Dictionary <Type, List> for each existing entity type. In tests, our UnitOfWork transparently works with such a database, giving access to EnumerableQueryable, which is simply obtained from any IEnumerable, calling AsQueryable () from it.

I will show an example of a test for understanding what is happening:

[TestMethod] public void () { var customer = new CustomerTestDataBuilder(TestDatabase).Build(); using (var modelContext = CreateModelContext()) { var filter = new SegmentFilter<Customer>(null, modelContext) { Segmentation = Controller.PeriodicalSegmentation, Segment = FilterValueWithPresence<Segment>.Concrete(Controller.PeriodicalSegment1) }; var result = modelContext.Repositories.Get<CustomerRepository>().GetFiltered(filter).ToList(); Assert.IsFalse(result.Contains(customer)); } }

In the test, we create a modelContext - our UnitOfWork, a wrapper over a DataContext with all sorts of buns, and then use it to get to the repository and filter out some segments. Of course, the repository does not know about any tests, just ModelContext works with InMemoryDatabase. The GetFiltered (filter) method forms some IQueryable, and then we materialize it.

There is a problem with this approach: we are not testing in any way that the IQueryable that we got from GetFiltered translates to SQL. As a result, we can get a bug on the production of something like this:

[NotSupportedException: Method 'Boolean DoesCurrentUserHaveSmsPermissionOnProject (Int32)' has no support.]
at System.Data.Linq.SqlClient.PostBindDotNetConverter.Visitor.VisitMethodCall (SqlMethodCall mc)
at System.Data.Linq.SqlClient.SqlVisitor.Visit (SqlNode node)
at System.Data.Linq.SqlClient.SqlVisitor.VisitExpression (SqlExpression exp)
at System.Data.Linq.SqlClient.SqlVisitor.VisitSelectCore (SqlSelect select)
at System.Data.Linq.SqlClient.PostBindDotNetConverter.Visitor.VisitSelect (SqlSelect select)
...

How to make sure that such bugs do not fall on the production? You can write tests with a real base, and we have them. They are slightly different from those that work with InMemoryDatabase, the test class just has a different parent. Here is an example:

 [TestMethod] public void () { Controller.CurrentDateTimeUtc = new DateTime(2016, 11, 1, 0, 0, 0, DateTimeKind.Utc); var sessionStartDateTime = Controller.CurrentDateTimeUtc.Value.AddHours(-1); using (var modelContext = CreateModelContext()) { var customer = new CustomerTestDataBuilder(modelContext).Build(); var activeSession = new CustomerSessionTestDataBuilder(modelContext) .WithLastCustomer(customer) .Active() .WithStartDateTimeUtc(sessionStartDateTime) .Build(); modelContext.SubmitTestData(); var newSession = new CustomerSession { PointOfContact = activeSession.PointOfContact, DeviceGuid = activeSession.DeviceGuid, IpAddress = activeSession.IpAddress, ScreenResolution = activeSession.IpAddress, IsAuthenticated = false }; newSession.SetStartDateTimeUtc(modelContext, Controller.CurrentDateTimeUtc.Value, customer); newSession.SetUserAgent(activeSession.UserAgent.UserAgentString, modelContext); newSession.SetLastCustomer(modelContext, customer, copyWebSiteVisitActions: false); modelContext.Repositories.Get<CustomerSessionRepository>().Add(newSession); modelContext.SubmitChanges(); Assert.IsNull(activeSession.IsActiveOrNull); Assert.IsNotNull(newSession.IsActiveOrNull); } }

In this test, everything happens in a real database, followed by a rollback of the Snapshot transaction, and cannot get through such errors. But, of course, there are not very many such tests, only about a hundred. The number does not come to any comparison with 7,000. And they cost much more time than usual ones.

The decision suggested itself: write your own implementation of IQueryable and, accordingly, IQueryProvider, decorating EnumberableQueryble and System.Data.Linq.DataQuery. Such an implementation should, when trying to get the result of a query using an enumerator, or by calling methods that immediately execute the query, such as Any, Count, Single, etc., first check whether the query can be translated to SQL, and if you can, just execute it over ordinary collections.

Now I will tell you exactly how this is implemented, and I will begin with a test that such a translation works in general:

 [TestMethod] public void () { var testEntity1 = new SomeTestEntityTestDataBuilder(TestDatabase).Build(); var testEntity2 = new SomeTestEntityTestDataBuilder(TestDatabase).Build(); using (var modelContext = CreateModelContext()) { var query = modelContext.Repositories.Get<SomeTestEntityRepository>().Items.Select(e => e.Id); var sqlQuery = query.ToString(); var expectedQuery = $"SELECT [t0].[{nameof(SomeTestEntity.Id)}]\r\n" + $"FROM [{SomeTestEntity.TableName}] AS [t0]"; Assert.AreEqual(expectedQuery, sqlQuery); var entities = query.ToList(); Assert.AreEqual(2, entities.Count); Assert.IsTrue(entities.Contains(testEntity1.Id)); Assert.IsTrue(entities.Contains(testEntity2.Id)); } }

This and several other tests were written to verify that SQL translation actually occurs and works correctly. Here are some more examples:

A couple of tests under the spoiler

 [TestMethod] public void EntityRefInheritanceMapping() { SomeAbstractTestEntity testEntity1 = new SomeTestEntityChildTestDataBuilder(TestDatabase).WithId(1).Build(); var testEntity2 = new SomeTestEntityTestDataBuilder(TestDatabase).WithId(2).Build(); var anotherTestEntity1 = new AnotherTestEntityTestDataBuilder(TestDatabase).WithLinkedEntity(testEntity1).Build(); var anotherTestEntity2 = new AnotherTestEntityTestDataBuilder(TestDatabase).WithId(3).Build(); using (var modelContext = CreateModelContext()) { var query = modelContext.Repositories.Get<AnotherTestEntityRepository>() .Items .Where(a => a.SomeTestEntity == testEntity1) .Select(a => a.Id); var entities = query.ToList(); Assert.AreEqual(1, entities.Count); Assert.IsTrue(entities.Contains(anotherTestEntity1.Id)); var sqlQuery = query.ToString(); var expectedQuery = $"SELECT [t0].[{nameof(AnotherTestEntity.Id)}]\r\n" + $"FROM [{AnotherTestEntity.TableName}] AS [t0]\r\n" + $"WHERE [t0].[{nameof(AnotherTestEntity.SomeTestEntityId)}] = @p0"; Assert.AreEqual(expectedQuery, sqlQuery); } }

 [TestMethod] public void SQL() { using (var modelContext = CreateModelContext()) { var query = modelContext.Repositories.Get<SomeTestEntityRepository>().Items.Where(e => e.ToString() == "asdf"); AssertException.Throws<InvalidOperationException>( () => query.ToList(), "ToStringOnlySupportedForPrimitiveTypes"); } }

As you can see in the last example, when you try to enumerate over IQueryable, which does not have translation in SQL, an exception occurs in the test.

We now turn directly to the implementation. We are interested in requests that occur within the model, that is, in fact, we are interested in any references to repositories. The repository for each entity has a certain set of business methods and gives access to IQueryable through the Items property, which is just a DataTable. Let's look at an example of using the Items property.

Base class for all repositories:

 public abstract class Repository<TEntity> : Repository { private ITable<TEntity> table; public IQueryable<TEntity> Items { get { return table; } } }

Example of using Items inside the repository

 public class CustomerRepository : ChangeRestrictedRepository<Customer, int, CustomerInitialState> { public List<Customer> GetCustomersByEmail(string email) { if (String.IsNullOrEmpty(email)) throw new ArgumentException("Email  .", nameof(email)); return Items.Where(user => user.Email == email).ToList(); } }

Example of using outside repository:

 FmcgPurchase = Add(ReverseSingleLinkedItemFilter<CustomerAction, FmcgPurchase>.GetFactory( "fmcgpurchase", modelContext => customerAction => modelContext .Repositories .Get<FmcgPurchaseRepository>() .Items .Where(fmcgPurchase => fmcgPurchase.CustomerAction == customerAction), canLinkedItemBeAbsent: true));

It turns out that you need to ensure that the Repository.Items returns our cunning IQueryable. Well, write our sly IQueryable :)

As was already seen above, Repository.Items actually returns ITable, and the table itself is initialized when creating UnitOfWork:

 public override void SetRepositoryRegistry(RepositoryRegistry repositories) { table = repositories.DatabaseContext.GetTable<TEntity>(); }

The DatabaseContext.GetTable () method is abstract. The DatabaseContext has 2 heirs: LinqDatabaseContext and InMemoryDatabaseContext. In LinqDatabaseContext, which is used when working with a real database, everything is simple: GetTable returns System.Data.Linq.Table. In InMemoryDatabase, the code is written like this:

 protected internal override ITable<T> GetTable<T>() { if (!tables.ContainsKey(typeof(T))) tables.Add( typeof(T), new StubTableImpl<T>( this, (InMemoryTable<T>)database.GetTable<T>(), linqToSqlTranslateHelperContext)); return (ITable<T>)tables[typeof(T)]; }

There is a bit of magic with the cache and linqToSqlTranslateHelperContext, which is not very clear yet, but it is already clear that the IQueryable we need, which we need to replace, is StubTableImpl, as well as the database.GetTable () call is used.
Let's start with database.GetTable (). Here the point is that StubTable is created when we access some repositories when the UnitOfWork has already been created. But in the test there can be a lot of UnitOfWork, and all of them should work with one base. Database is that database, and StubTable is just a way to gain access to this database.

Now let's take a closer look at the StubTableImpl class:

 public class StubTableImpl<T> : ITable<T>, IStubTable where T : class { internal StubTableImpl( InMemoryDatabaseContext databaseContext, InMemoryTable<T> inMemoryTable, DataContext linqToSqlTranslateHelperContext) { InnerTable = inMemoryTable; innerQueryable = new StubTableQueryable<T>( databaseContext, linqToSqlTranslateHelperContext.GetTable<T>()); } public Type ElementType { get { return innerQueryable.ElementType; } } public Expression Expression { get { return innerQueryable.Expression; } } public IQueryProvider Provider { get { return innerQueryable.Provider; } } Type IStubTable.EntityType { get { return typeof(T); } } public override string ToString() { return innerQueryable.Select(e => e).ToString(); } IEnumerable IStubTable.Items { get { return InnerTable; } } }

StubTableImpl implements IQueryable and IQueryProvider, delegating the entire implementation of StubTableQueryable innerQueryable. The StubTableQueryable itself looks like this:

 internal class StubTableQueryable<TEntity> : IOrderedQueryable<TEntity> { private readonly InMemoryDatabaseContext inMemoryContext; private readonly IQueryable<TEntity> dataContextQueryable; private readonly StubTableQueryProvider stubTableQueryProvider; public StubTableQueryable( InMemoryDatabaseContext inMemoryContext, IQueryable<TEntity> dataContextQueryable) { this.inMemoryContext = inMemoryContext; this.dataContextQueryable = dataContextQueryable; stubTableQueryProvider = new StubTableQueryProvider(inMemoryContext, dataContextQueryable); } public IEnumerator<TEntity> GetEnumerator() { inMemoryContext.CheckConvertionToSql(Expression); IEnumerable<TEntity> enumerable = new EnumerableQuery<TEntity>(inMemoryContext.ConvertDataContextExpressionToInMemory(Expression)); return enumerable.GetEnumerator(); } IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); } public Expression Expression { get { return dataContextQueryable.Expression; } } public Type ElementType { get { return dataContextQueryable.ElementType; } } public IQueryProvider Provider { get { return stubTableQueryProvider; } } public override string ToString() { return inMemoryContext.GetQueryText(Expression); } }

I will cite the StubTableQueryProvider code right away, because they are very interconnected (now it even seems that it would be reasonable for it to be one class):

 internal class StubTableQueryProvider : IQueryProvider { private static readonly IQueryProvider enumerableQueryProvider = Array.Empty<object>().AsQueryable().Provider; private readonly InMemoryDatabaseContext inMemoryContext; private readonly IQueryable dataContextQueryable; public StubTableQueryProvider( InMemoryDatabaseContext inMemoryContext, IQueryable dataContextQueryable) { this.inMemoryContext = inMemoryContext; this.dataContextQueryable = dataContextQueryable; } public IQueryable<TElement> CreateQuery<TElement>(Expression expression) { return new StubTableQueryable<TElement>( inMemoryContext, dataContextQueryable.Provider.CreateQuery<TElement>(expression)); } public object Execute(Expression expression) { inMemoryContext.CheckConvertionToSql(expression); return enumerableQueryProvider.Execute(inMemoryContext.ConvertDataContextExpressionToInMemory(expression)); } public TResult Execute<TResult>(Expression expression) { inMemoryContext.CheckConvertionToSql(expression); return enumerableQueryProvider.Execute<TResult>(inMemoryContext.ConvertDataContextExpressionToInMemory(expression)); } }

Here it is necessary to explain how the construction of expression trees in general works using the extension methods on IQueryable in System.Linq.
These methods themselves are defined in the static class Queryable. Here is a piece of this class to understand what is happening:

 public static class Queryable { public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate) { return source.Provider.CreateQuery<TSource>( Expression.Call( null, GetMethodInfo(Queryable.Where, source, predicate), new Expression[] { source.Expression, Expression.Quote(predicate) } )); } public static int Count<TSource>(this IQueryable<TSource> source) { return source.Provider.Execute<int>( Expression.Call( null, GetMethodInfo(Queryable.Count, source), new Expression[] { source.Expression } )); } }

I gave here examples of the implementation of two methods: Where and Count. My choice fell on them, because they show different ways of interaction between the IQueryable and IQueryProvider interfaces.
Let's look first at the implementation of the Where method. This method takes IQueryable and filter condition, and returns IQueryable. With this, you can easily notice that this method does not filter anything. All he does is create an expression tree: pull an expression tree out of an incoming IQueryable, add a call to the Where method to it, that is, itself, with the filter condition parameter. After that, the resulting new expression tree is passed to IQueryProvider.CreateQuery, which is needed just to wrap the Expression in IQueryable.

Let's try to make out by example. Suppose we have this code:
Customers.Where (c => c.Sex == Sex.Male)
In this case, Customers - Table. Then the IQueryable will be passed to the Where method, in which there will be an Expression - Table. After that, Where will add itself to the end of this Expression with the condition we passed. It turns out Table.Where (c => c.Sex == Sex.Male). Next, this Expression is wrapped back into IQueryable and returned from the method. There are no calls to the database, just a call to a pure function.

Now look at the Count method. It calculates the number of items in the requested collection immediately upon accessing it. This is done by calling the IQueryProvider.Execute method. This method takes an Expression, based on which it must build a query, and returns the result of this query — a quantity. Building an Expression here is similar to the Where method: the original IQueryable is taken, it turns out an Expression and is completed with a Count. Thus, IQueryProvider.Execute should bypass this Expression, understand what is required of it and make the corresponding query to the database.

Now, armed with new knowledge, let's return to StubTableQueryable and StubTableQueryProvider. Now we roughly understand what we want from them: when calling the StubTableQueryable.GetEnumerator and StubTableQueryProvider.Execute methods, we have to take our Expression or IQueryable, try to translate it into SQL using some DataContext, and then retrieve the data just from memory. To do this, StubTableQueryProvider.Execute and StubTableQueryable.GetEnumerator written similar code, which initially causes CheckConvertionToSql, and then use ConvertDataContextExpressionToInMemory converts source Expression and either executes it with the help EnumerableQueryble, or cause Enumerator in EnumerableQueryble with transformed Expression'om.

To begin with, how to check that the query is actually translated into SQL. The CheckConversionToSql method tries to get the query text by its Expression, for which it uses DataContext.GetCommand. A small problem is that GetCommand accepts IQueryable, and we have Expression, but it doesn't matter, in fact, all he needs is Expression :)

As a result, the code that checks that the query is translated to SQL looks like this:

 public string GetQueryText(Expression expression) { return queryExpressionToQueryText.GetOrAdd( expression.ToString(), expressionText => { var fakeQueryable = new FakeQueryable(expression); var result = linqToSqlTranslateHelperContext.GetCommand(fakeQueryable); return result.CommandText; }); }

The FakeQueryable class is just needed as an adapter, here’s its implementation:

 public class FakeQueryable : IQueryable { public FakeQueryable(Expression expression) { Expression = expression; } public IEnumerator GetEnumerator() { throw new NotSupportedException(); } public Expression Expression { get; } public Type ElementType { get { throw new NotSupportedException(); } } public IQueryProvider Provider { get { throw new NotSupportedException(); } } }

It would be more correct to correct Mindbox.Data.Linq so that the GetCommand (Expression) overload exists, but so far this has not been done.

The linqToSqlTranslateHelperContext used above is the DataContext instance that is used not only to call GetCommand on it, but also to get a Table from it associated with the database. The initial query is built against these Tables. If we try to actually execute such a request, we will get an exception that there is no connection for this DataContext, because Connection is not needed to translate requests, but is needed to fulfill them.
However, we still need to receive data from this Expression. To do this, you have to convert a little, for which you use ConvertDataContextExpressionToInMemory.

Usually, to do something with Expression, you need to inherit from ExpressionVisitor, where for each type of expression there is a method that you can override, and write your own logic there. To replace the LinqToSql tables with the InMemoryDatabase tables in the Expression, we did. Here is this Visitor:

 public class ConstantObjectReplaceExpressionVisitor<T> : ExpressionVisitor where T : class { private readonly Dictionary<T, T> replacementDictionary; public ConstantObjectReplaceExpressionVisitor(Dictionary<T, T> replacementDictionary) { this.replacementDictionary = replacementDictionary; } protected override Expression VisitConstant(ConstantExpression node) { var value = node.Value as T; if (value == null) return base.VisitConstant(node); if (!replacementDictionary.ContainsKey(value)) return base.VisitConstant(node); return Expression.Constant(replacementDictionary[value]); } public Expression ReplaceConstants(Expression sourceExpression) { return Visit(sourceExpression); } }

The meaning of this Visitor is to replace one constant with another. What to replace is transmitted in the constructor. All logic is written in VisitConstant and fairly straightforward.
Let's look at creating an instance of this Visitor:

 private ConstantObjectReplaceExpressionVisitor<IQueryable> CreateTableReplaceVisitor(DataContext dataContext) { var dataContextTableToInMemoryTableMap = new Dictionary<IQueryable, IQueryable>(); var entityTypes = ModelApplicationHostController.Instance .ModelConfiguration .DatabaseModel .GetRepositoriesByEntity() .Keys; foreach (var entityType in entityTypes) { var dataContextTable = dataContextGetTableFunc(dataContext, entityType); if (dataContextTable == null) throw new InvalidOperationException($"  {entityType}      DataContext'"); var inMemoryContextTable = GetInMemoryTable(database, entityType); if (inMemoryContextTable == null) throw new InvalidOperationException($"  {entityType}    InMemory "); dataContextTableToInMemoryTableMap.Add(dataContextTable, inMemoryContextTable); } return new ConstantObjectReplaceExpressionVisitor<IQueryable>(dataContextTableToInMemoryTableMap); }

Here we go through all types of entities that are registered, and for each type we get the Table from the DataContext - this will be the Key in the final Dictionary, as well as the InMemoryTable - this will be the Value. As a result, the resulting Visitor will replace all ContantExpression, whose Value is present in the keys of the dictionary transferred to it and correspond to a Table of some of our entities, in InMemoryTable.

It may seem that with such a passage there will be problems with expression trees, where we use not the constant value Table, but an expression whose value is Table. The following test was written for this case:

 [TestMethod] public void () { var testEntity1 = new SomeTestEntityTestDataBuilder(TestDatabase).WithId(1).Build(); var testEntity2 = new SomeTestEntityTestDataBuilder(TestDatabase).WithId(2).Build(); var anotherTestEntity1 = new AnotherTestEntityTestDataBuilder(TestDatabase).WithId(testEntity1.Id).Build(); var anotherTestEntity2 = new AnotherTestEntityTestDataBuilder(TestDatabase).WithId(3).Build(); using (var modelContext = CreateModelContext()) { var query = modelContext.Repositories.Get<SomeTestEntityRepository>().Items .SelectMany(e => modelContext.Repositories.Get<AnotherTestEntityRepository>() .Items .Where(a => a.Id == e.Id)) .Select(a => a.Id); var entities = query.ToList(); Assert.AreEqual(1, entities.Count); Assert.IsTrue(entities.Contains(anotherTestEntity1.Id)); var sqlQuery = query.ToString(); var expectedQuery = $"SELECT [t1].[{nameof(AnotherTestEntity.Id)}]\r\n" + $"FROM [{SomeTestEntity.TableName}] AS [t0], [{AnotherTestEntity.TableName}] AS [t1]\r\n" + $"WHERE [t1].[{nameof(AnotherTestEntity.Id)}] = [t0].[{nameof(SomeTestEntity.Id)}]"; Assert.AreEqual(expectedQuery, sqlQuery); } }

Here modelContext.Repositories.Get (). Items is part of the expression tree and will not be replaced by our Visitor. Why, then, does such a test pass? How the request is correctly transmitted and how does enumeration occur?
Translating a query in such a situation should not be surprising, since LinqToSql during the broadcast of a query bypasses the expression tree, executing expressions in it that are actual constants. All calls to C # methods will be called before the current translation, if they are not used in a context that requires execution on the SQL server. That is why you can write modelContext.Repositories.Get (). Items.Where (a => a.TestNumber == GetSomeTestNumber ()) in the request, but you cannot write modelContext.Repositories.Get (). Items.Where (a => a .TestNumber == GetSomeTestNumber (a)). Because in the first case, the result of GetSomeTestNumber () will be calculated at the broadcasting stage and substituted into the query, and in the second, GetSomeTestNumber takes the entity that the request is in, that is, depends on the entity, which means it must also be translated. In the test, modelContext.Repositories.Get (). Items will be executed at the broadcasting stage, and Items from any repository will return StubTableImpl, the Expression of which is the Table. For particularly inquisitive I give a link to the code that does what I described above.
As for the direct execution of the request, it is still easier here. After replacing the first and only Table in the Expression of the original query, it begins to run as a normal Enumerable. And SelectMany just performs its part of the expression as a delegate. As part of this implementation, we will try to translate the subquery into SQL, which we will, of course, replace the Table with InMemoryTable in it and perform exactly the same.

What problems does this solution have? The main problem is that there are still errors in the mapping that will not be detected in this way. The fact that the request from IQueryable is translated to SQL does not mean that the materialization phase, when LinqToSql reads a stream with data and creates objects from them, will pass successfully. For example, in this phase errors may occur related to attempts to write a Null-valued property of entities that cannot contain a Null. , , , : 2 .
, , .

That's all. :)

Source: https://habr.com/ru/post/322604/

All Articles

Query translation to SQL using LinqToSql in tests

More articles: