Interesting moments of work Linq to Sql

More than a year has passed since my previous post on a similar topic. During this time, we somehow didn’t come close to the transition to the Entity Framework (according to the current legend, we’ll move on when a stable version of EF 7 appears), well, I have some experience that I would like to share. I think that this article will be interesting to those who, like us, still use this generally not bad, but forgotten Microsoft technology.

Dbtype

Specifying the DbType hint (with the exception of enum, see below) is not necessary for the properties of entities in Linq 2 Sql. And you should definitely not use the wrong DbType. For example, it is not necessary, if the column in the base is of the type nvarchar (50) , specify Linq 2 Sql that the column is of the type nchar (50) . And especially not to do so, if this field is a discriminator, as in the following example:

[Table(Name = "directcrm.OperationSteps")] [InheritanceMapping(Code = "", Type = typeof(OperationStep), IsDefault = true)] // ... [InheritanceMapping(Code = "ApplySegment", Type = typeof(ApplySegmentOperationStep))] public class OperationStep : INotifyPropertyChanging, INotifyPropertyChanged, IValidatable { //    ... [Column(Storage = "type", DbType = "nchar(50) NOT NULL", CanBeNull = false, IsDiscriminator = true)] public string Type { get { return type; } set { if ((type != value)) { SendPropertyChanging(); type = value; SendPropertyChanged(); } } } }

Well, let's try to read from the base an entity of type OperationStep and see if Inheritance Mapping can handle it.
')

Expected, no.

The Type property at first glance contains the correct value, but the type of the entity is incorrectly defined. What does Inheritance Mapping expect to see in the field in order to correctly match the type? Let's try OfType :

 modelContext.Repositories.Get<OperationStepRepository>().Items.OfType<ApplySegmentOperationStep>().FirstOrDefault();

And SQL generated by the linq provider:

 DECLARE @p0 NChar = 'ApplySegment '; SELECT TOP (1) [t0].[Type], [t0].[SegmentationSystemName], [t0].[SegmentSystemName], [t0].[Id], [t0].[Order], [t0].[OperationStepGroupId], [t0].[OperationStepTypeSystemName], [t0].[IsMarker] FROM [directcrm].[OperationSteps] AS [t0] WHERE [t0].[Type] = @p0;

The value of the parameter, in principle, was expected, but it was not very easy to detect such a bug. Be careful. It is clear that now the bug has manifested itself, but generally speaking, it can remain for a long time in the system, since entities will be correctly created and read from the database. As long as you do not notice the strange tail of the discriminator values or everything does not begin to fall after some script that updates the discriminators.

Now a few words about the storage of enums in linq to sql entities.

Linq to sql by default (if DbType is not specified) assumes that Enum has a column type of - Int . Accordingly, it will be impossible to work with the following entity (the Sex field in the directcrm.Customers table is of the nvarchar (15) type ):

  [Table(Name = "directcrm.Customers")] public sealed class Customer : INotifyPropertyChanging, INotifyPropertyChanged, IValidatable { //    [Column(Storage = "sex", CanBeNull = true)] public Sex? Sex { get { return sex; } set { if (sex != value) { SendPropertyChanging(); sex = value; SendPropertyChanged(); } } } }

If you try to subtract from the base, the Customer entity (in which the Sex field is filled with the string “female” ) will fall from System.InvalidCastException without any chance of understanding what exactly could not be achieved. When saving the consumer with the specified gender, we will receive the following request:

 DECLARE @p20 Int = 1 INSERT INTO [directcrm].[Customers](..., [Sex], ...) VALUES (..., @p7, ...)

Remarkably, to subtract such a tuple from the table just will not work - all the same silent System.InvalidCastException will fall. So if you store enum lines in the database using linq to sql, do not forget to specify DbType.
By the way, the Entity Framework is not able to store enums in rows, so in the project where we decided to use it, we had to use a hack: an additional getter for each enum field that the enum parsil itself (while the enum value is assumed stored in the property of type string).

Equality test

Linq to sql is able to map both the == operator and the object.Equals () call to SQL , however there are some differences in the mapping.

So, the query of the ActionTemplate entity is filtered by the SystemName field:

 var systemName = "SystemName"; var actionTemplate = modelContext.Repositories.Get<ActionTemplateRepository>() .GetActionTemplatesIncludingNonRoot() .FirstOrDefault(at => at.SystemName == systemName);

 DECLARE @p0 NVarChar(MAX) = 'SystemName'; SELECT TOP (1) [t0].[Discriminator], [t0].[Id], [t0].[CategoryId], [t0].[Name], [t0].[Type], [t0].[RowVersion], [t0].[SystemName], [t0].[CreationCondition], [t0].[UsageDescription], [t0].[StartDateTimeUtcOverride], [t0].[EndDateTimeUtcOverride], [t0].[DateCreatedUtc], [t0].[DateChangedUtc], [t0].[CreatedById], [t0].[LastChangedById], [t0].[CampaignId], [t0].[MailingTemplateName], [t0].[UseCustomParameters], [t0].[TargetActionTemplateId], [t0].[ParentActionTemplateId], [t0].[IsTransactional], [t0].[MailingStartTime], [t0].[MailingEndTime], [t0].[IgnoreStopLists], [t0].[ReversedActionTemplateId] FROM [directcrm].[ActionTemplates] AS [t0] WHERE [t0].[SystemName] = @p0

Nothing unusual. But what if systemName is null ?

 DECLARE @p0 NVarChar(MAX) = null; SELECT TOP (1) [t0].[Discriminator], [t0].[Id], [t0].[CategoryId], [t0].[Name], [t0].[Type], [t0].[RowVersion], [t0].[SystemName], [t0].[CreationCondition], [t0].[UsageDescription], [t0].[StartDateTimeUtcOverride], [t0].[EndDateTimeUtcOverride], [t0].[DateCreatedUtc], [t0].[DateChangedUtc], [t0].[CreatedById], [t0].[LastChangedById], [t0].[CampaignId], [t0].[MailingTemplateName], [t0].[UseCustomParameters], [t0].[TargetActionTemplateId], [t0].[ParentActionTemplateId], [t0].[IsTransactional], [t0].[MailingStartTime], [t0].[MailingEndTime], [t0].[IgnoreStopLists], [t0].[ReversedActionTemplateId] FROM [directcrm].[ActionTemplates] AS [t0] WHERE [t0].[SystemName] = @p0

Matter of course, so we will not achieve anything good. Let's try object.equals :

 string systemName = null; var actionTemplate = modelContext.Repositories.Get<ActionTemplateRepository>() .GetActionTemplatesIncludingNonRoot() .FirstOrDefault(at => object.Equals(at.SystemName, systemName));

 SELECT TOP (1) [t0].[Discriminator], [t0].[Id], [t0].[CategoryId], [t0].[Name], [t0].[Type], [t0].[RowVersion], [t0].[SystemName], [t0].[CreationCondition], [t0].[UsageDescription], [t0].[StartDateTimeUtcOverride], [t0].[EndDateTimeUtcOverride], [t0].[DateCreatedUtc], [t0].[DateChangedUtc], [t0].[CreatedById], [t0].[LastChangedById], [t0].[CampaignId], [t0].[MailingTemplateName], [t0].[UseCustomParameters], [t0].[TargetActionTemplateId], [t0].[ParentActionTemplateId], [t0].[IsTransactional], [t0].[MailingStartTime], [t0].[MailingEndTime], [t0].[IgnoreStopLists], [t0].[ReversedActionTemplateId] FROM [directcrm].[ActionTemplates] AS [t0] WHERE 0 = 1

Brilliant

WHERE 0 = 1

tells us that Linq to sql knows that ActionTemplate.SystemName cannot be null , so the query is useless. This sacred knowledge of Linq to sql is derived from the value of ColumnAttribute.CanBeNull. Unfortunately, from DbType he can not understand this.
If the request is made on a column that admits the absence of a value, then the translation will be expected:

 SELECT TOP (1) [t0].[Discriminator], [t0].[Id], [t0].[CategoryId], [t0].[Name], [t0].[Type], [t0].[RowVersion], [t0].[SystemName], [t0].[CreationCondition], [t0].[UsageDescription], [t0].[StartDateTimeUtcOverride], [t0].[EndDateTimeUtcOverride], [t0].[DateCreatedUtc], [t0].[DateChangedUtc], [t0].[CreatedById], [t0].[LastChangedById], [t0].[CampaignId], [t0].[MailingTemplateName], [t0].[UseCustomParameters], [t0].[TargetActionTemplateId], [t0].[ParentActionTemplateId], [t0].[IsTransactional], [t0].[MailingStartTime], [t0].[MailingEndTime], [t0].[IgnoreStopLists], [t0].[ReversedActionTemplateId] FROM [directcrm].[ActionTemplates] AS [t0] WHERE [t0].[SystemName] IS NULL

Therefore, it is probably necessary to try to use not the equality operator, but object.Equals , since it is translated more “qualitatively”.

LeftOuterJoin

As you know, Linq does not provide an extension method at all for connecting collections with the possibility of missing values in one of them. But sometimes when working with linq to sql, we need to get into the left outer join join in sql, for example, and in such situations we use combinations of linq methods that ultimately translate into a left outer join. I know two ways to get the left outer join:
First option:

 CustomerActions .GroupJoin(CustomerBalanceChanges, ca => ca, cbch => cbch.CustomerAction, (ca, cbchs) => cbchs .DefaultIfEmpty() .Select(cbch => new { ca, cbch })) .SelectMany(g => g) .Dump();

The second option:

 CustomerActions .SelectMany(ca => CustomerBalanceChanges .Where(cbch => cbch.CustomerAction == ca) .DefaultIfEmpty(), (ca, cbch) => new { ca, cbch}) .Dump();

Both variants are translated into a completely identical SQL - left outer join with a subquery and a test column (to determine if an entity exists from the right set):

 SELECT [t0].[Id], [t0].[IsTimeKnown], [t0].[BrandName], [t0].[ActionTemplateId], [t0].[CustomerId], [t0].[StaffId], [t0].[PointOfContactId], [t0].[OriginalCustomerId], [t0].[IsOriginalCustomerIdExact], [t0].[TransactionalId], [t0].[DateTimeUtc], [t0].[CreationDateTimeUtc], [t2].[test], [t2].[Id] AS [Id2], [t2].[ChangeAmount], [t2].[Comments], [t2].[CustomerActionId], [t2].[AdminSiteComments], [t2].[BalanceId] FROM [directcrm].[CustomerActions] AS [t0] LEFT OUTER JOIN ( SELECT 1 AS [test], [t1].[Id], [t1].[ChangeAmount], [t1].[Comments], [t1].[CustomerActionId], [t1].[AdminSiteComments], [t1].[BalanceId] FROM [promo].[CustomerBalanceChanges] AS [t1] ) AS [t2] ON [t0].[Id] = [t2].[CustomerActionId]

For reference: CustomerActions - consumer actions in the system, CustomerBalanceChanges - his changes in balance, by request we get changes in consumer balance with appropriate actions (or just an action if it was not a change in balance).

Let's complicate the request: now we want to receive not only changes in the balance of consumers, but also their prizes:

 CustomerActions .SelectMany(ca => CustomerBalanceChanges .Where(cbch => cbch.CustomerAction == ca) .DefaultIfEmpty(), (ca, cbch) => new { ca, cbch}) .SelectMany(cacbch => CustomerPrizes .Where(cp => cacbch.ca == cp.CustomerAction) .DefaultIfEmpty(), (cacbch, cp) => new { cacbch.ca, cacbch.cbch, cp}) .Dump();

 SELECT [t0].[Id], [t0].[IsTimeKnown], [t0].[BrandName], [t0].[ActionTemplateId], [t0].[CustomerId], [t0].[StaffId], [t0].[PointOfContactId], [t0].[OriginalCustomerId], [t0].[IsOriginalCustomerIdExact], [t0].[TransactionalId], [t0].[DateTimeUtc], [t0].[CreationDateTimeUtc], [t2].[test], [t2].[Id] AS [Id2], [t2].[ChangeAmount], [t2].[Comments], [t2].[CustomerActionId], [t2].[AdminSiteComments], [t2].[BalanceId], [t4].[test] AS [test2], [t4].[Id] AS [Id3], [t4].[PrizeId], [t4].[SaleFactId], [t4].[PromoMechanicsName], [t4].[WonCustomerPrizeId], [t4].[PrizeType], [t4].[Published], [t4].[PromoMechanicsScheduleItemId], [t4].[CustomerActionId] AS [CustomerActionId2] FROM [directcrm].[CustomerActions] AS [t0] LEFT OUTER JOIN ( SELECT 1 AS [test], [t1].[Id], [t1].[ChangeAmount], [t1].[Comments], [t1].[CustomerActionId], [t1].[AdminSiteComments], [t1].[BalanceId] FROM [promo].[CustomerBalanceChanges] AS [t1] ) AS [t2] ON [t2].[CustomerActionId] = [t0].[Id] LEFT OUTER JOIN ( SELECT 1 AS [test], [t3].[Id], [t3].[PrizeId], [t3].[SaleFactId], [t3].[PromoMechanicsName], [t3].[WonCustomerPrizeId], [t3].[PrizeType], [t3].[Published], [t3].[PromoMechanicsScheduleItemId], [t3].[CustomerActionId] FROM [promo].[CustomerPrizes] AS [t3] ) AS [t4] ON [t0].[Id] = [t4].[CustomerActionId]

Nothing out of the ordinary, just another left outer join was added, as expected. But generally speaking, we could build a query in a different way. For example, since we know that for each prize there is exactly a change in balance, we could write this:

 CustomerActions .SelectMany(ca => CustomerPrizes .Join(CustomerBalanceChanges, cp => cp.CustomerAction, cbch => cbch.CustomerAction, (cp, cbch) => new { cbch, cp }) .Where(cbchcp => cbchcp.cbch.CustomerAction == ca) .DefaultIfEmpty(), (ca, cbchcp) => new { cbchcp.cbch, cbchcp.cp, ca}) .Dump();

This will result in the following SQL:

 SELECT [t2].[Id], [t2].[ChangeAmount], [t2].[Comments], [t2].[CustomerActionId], [t2].[AdminSiteComments], [t2].[BalanceId], [t1].[Id] AS [Id2], [t1].[PrizeId], [t1].[SaleFactId], [t1].[PromoMechanicsName], [t1].[WonCustomerPrizeId], [t1].[PrizeType], [t1].[Published], [t1].[PromoMechanicsScheduleItemId], [t1].[CustomerActionId] AS [CustomerActionId2], [t0].[Id] AS [Id3], [t0].[IsTimeKnown], [t0].[BrandName], [t0].[ActionTemplateId], [t0].[CustomerId], [t0].[StaffId], [t0].[PointOfContactId], [t0].[OriginalCustomerId], [t0].[IsOriginalCustomerIdExact], [t0].[TransactionalId], [t0].[DateTimeUtc], [t0].[CreationDateTimeUtc] FROM [directcrm].[CustomerActions] AS [t0] LEFT OUTER JOIN ([promo].[CustomerPrizes] AS [t1] INNER JOIN [promo].[CustomerBalanceChanges] AS [t2] ON [t1].[CustomerActionId] = [t2].[CustomerActionId]) ON [t2].[CustomerActionId] = [t0].[Id]

Notice in this SQL that SELECT 1 as [test] disappeared to check for the presence of an entity. And this leads to the fact that such a request does not work, and ends with an InvalidOperationException: "A NULL value cannot be assigned to a member that is a System.Int32 type that does not allow NULL values.". Since linq no longer tracks its test-flag, it tries to honestly compose CustomerBalanceChange and CustomerPrize entities from columns whose values are NULL, but it cannot write NULL, for example, to CustomerBalanceChange.Id, which the text of the exception tells us.
What workarounds for this problem exist? Well, first, you can rephrase the query as it was written in the first case. But this is absolutely not a universal solution, because who said that it can always be done. Linq at the first complex query can fall in the same way, and you don’t feel like wasting time on rearranging the joines. And the second query is semantically different from the first.
Secondly, we could make a request not to entities, but to certain dto, for example like this:

 CustomerActions .SelectMany(ca => CustomerPrizes .Join(CustomerBalanceChanges, cp => cp.CustomerAction, cbch => cbch.CustomerAction, (cp, cbch) => new { cbch, cp }) .Where(cbchcp => cbchcp.cbch.CustomerAction == ca) .DefaultIfEmpty(), (ca, cbchcp) => new { cbchcp.cbch, cbchcp.cp, ca}) .Select(cacbchcp => new { CustomerActionId = cacbchcp.ca.Id, CustomerBalanceChangeId = (int?)cacbchcp.cbch.Id, CustomerPrizeId = (int?)cacbchcp.cp.Id, } )

Since CustomerBalanceChangeId and CustomerPrizeId are now nullable, there are no problems. But we may not be satisfied with such an approach, because we may need exactly entities (which we want to change, delete, or call functions on them). So there is a straightforward third join method, in which a check for null will be performed on the sql side:

 CustomerActions .SelectMany(ca => CustomerPrizes .Join(CustomerBalanceChanges, cp => cp.CustomerAction, cbch => cbch.CustomerAction, (cp, cbch) => new { cbch, cp }) .Where(cbchcp => cbchcp.cbch.CustomerAction == ca) .DefaultIfEmpty(), (ca, cbchcp) => new { cbch = cbchcp == null ? null : cbchcp.cbch, cp = cbchcp == null ? null : cbchcp.cp, ca }) .Dump();

This is not broadcast in such a terrible sql, as it may seem at first glance:

 SELECT (CASE WHEN [t3].[test] IS NULL THEN 1 ELSE 0 END) AS [value], [t3].[Id], [t3].[ChangeAmount], [t3].[Comments], [t3].[CustomerActionId], [t3].[AdminSiteComments], [t3].[BalanceId], [t3].[Id2], [t3].[PrizeId], [t3].[SaleFactId], [t3].[PromoMechanicsName], [t3].[WonCustomerPrizeId], [t3].[PrizeType], [t3].[Published], [t3].[PromoMechanicsScheduleItemId], [t3].[CustomerActionId2], [t0].[Id] AS [Id3], [t0].[IsTimeKnown], [t0].[BrandName], [t0].[ActionTemplateId], [t0].[CustomerId], [t0].[StaffId], [t0].[PointOfContactId], [t0].[OriginalCustomerId], [t0].[IsOriginalCustomerIdExact], [t0].[TransactionalId], [t0].[DateTimeUtc], [t0].[CreationDateTimeUtc] FROM [directcrm].[CustomerActions] AS [t0] LEFT OUTER JOIN ( SELECT 1 AS [test], [t2].[Id], [t2].[ChangeAmount], [t2].[Comments], [t2].[CustomerActionId], [t2].[AdminSiteComments], [t2].[BalanceId], [t1].[Id] AS [Id2], [t1].[PrizeId], [t1].[SaleFactId], [t1].[PromoMechanicsName], [t1].[WonCustomerPrizeId], [t1].[PrizeType], [t1].[Published], [t1].[PromoMechanicsScheduleItemId], [t1].[CustomerActionId] AS [CustomerActionId2] FROM [promo].[CustomerPrizes] AS [t1] INNER JOIN [promo].[CustomerBalanceChanges] AS [t2] ON [t1].[CustomerActionId] = [t2].[CustomerActionId] ) AS [t3] ON [t3].[CustomerActionId] = [t0].[Id]

But, as you can see, there is a nuance. The request was not very complicated, but linq to sql didn’t care, instead of just using [t3]. [Test] in the final sample, drew the CASE ... WHEN construct. There is nothing to worry about until the request has become too large. But if you try to combine tables 10 in this way, then the final SQL queries can reach several hundred kilobytes! Several hundred kilobytes of CASE ... WHEN statements.

In addition, constantly using any of the above-described constructions for a simple left-outer join is somewhat costly, it would be much easier to write the LeftOuterJoin extension-method yourself and use it in queries. Here is how such an extension looks here:

 public static IQueryable<TResult> LeftOuterJoin<TOuter, TInner, TKey, TResult>( this IQueryable<TOuter> outerValues, IQueryable<TInner> innerValues, Expression<Func<TOuter, TKey>> outerKeySelector, Expression<Func<TInner, TKey>> innerKeySelector, Expression<Func<TOuter, TInner, TResult>> fullResultSelector, Expression<Func<TOuter, TResult>> partialResultSelector) { Expression<Func<TOuter, IEnumerable<TInner>, IEnumerable<TResult>>> resultSelector = (outerValue, groupedInnerValues) => groupedInnerValues.DefaultIfEmpty().Select( innerValue => Equals(innerValue, default(TInner)) ? partialResultSelector.Evaluate(outerValue) : fullResultSelector.Evaluate(outerValue, innerValue)); return outerValues .GroupJoin(innerValues, outerKeySelector, innerKeySelector, resultSelector.ExpandExpressions()) .SelectMany(result => result); }

This extension is always broadcast, but it uses null checking on the sql side. The following uses are intended:

 var cbchcas = customerActions .LeftOuterJoin( context.Repositories .Get<CustomerBalanceChangeRepository>() .Items .Join(context.Repositories .Get<CustomerPrizeRepository>() .Items, cbch => cbch.CustomerAction, cp => cp.CustomerAction, (cbch, cp) => new { cbch, cp }), ca => ca, cbchcp => cbchcp.cbch.CustomerAction, (ca, cbchcp) => new { ca, cbchcp.cbch, cbchcp.cp }, ca => new { ca, cbch = (CustomerBalanceChange)null, cp = (CustomerPrize)null }) .ToArray();

 SELECT (CASE WHEN [t3].[test] IS NULL THEN 1 ELSE 0 END) AS [value], [t3].[Id], [t3].[CustomerActionId], [t3].[ChangeAmount], [t3].[Comments], [t3].[AdminSiteComments], [t3].[BalanceId], [t3].[PrizeType], [t3].[Id2], [t3].[PrizeId], [t3].[PromoMechanicsName] AS [PromoMechanicsSystemName], [t3].[Published], [t3].[PromoMechanicsScheduleItemId], [t3].[SaleFactId], [t3].[CustomerActionId2], [t3].[WonCustomerPrizeId], [t0].[Id] AS [Id3], [t0].[DateTimeUtc], [t0].[IsTimeKnown], [t0].[PointOfContactId], [t0].[BrandName] AS [BrandSystemName], [t0].[CreationDateTimeUtc], [t0].[ActionTemplateId], [t0].[CustomerId], [t0].[StaffId], [t0].[OriginalCustomerId], [t0].[IsOriginalCustomerIdExact], [t0].[TransactionalId] FROM [directcrm].[CustomerActions] AS [t0] LEFT OUTER JOIN ( SELECT 1 AS [test], [t1].[Id], [t1].[CustomerActionId], [t1].[ChangeAmount], [t1].[Comments], [t1].[AdminSiteComments], [t1].[BalanceId], [t2].[PrizeType], [t2].[Id] AS [Id2], [t2].[PrizeId], [t2].[PromoMechanicsName], [t2].[Published], [t2].[PromoMechanicsScheduleItemId], [t2].[SaleFactId], [t2].[CustomerActionId] AS [CustomerActionId2], [t2].[WonCustomerPrizeId] FROM [promo].[CustomerBalanceChanges] AS [t1] INNER JOIN [promo].[CustomerPrizes] AS [t2] ON [t1].[CustomerActionId] = [t2].[CustomerActionId] ) AS [t3] ON [t0].[Id] = [t3].[CustomerActionId]

You may have noticed that in the extension-method itself, the Evaluate and ExpandExpressions methods are used. These are the extension methods from our Mindbox.Expressions library. The ExpandExpressions method recursively traverses the entire expression tree on which it was called, recursively replacing Evaluate calls with the expression on which Evaluate was called. The ExpandExpressions method can be called both on Expression objects and on IQueryable, which is sometimes more convenient (for example, if the query is built in several places). The library also has a number of interesting functions for reflective work with the code. Perhaps the library will be useful for someone.

UPD. A small remark from a colleague:

> Therefore, apparently you need to try to use not the equality operator, but object.Equals, since it is translated more “qualitatively”.

Bad advice. This, of course, will solve the null comparison problem. But there are side effects:
You can write a comparison of the values of incompatible types and the compiler will not swear
if you compare two values, none of which can be precomputed by .NET (for example, two properties of an entity), you get bad and slow SQL (or they are equal, or both are null)
may not work for enum. There are mapping bugs - it does not always understand that the value needs to be converted to a string, not to an int. For the comparison operation, he understands, and for passing as a parameter to object.Equals, it seems we have encountered problems

Source: https://habr.com/ru/post/250759/

All Articles

Interesting moments of work Linq to Sql

Dbtype

Equality test

LeftOuterJoin

More articles: