⬆️ ⬇️

Don't shoot yourself in the foot using LINQ

In the article, I described several examples of unobvious moments when using LINQ to SQL. If you're a .NET guru, you might find it boring, the rest is welcome!

Let's start with this example. Suppose we have an entity “type of action”. The type of action has a human-readable name and a system name — some unique identifier by which we can work with objects of this entity from code. This is the structure in the form of objects in the code:



class ActionType { public int id; public string systemname; public string name; } 


 var ActionTypes = new ActionType[] { new ActionType { id = 1, systemname = "Registration", name = "" }, new ActionType { id = 2, systemname = "LogOn", name = "  " }, new ActionType { id = 3, systemname = null, name = "     " } }; 


For the same structure with similar data, a table was created in the database and auxiliary objects for using LINQ to SQL. Suppose we need to find out if we have an action type with the system name NotExistingActionType . The question is what will be displayed on the screen after following these instructions:



 var resultForObjects = ActionTypes.All(actionType => actionType.systemname != "NotExistingActionType"); var context = new LinqForHabr.DataClasses1DataContext(); var resultForLTS = context.ActionTypes.All(actionType => actionType.SystemName != "NotExistingActionType"); Console.WriteLine("Result for objects: " + resultForObjects + "\nResult for Linq to sql: " + resultForLTS); Console.ReadLine(); 


The answer in this case is strange at first sight. The result of the application will be:

Result for objects: True

Result for LINQ to sql: False
Why does “the same method” return different values ​​for the same data? The thing is that this is not the same method at all, but different methods with the same names. The first iterates over objects in memory, the second is converted to a SQL query that will be executed on the server and will return another result. Results differ due to the presence of one of our actions with an undefined system name. And at this moment, specific differences between the two runtimes appear : for .NET, the expression null! = ObjRef is true (unless of course objRef is not null ), and therefore the value of the expression “system names of all types of actions are not equal NotExistingActionType ” is true in our situation.

But LINQ to SQL expressions are converted to SQL and executed on the server, and in SQL the comparison with NULL works differently. Expression values NULL == Something, NULL! = Something and even NULL == NULL will always be false, this is the standard, therefore the expression "system names for all types of actions are not equal NotExistingActionType " and will not be true, as NULL! = 'Not ExistingActionType' - Lying.

')

Now consider another example. Suppose we have users with their current balance:



 class User { public int id; public int balance; public string name; } 


 var users = new User[] { new User { id = 1, name = "", balance = 0 }, new User { id = 2, name = "", balance = 0 } }; 


The question is, what should return the amount on an empty set of elements. For me, for example, the obvious value is 0, but here, too, is not so simple. Let's do something like this:



 var resultForObjects = users.Where(user => user.id < 0).Sum(user => user.balance); var context = new LinqForHabr.DataClasses1DataContext(); var resultForLTS = context.Users.Where(user => user.Id < 0).Sum(user => user.Balance); Console.WriteLine("Result for objects: " + resultForObjects + "\nResult for Linq to sql: " + resultForLTS); Console.WriteLine(context.ActionTypes.First().Name); Console.ReadLine(); 


The result will again be unusual. Generally speaking, we will not be able to execute these instructions in this form, since an exception will be raised when executing:
System.InvalidOperationException: "A NULL value cannot be assigned to a member that is a System.Int32 type that does not allow NULL values."


The reasons for what is happening again in the translation of our calls to SQL. For the usual IEnumerable extension, Sum returns 0 for an empty set, which is easy to verify without calculating the resultForLTS (or, finally, after reading this here msdn.microsoft.com/ru-ru/library/bb549046 ). However, the DBMS calculates the sum of the empty set as NULL (this is correct or not - the question is quite holivar, but now it is just a fact), and LINQ, trying to return null instead of an integer, immediately fails. To fix this place is extremely simple, but you need to keep your ears open:



 var resultForLTS = context.Users.Where(user => user.Id < 0).Sum(user => (int?)user.Balance) ?? 0; 


Here, the return value of the Sum function becomes not an int , but a nullable int (this can be achieved by explicitly indicating the type of generic), which allows LINQ to return null , but the operator ?? will turn this null into 0.



Well, the last example. Surprisingly, the translation in SQL gives us some syntactic sugar. Consider this example. Add a Location object, and users will now have a link to their city:



 class User { public int id; public int balance; public string name; public Location location; } class Location { public int id; public string Name; } 


We will not create any Location objects and modify users, of interest is the following code:



 var resultForObjects = users.Select(user => user.location == null ? "  " : user.location.Name == null ? "  " : user.location.Name) .First(); var context = new LinqForHabr.DataClasses1DataContext(); var resultForLTS = context.Users.Select(user => user.Location == null ? "  " : user.Location.name == null ? "  " : user.Location.name) .First(); 


In both cases, the result will be the string “Location not specified”, since it is not really indicated, but what will happen if you write like this:



 var resultForLTS = context.Users.Select(user => user.Location.name ?? "  "); 


You might think that this will not work, since there is an explicit NullReferenceException (no user has a Location object a priori, we did not create them, did not write them to the database), but do not forget that this code will not be run in the environment .NET, but will be translated to SQL and running the DBMS. In fact, the query that comes out of this code will look like this (LINQPad to help):



 SELECT COALESCE([t1].[name],@p0) AS [value] FROM [Users] AS [t0] LEFT OUTER JOIN [Locations] AS [t1] ON [t1].[Id] = [t0].[LocationId] 


This “trick” allows us not to write a wild number of ternary operators in LINQ queries.



Conclusion:



When we write code, we constantly rely on lower-level functions and we believe that these functions work correctly. It is great that there is such a way to reduce complexity, but you should always be aware of whether we understand well enough what a particular function that we use will do. And for this - RTFM!

Source: https://habr.com/ru/post/203836/



All Articles