📜 ⬆️ ⬇️

Direct SQL in EntityFramework. Now with strong typing

Hello!


Today we will talk a little about EntityFramework. A little bit. Yes, I know that it can be treated differently, many people spit from him, but in the absence of a better alternative, they continue to use it.


So here. Do you often use direct SQL queries to the database in your C # project with a customized ORM? Oh, yes, quit, do not shun. Use. Otherwise, how would you implement the deletion / update of the entities in batches and remained alive ...


What do we love most about live SQL? Speed ​​and simplicity. Where "in the best traditions of ORM" it is necessary to unload the wagon of objects into memory and make everyone context.Remove (well, or manipulate Attach), you can do with one small SQL query.
What do we most dislike about live SQL? Right. Lack of typification and explosiveness. Direct SQL is usually done through DbContext.Database.ExecuteSqlCommand , and it accepts only a string as input. Therefore, Find Usages in the studio will never show you which fields your direct SQL has touched on which entities, and, among other things, you have to rely on your memory in the question of the exact names of all the tables / columns you are feeling. And also to pray that no head shooter digs into your model and will not rename everything in the course of refactoring or by means of EntityFramework while you are sleeping.


So rejoice, adherents of small raw SQL queries! In this article, I will show you how to combine them with EF, not to lose in mining safety and not to produce detonators. Dive under the cat soon!


And what exactly do we want to achieve?


So, in this article, I will show you a great approach that will once and for all save you from worrying about the problems that direct SQL usually causes in tandem with EntityFramework. Your requests will acquire a human appearance, will be located through Find Usages and will become resistant to refactoring (deletion / renaming of fields in entities), and your feet will get warmer, the sores will dissolve, karma will clear .


We will need: C # 6.0 (well, the one in which the interpolation of strings is implemented), lambda expressions and a few straight hands. I called this technique "SQL Stroke". Ultimately, we will write several extension-methods for DbContext , which allow sending to the SQL database with strongly typed inserts. To do this, we need to talk with the metadata of EntityFramework, parse the lambda expressions and fix all the bugs and corner cases that arise along the way.


Here is what your direct SQL will look like after reading this article:


 using (var dc = new MyDbContext()) { //---------- dc.Stroke<Order>(x => $"DELETE FROM {x} WHERE {x.Subtotal} = 0"); // ^ IntelliSense! //---------- var old = DateTime.Today.AddDays(-30); dc.Stroke<Customer>(x => $"UPDATE {x} SET {x.IsActive} = 0 WHERE {x.RegisterDate} < {old}"); //---------- dc.Stroke<Item, Order>((i, o) => $@" UPDATE {i} SET {i.Name} = '[FREE] ' + {i.Name} FROM {i} INNER JOIN {o} ON {i.OrderId} = {o.Id} WHERE {o.Subtotal} = 0" , true); } 

TL; DR : in short, here it is on the githab, there is more vivid


Here we see that when we call .Stroke type-parameters, we specify the entities (mapped on the table) with which we will work. They become arguments in the subsequent lambda expression. In short, Stroke passes the lambda passed to it through the parser, turning {x} into tables, and {x.Property} into the corresponding column name.


Something like this. Now let's review the details.


Mapping classes and properties with tables and columns


Let's refresh your knowledge of Reflection: imagine that you have a class (more precisely Type ) and you have a string with the name property from this class. There is also a heir to the EF- DbContext . Having these two forks and sneakers, you need to get the name of the table on which your class is mapped and the name of the column in the database your proper map is on. Immediately make a reservation: the solution to this problem will be different in EF Core, but this does not affect the main idea of ​​the article. So I suggest the reader to independently implement / google the solution to this problem.


So, EF 6. The required can be obtained through the very popular magic of casting an EF context to an IObjectContextAdapter :


 public static void GetTableName(this DbContext context, Type t) { //     ObjectContext- var objectContext = ((IObjectContextAdapter)context).ObjectContext; //   var metadata = objectContext.MetadataWorkspace; //        CLR- var objectItemCollection = ((ObjectItemCollection)metadata.GetItemCollection(DataSpace.OSpace)); //      .  EF-    var entityType = metadata.GetItems<EntityType>(DataSpace.OSpace) .FirstOrDefault(x => objectItemCollection.GetClrType(x) == t); //        var container = metadata .GetItems<EntityContainer>(DataSpace.CSpace) .Single() .EntitySets .Single(s => s.ElementType.Name == entityType.Name); //       - var mapping = metadata.GetItems<EntityContainerMapping>(DataSpace.CSSpace) .Single() .EntitySetMappings .Single(s => s.EntitySet == container); // ,      () var tableEntitySet = mapping .EntityTypeMappings.Single() .Fragments.Single() .StoreEntitySet; //    var tableName = tableEntitySet.MetadataProperties["Table"].Value ?? tableEntitySet.Name; //   return tableName; } 

And please, do not ask me what the EntityFramework developers are. smoked had in mind, creating such labyrinths of abstractions and what it means in every cornerstone. Honestly, I myself can get lost in this labyrinth and I didn’t write a piece above, I just found it and gutted it.


So, with the table sort of sorted out. Now the name of the column. Fortunately, it lies nearby, in the entity container mapping:


 public static void GetTableName(this DbContext context, Type t, string propertyName) { //      ,  var mappings = ... //       var columnName = mapping .EntityTypeMappings.Single() .Fragments.Single() .PropertyMappings .OfType<ScalarPropertyMapping>() .Single(m => m.Property.Name == propertyName) .Column .Name; // ,   ? return columnName; } 

So, and here I immediately warn the reader in large letters: delving into EF-metadata is slow! Other than jokes. Therefore, in general, cache everything you can reach. The article has a link to my code - there I was already preoccupied with caching - you can use. But still keep in mind: the real conceptual models of EF are stozoi monsters that keep platoons and divisions of various objects. If you need only the relation type-name of the table and type / property — the name of the column, then it is better to remove and cache once (just don’t force a memory leak there - do not store anything from DbContext ). In EF Core, they say, this is better.


Expressions


The most boring behind. Now - lambda expressions. Suppose we want to have a Stroke method, so that it could be called like this:


 context.Stroke<MyEntity>(x => $"UPDATE {x} WHERE {x.Age} > 10") 

The Stroke method itself is simple:


 public static void Stroke<T>(this DbContext s, Expression<Func<T, string>> stroke) { object[] pars = null; var sql = Parse(context, stroke, out pars); context.Database.ExecuteSqlCommand(sql, pars); } 

It is based on the Parse method, which does all the basic work. As you might guess, this method should parse the lambda expression obtained from the interpolation of a string. It's no secret that the sharp interpolation of strings is syntactic sugar for String.Format . Therefore, when you write $"String containing {varA} and {varB}" , the compiler converts this construct into a call to String.Format("String containing {0} and {1}", varA, varB) . The first parameter of this method is the format string . In it, we see placeholders with the naked eye - {0} , {1} and so on. Format simply replaces these placeholders with what comes after the format string, in the order indicated by the numbers in placeholders. If there are more placeholders than 4, then the interpolated string is compiled into a call to the String.Format overload from two parameters: the format string itself and the array, into which all sufferers get into the resulting string parameters are packed.


So what are we going to do in the Parse method now? We will tick out the original format string with tongs, and recalculate the formatting arguments, replacing where necessary with the names of tables and columns. Then we will call Format , and then we will assemble the original format string and the processed arguments into the resulting SQL string. Honestly, it is much easier to code than to explain :)


So, let's begin:


 public static string Parse(DbContext context, LambdaExpression query, out object[] parameters){ //       const string err = ",  !"; var bdy = query.Body as MethodCallExpression; //     ? if (bdy == null) throw new Exception(err); //    -  String.Format? if (bdy.Method.DeclaringType != typeof(String) || bdy.Method.Name != "Format") { throw new Exception(err); } 

As you know, lambda expressions in C # are literally expressions. That is, everything that goes after => should be one and only one expression. In delegates, you can push operators and separate them with a semicolon. But when you write Expression<> - everything. From now on, you limit the input data to one and only one expression. This is what happens in our Stroke method. LambdaExpression is an ancestor of Expression<> , only without generics unnecessary to us. Therefore, it would be necessary to make sure that the only expression that is contained in our query is the call to string.Format and nothing else that we did. Now we will look with what arguments it caused. Well, with the first argument, everything is clear - this is a format string. We extract it to the joy of all honest people:


  //     var fmtExpr = bdy.Arguments[0] as ConstantExpression; if (fmtExpr == null) throw new Exception(err); // ...    var format = fmtExpr.Value.ToString(); 

Next you need to make a small feint with your ears: as mentioned above, if the interpolated string has more than 4 placeholders, it is translated into a call to string.Format with two parameters, the second of which is an array (in the form new [] { ... } ). Let's process this situation:


  //  ,        // 1 -     -   int startingIndex = 1; //    var arguments = bdy.Arguments; bool longFormat = false; //       if (bdy.Arguments.Count == 2) { var secondArg = bdy.Arguments[1]; // ...    - new[] {...} if (secondArg.NodeType == ExpressionType.NewArrayInit) { var array = secondArg as NewArrayExpression; //          arguments = array.Expressions; //   startingIndex = 0; //  ,        longFormat = true; } } 

Now let's go through the resulting arguments collection and, finally, convert each argument that is associated with our lambda parameters into a table / column name, and anything that is not a reference to tables and columns, we calculate and drop it into the parameter list of the query, leaving the parameters format {i} , where i is the index of the corresponding parameter. Nothing new for experienced users of ExecuteSqlCommand .


  //        //   string.Format List<string> formatArgs = new List<string>(); //   -   List<object> sqlParams = new List<object>(); 

The first thing to do is a small technical feature of C # -powers lambdas: in view of strict typing, when you write, for example, x => "a" + 10 , the compiler wraps your ten in Convert - type conversion (obviously, to a string). Essentially, everything is correct, but during parsing lambdas this circumstance is very complicated. Therefore, here we will make a small Unconvert method that will check our argument for wrapping in Convert and, if necessary, expand:


 private static Expression Unconvert(Expression ex) { if (ex.NodeType == ExpressionType.Convert) { var cex = ex as UnaryExpression; ex = cex.Operand; } return ex; } 

Wonderful. Next, we need to understand whether the next argument is related to the parameters of the expression. Well, that is, it has the form p.Field1.Field2... , where p is a parameter of our expression (what is put before the lambda operator => ). Because if it does not, then this argument must be simply calculated, and the result remembered as a parameter of the SQL query, for subsequent feeding of the EF. The easiest and most clumsy way to determine whether we are referring to a field of any of the parameters is the following two methods:


In the first, we simply iterate over the chain of calls to members until we reach the root (I called it GetRootMember ):


 private static Expression GetRootMember(MemberExpression expr) { var accessee = expr.Expression as MemberExpression; var current = expr.Expression; while (accessee != null) { accessee = accessee.Expression as MemberExpression; if (accessee != null) current = accessee.Expression; } return current; } 

In the second, we actually check the conditions required by us:


 private static bool IsScopedParameterAccess(Expression expr) { //     -    {x},  ,   if (expr.NodeType == ExpressionType.Parameter) return true; var ex = expr as MemberExpression; //        -   if (ex == null) return false; //     var root = GetRootMember(ex); // ,    if (root == null) return false; //     -  if (root.NodeType != ExpressionType.Parameter) return false; //       return true; } 

Is done. Go back to sorting the arguments:


  //  for (int i = startingIndex; i < arguments.Count; i++) { //   Convert var cArg = Unconvert(arguments[i]); //      / if (!IsScopedParameterAccess(cArg)) { //   - var lex = Expression.Lambda(cArg); //  var compiled = lex.Compile(); //  var result = compiled.DynamicInvoke(); //     {i},  i -   formatArgs.Add(string.Format("{{{0}}}", sqlParams.Count)); //     SQL- sqlParams.Add(result); //     continue; } 

Fine. We cut off all the parameters that are guaranteed not to be links to our tables / columns. The sqlParams list sqlParams be returned later via the out parameter - we will sqlParams it along with the result string to the context.Database.ExecuteSqlCommand second argument. In the meantime, we will process the links to the tables:


  //   {x},  if (cArg.NodeType == ExpressionType.Parameter) { //     ,    formatArgs.Add(string.Format("[{0}]", context.GetTableName(cArg.Type))) //      continue; } 

Here we will have to cut off the opportunity to turn to the aggregates, because how this will lead to the need to re-collectiveize the query JOIN -ami, which we technically can not do. So - alas and ah. If our argument is an access to a member, but not to a member of the expression parameter itself, then jingle, we can do nothing to help:


  var argProp = cArg as MemberExpression; if (argProp.Expression.NodeType != ExpressionType.Parameter) { var root = GetRootMember(argProp); throw new Exception(string.Format(",     {0}", root.Type)); } 

And finally, we can get our column name and add it to the revised format argument list.


  var colId = string.Format("[{0}]", context.GetColumnName(argProp.Member.DeclaringType, argProp.Member.Name)); formatArgs.Add(colId); //     - } 

Now that all the arguments have been enumerated, we can finally do the string.Format own and get a SQL string and an array of parameters ready for feeding ExecuteSqlCommand .


  var sqlString = string.Format(format, formatArgs.ToArray()); parameters = sqlParams.ToArray(); return sqlString; } 

Is done


Something like that. For the article, I intentionally simplified the code. In particular, the full version automatically substitutes table aliases, normally caches the names of tables and columns, and also contains overloads of .Stroke up to 8 parameters. You can read the full source code on my github . Behind this, I say goodbye and wish you every success in development.


Oh, and finally the polls:


')

Source: https://habr.com/ru/post/347820/


All Articles