Using Reflection.Emit to Precompile Expressions to MSIL

Hi, Habr! Here is a translation of the article "Using Reflection.Emit to Precompile Expressions to MSIL" by Steve Marsh.

Introduction

The classes in this project let you parse text expressions entered by the user and compile them into a .NET assembly. The compilation can be done on the fly, or the assembly can be saved to a DLL. Precompiling expressions makes them highly portable and makes evaluating user-entered logic very efficient. In addition, we can use Microsoft's ildasm.exe tool to open the generated assembly and inspect the underlying MSIL code.

The .NET platform ships with many interesting features, and in my opinion the Reflection.Emit namespace offers far more than is apparent at first glance. It allows you to create your own .NET code at runtime by dynamically defining .NET types and emitting MSIL instructions into their method bodies. MSIL is the Microsoft intermediate language for the .NET platform: it is what your C# and VB.NET code is compiled to, and what is handed to the JIT compiler when you run a .NET program. MSIL is a very low-level language that is very fast, and working with it gives you exceptional control over your programs. I will not go into the details of MSIL in this article, but there are several other resources available on the Internet, and if you are interested in learning more, I have included some links at the end of this article.

Background

Let's take a quick look at what our parser/compiler will do. The user enters a string expression that conforms to the grammar of our parser. This expression will be turned into a tiny .NET program that runs and outputs the result. To do this, the parser reads the sequential list of characters and breaks it into a hierarchical tree, as shown below. Nodes are evaluated in the indicated order. When a node is matched, the instruction corresponding to that type of node is emitted. For example, when a number is matched, we push that number onto the stack. When the "*" token is matched, we emit the multiply instruction, and so on. Emitting all the instructions in the correct order gives us the "program" shown on the right.

[Figure: the expression's parse tree, with the resulting instruction list on the right]
Now let's see how our program executes and compare it with the original text expression. The first two instructions push the numbers 3 and 2 onto the stack. The multiply instruction pops these two values off the stack, multiplies them, and pushes the result, 6, back onto the stack. Instruction No. 4 pushes the number 1 onto the stack. Instruction No. 5 pops the two values (6 and 1), adds them, and pushes the result (7) back onto the stack. Finally, the return instruction pops the value 7 off the stack and returns it as the result. Brilliant! This may seem simple and obvious to most programmers, but this clever idea underlies much of programming and compilation, and I think it is worth a look. Here is what this program looks like in MSIL. For example, ldc.r8 is the "load constant" instruction, which here pushes the double 3.0 onto the stack.

    IL_0000: ldc.r8 3.
    IL_0009: ldc.r8 2.
    IL_0012: mul
    IL_0013: ldc.r8 1.
    IL_001c: add
    IL_0023: ret
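To make the stack discipline concrete, here is a minimal sketch of an interpreter for the same six-step program. It is only an illustration: the `Op` enum, the `Instr` struct and the `Run` method are hypothetical names invented for this example and are not part of the article's project, which emits real MSIL instead of interpreting anything.

```csharp
using System;
using System.Collections.Generic;

public static class StackMachineDemo
{
    // Hypothetical instruction set for illustration only.
    public enum Op { Push, Mul, Add, Ret }

    public struct Instr
    {
        public Op Code;
        public double Arg;
        public Instr(Op code, double arg = 0) { Code = code; Arg = arg; }
    }

    // Executes the instructions in order against a value stack.
    public static double Run(IEnumerable<Instr> program)
    {
        var stack = new Stack<double>();
        foreach (var instr in program)
        {
            switch (instr.Code)
            {
                case Op.Push:
                    stack.Push(instr.Arg);
                    break;
                case Op.Mul:
                    { double b = stack.Pop(), a = stack.Pop(); stack.Push(a * b); break; }
                case Op.Add:
                    { double b = stack.Pop(), a = stack.Pop(); stack.Push(a + b); break; }
                case Op.Ret:
                    return stack.Pop();
            }
        }
        throw new InvalidOperationException("Program ended without Ret.");
    }

    public static void Main()
    {
        // push 3, push 2, mul, push 1, add, ret
        var program = new[]
        {
            new Instr(Op.Push, 3), new Instr(Op.Push, 2), new Instr(Op.Mul),
            new Instr(Op.Push, 1), new Instr(Op.Add), new Instr(Op.Ret)
        };
        Console.WriteLine(Run(program)); // prints 7
    }
}
```

The CLR evaluates the MSIL listing in the same way, except that the JIT compiler turns the instructions into native code rather than stepping through them one by one.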

Using the Code

This project contains two classes for parsing an expression and compiling it to MSIL. The first is RuleParser, an abstract class that contains all the lexing and parsing logic for our particular grammar. This class parses the expression but takes no action itself. For example, when the ttAdd token is detected, the parser calls matchAdd(), an abstract method declared in RuleParser. The implementation of that method, and therefore the semantic action it performs, is left to a concrete subclass. This pattern lets us handle semantic actions in a separate concrete class, so we can plug in different concrete classes depending on what we are trying to do. This code was previously set up to evaluate expressions on the fly, computing nodes as they were found. Now we can swap in the second class, MsilParser, to compile the expression into an IL program using the same parser class. MsilParser does this by overriding all the required token-matching methods and emitting the appropriate IL instructions. For example, matchAdd() simply emits an Add instruction. When a variable is matched, we load the variable name with the Ldstr instruction and then call the GetVar method.

    protected override void matchAdd()
    {
        this.il.Emit(OpCodes.Add);
    }

    protected override void matchVar()
    {
        string s = tokenValue.ToString();
        il.Emit(OpCodes.Ldstr, s);
        il.Emit(OpCodes.Call, typeof(MsilParser).GetMethod(
            "GetVar", new Type[] { typeof(string) }));
    }
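The split between an abstract parser and a concrete class that supplies the semantic actions can be sketched roughly as follows. This is a much-reduced illustration, not the article's RuleParser: `TinyParser`, `EvalParser` and their method names are hypothetical, and the grammar is assumed to cover only numbers, `+` and `*`. The concrete class here evaluates on the fly with a value stack. An IL-emitting subclass would override the same three methods and call `il.Emit(...)` instead.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for an abstract parser: it recognises the grammar
// and delegates every semantic action to abstract methods.
public abstract class TinyParser
{
    private string text;
    private int pos;

    protected abstract void MatchNumber(double value);
    protected abstract void MatchAdd();
    protected abstract void MatchMul();

    public void Run(string expr)
    {
        text = expr.Replace(" ", "");
        pos = 0;
        ParseSum();
        if (pos != text.Length)
            throw new FormatException("Unexpected character at position " + pos);
    }

    // sum := product ('+' product)*
    private void ParseSum()
    {
        ParseProduct();
        while (pos < text.Length && text[pos] == '+')
        {
            pos++;
            ParseProduct();
            MatchAdd(); // fires after both operands have been handled
        }
    }

    // product := number ('*' number)*
    private void ParseProduct()
    {
        ParseNumber();
        while (pos < text.Length && text[pos] == '*')
        {
            pos++;
            ParseNumber();
            MatchMul();
        }
    }

    private void ParseNumber()
    {
        int start = pos;
        while (pos < text.Length && (char.IsDigit(text[pos]) || text[pos] == '.'))
            pos++;
        if (start == pos)
            throw new FormatException("Number expected at position " + pos);
        MatchNumber(double.Parse(text.Substring(start, pos - start),
                                 CultureInfo.InvariantCulture));
    }
}

// One possible concrete class: evaluate immediately using a value stack.
public class EvalParser : TinyParser
{
    private readonly Stack<double> stack = new Stack<double>();

    protected override void MatchNumber(double value) { stack.Push(value); }
    protected override void MatchAdd() { stack.Push(stack.Pop() + stack.Pop()); }
    protected override void MatchMul() { stack.Push(stack.Pop() * stack.Pop()); }

    public double Evaluate(string expr)
    {
        stack.Clear();
        Run(expr);
        return stack.Pop();
    }
}

public static class TinyParserDemo
{
    public static void Main()
    {
        Console.WriteLine(new EvalParser().Evaluate("3 * 2 + 1")); // prints 7
    }
}
```

Because the parser fires each action only after its operands have been handled, the calls arrive in postfix order, which is exactly the order a stack machine needs.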

With all the token handlers in place, we can call the CompileMsil() method of our MsilParser class, which runs the parser and returns the compiled .NET type, built with the AssemblyBuilder classes from the Reflection.Emit namespace.

    /// <summary>
    /// Builds and returns a dynamic assembly
    /// </summary>
    public Type CompileMsil(string expr)
    {
        // Build the dynamic assembly
        string assemblyName = "Expression";
        string modName = "expression.dll";
        string typeName = "Expression";
        string methodName = "RunExpression";

        AssemblyName name = new AssemblyName(assemblyName);
        AppDomain domain = System.Threading.Thread.GetDomain();
        AssemblyBuilder builder = domain.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder module = builder.DefineDynamicModule(modName, true);
        TypeBuilder typeBuilder = module.DefineType(typeName,
            TypeAttributes.Public | TypeAttributes.Class);
        MethodBuilder methodBuilder = typeBuilder.DefineMethod(methodName,
            MethodAttributes.HideBySig | MethodAttributes.Static | MethodAttributes.Public,
            typeof(Object), new Type[] { });

        // Create the ILGenerator to insert code into our method body
        ILGenerator ilGenerator = methodBuilder.GetILGenerator();
        this.il = ilGenerator;

        // Parse the expression. This will insert MSIL instructions
        this.Run(expr);

        // Finish the method by boxing the result as Double
        this.il.Emit(OpCodes.Conv_R8);
        this.il.Emit(OpCodes.Box, typeof(Double));
        this.il.Emit(OpCodes.Ret);

        // Create and save the Assembly and return the type
        Type myClass = typeBuilder.CreateType();
        builder.Save(modName);
        return myClass;
    }

The end result is a .NET assembly that can be executed, cached, or saved to disk. Here is the IL code of our method as generated by our compiler:

    .method public hidebysig static object RunExpression() cil managed
    {
      // Code size 36 (0x24)
      .maxstack 2
      IL_0000: ldc.r8 3.
      IL_0009: ldc.r8 2.
      IL_0012: mul
      IL_0013: ldc.r8 1.
      IL_001c: add
      IL_001d: conv.r8
      IL_001e: box [mscorlib]System.Double
      IL_0023: ret
    } // end of method Expression::RunExpression
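If the assembly does not need to be saved to disk, the same instruction sequence can also be emitted into a `System.Reflection.Emit.DynamicMethod`. Note that this is a different technique from the AssemblyBuilder-based CompileMsil() shown above, not the author's approach. The sketch below hard-codes the instructions for the listing above instead of running a parser, and returns an unboxed `double`, so the `conv.r8` and `box` steps are omitted. The class and method names are made up for the example.

```csharp
using System;
using System.Reflection.Emit;

public static class DynamicMethodDemo
{
    // Emits ldc.r8 3, ldc.r8 2, mul, ldc.r8 1, add, ret into a DynamicMethod
    // and wraps it in a typed delegate.
    public static Func<double> Build()
    {
        var method = new DynamicMethod("RunExpression", typeof(double), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldc_R8, 3.0); // the operands must be doubles for ldc.r8
        il.Emit(OpCodes.Ldc_R8, 2.0);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ldc_R8, 1.0);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return (Func<double>)method.CreateDelegate(typeof(Func<double>));
    }

    public static void Main()
    {
        Func<double> run = Build();
        Console.WriteLine(run()); // prints 7
    }
}
```

The trade-off is that a DynamicMethod cannot be written out as a DLL, so the step of inspecting the result with ildasm.exe does not apply to it.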

The main advantage of this approach is that parsing an expression takes much more time than simply executing the resulting instructions. By precompiling an expression to IL, we only need to parse it once, not every time it is evaluated. Although this example uses only one expression, a real implementation may include thousands of expressions that are precompiled and executed on demand. In addition, our code ends up packed in a neat .NET DLL that we can do whatever we want with. This example can be evaluated more than 1 million times in less than 3 hundredths of a second!

Using the Sample Project

The sample project lets you enter an expression in the upper left text box. When you click Analysis, the form parses the expression and creates a .NET assembly with your compiled code in the RunExpression() function. The program then calls this function a set number of times and shows how long the calls took. Finally, the program saves the assembly as expression.dll and runs Microsoft's ildasm.exe tool to output the complete MSIL code for the assembly, so that you can see the code that was generated for your expression.

Points of Interest

How our dynamic method is called significantly affects performance. For example, simply calling Invoke() on the dynamic method slows things down considerably when it is called 1 million times. Using a generic delegate with a matching signature, as in the code below, gives roughly a 20-fold increase in performance.

    // Parse the expression and build our dynamic method
    MsilParser em = new MsilParser();
    Type t = em.CompileMsil(textBox1.Text);

    // Get a typed delegate reference to our method. This is very
    // important for efficient calls!
    MethodInfo m = t.GetMethod("RunExpression");
    Delegate d = Delegate.CreateDelegate(
        typeof(MsilParser.ExpressionInvoker<Object>), m);
    MsilParser.ExpressionInvoker<Object> method =
        (MsilParser.ExpressionInvoker<Object>)d;

    // Call the function
    Object result = method();
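The difference between the two calling styles can be measured with a small, self-contained sketch like the one below. It makes several assumptions: an ordinary static method stands in for the generated RunExpression, the framework's `Func<object>` delegate type is used in place of the project's ExpressionInvoker, and the iteration count is picked to match the figure in the text. The timings it prints depend on the machine and runtime, so no particular ratio is claimed.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class InvokeVsDelegateDemo
{
    // Stand-in for the generated RunExpression method (hypothetical body).
    public static object RunExpression() { return 3.0 * 2.0 + 1.0; }

    public static void Main()
    {
        const int iterations = 1000000;
        MethodInfo m = typeof(InvokeVsDelegateDemo).GetMethod("RunExpression");

        // Late-bound call through reflection on every iteration.
        Stopwatch sw = Stopwatch.StartNew();
        object r1 = null;
        for (int i = 0; i < iterations; i++) r1 = m.Invoke(null, null);
        sw.Stop();
        Console.WriteLine("MethodInfo.Invoke: " + sw.ElapsedMilliseconds + " ms, result " + r1);

        // Typed delegate: bound once, then called directly.
        Func<object> f = (Func<object>)Delegate.CreateDelegate(typeof(Func<object>), m);
        sw = Stopwatch.StartNew();
        object r2 = null;
        for (int i = 0; i < iterations; i++) r2 = f();
        sw.Stop();
        Console.WriteLine("Typed delegate:    " + sw.ElapsedMilliseconds + " ms, result " + r2);
    }
}
```

Binding the delegate once moves the reflection overhead (argument checking, boxing of the call) out of the loop, which is where the speed-up described above comes from.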

Calling ILDASM.EXE

The sample project also lets you view the complete MSIL code of your newly created assembly. It does this by calling ildasm.exe in the background and showing the output in a text box. Ildasm.exe is a very useful tool for anyone who works with IL code or the System.Reflection.Emit namespace. The code below shows how to run this executable from your program using the System.Diagnostics namespace. See the Microsoft documentation for ildasm.exe at the links below.

    // Save the Assembly and generate the MSIL code with ILDASM.EXE
    string modName = "expression.dll";
    Process p = new Process();
    p.StartInfo.FileName = "ildasm.exe";
    // Quote the file name on both sides so paths with spaces survive
    p.StartInfo.Arguments = "/text /nobar \"" + modName + "\"";
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.CreateNoWindow = true;
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
    p.Start();

    // Read all output before waiting, so a full pipe cannot block ildasm
    string s = p.StandardOutput.ReadToEnd();
    p.WaitForExit();
    p.Close();
    txtMsil.Text = s;

References:

Source: https://habr.com/ru/post/351498/
