Code organization with introspection in the context of obfuscation and refactoring

The .NET platform provides a rich API for accessing metadata at runtime. However, reflection relies on late binding: program elements are looked up by name and signature, which are passed as strings and type arrays. Such code can silently change the program's behavior or break it after refactoring (renaming a member, reordering parameters) or after metadata obfuscation. One way around this problem is the syntactic sugar that C# offers for expression trees.

To keep things concrete, let's write a simple class.

    public class SimpleClass
    {
        private readonly string theValue;

        public SimpleClass(string value, int ratio)
        {
            theValue = value;
        }

        public string Value
        {
            get { return theValue; }
        }
    }

Now add properties to this class that expose its metadata through reflection:

    // Requires: using System.Reflection;

    internal static PropertyInfo ValueProperty
    {
        get { return typeof(SimpleClass).GetProperty("Value"); }
    }

    internal static FieldInfo ValueField
    {
        get { return typeof(SimpleClass).GetField("theValue", BindingFlags.NonPublic | BindingFlags.Instance); }
    }

    internal static ConstructorInfo Constructor
    {
        get { return typeof(SimpleClass).GetConstructor(new[] { typeof(string), typeof(int) }); }
    }

If we print these properties to the console, we get the string representations of the corresponding program elements. However, if you reorder the constructor's parameters with the IDE's refactoring tools (in Visual Studio, Refactor > Reorder Parameters), the property that reflects the constructor will return null, because no constructor with the old signature exists any more. And if you also obfuscate the assembly, the field and the property may no longer be found by name either (unless they are excluded from renaming with ObfuscationAttribute).
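The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical variant of the class (ReorderedClass and its Text property are illustrative names, not part of the original code) whose constructor parameters have already been swapped and whose property has been renamed, while the reflection code still uses the old name and signature:

```csharp
using System;
using System.Reflection;

// Hypothetical state of the class after "Reorder Parameters" and a rename:
// the constructor is now (int, string) and the property is called Text.
public class ReorderedClass
{
    private readonly string theValue;

    public ReorderedClass(int ratio, string value)
    {
        theValue = value;
    }

    public string Text
    {
        get { return theValue; }
    }

    // Stale lookups: they still compile, but describe members that no longer exist.
    internal static ConstructorInfo Constructor
    {
        get { return typeof(ReorderedClass).GetConstructor(new[] { typeof(string), typeof(int) }); }
    }

    internal static PropertyInfo ValueProperty
    {
        get { return typeof(ReorderedClass).GetProperty("Value"); }
    }
}

public static class Program
{
    public static void Main()
    {
        // Both lookups fail silently: they return null instead of throwing.
        Console.WriteLine(ReorderedClass.Constructor == null);
        Console.WriteLine(ReorderedClass.ValueProperty == null);
    }
}
```

Nothing warns about the mismatch at compile time; the problem only shows up when the returned null is used.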
To get out of this situation, we arm ourselves with expression trees, lambda expressions and generics, and assemble a helper method from them:

    // Requires: using System.Linq.Expressions; using System.Reflection;

    static TMember Reflect<TDelegate, TMember>(Expression<TDelegate> memberAccess)
        where TDelegate : class
        where TMember : MemberInfo
    {
        // Property or field access
        if (memberAccess.Body is MemberExpression)
            return ((MemberExpression)memberAccess.Body).Member as TMember;
        // Constructor call
        else if (memberAccess.Body is NewExpression)
            return ((NewExpression)memberAccess.Body).Constructor as TMember;
        // Method call
        else if (memberAccess.Body is MethodCallExpression)
            return ((MethodCallExpression)memberAccess.Body).Method as TMember;
        else
            return null;
    }

The method takes two type parameters. The first is the delegate type that determines the signature of the lambda expression passed as the argument; the second is the type of the reflected member (for example, FieldInfo for a field). The lambda expression itself describes access to the class member we are interested in. Inside the method we inspect the body of the lambda, which is available as an expression tree. Since the body is an access to a class member, we check its type against the possible cases:

Access to a property or field;
A call to the new operator (this expression carries a reference to the constructor);
A method call.

The main goal of this solution is to avoid the classic reflection calls that take binding flags, member names and type arrays describing method and constructor signatures, because refactoring tools and obfuscators cannot analyze those arguments.
Now everything is ready to write reflection code that is updated automatically during refactoring and stays correct under obfuscation.

    internal static PropertyInfo ValueProperty
    {
        get { return Reflect<Func<SimpleClass, string>, PropertyInfo>(v => v.Value); }
    }

    internal static FieldInfo ValueField
    {
        get { return Reflect<Func<SimpleClass, string>, FieldInfo>(v => v.theValue); }
    }

    internal static ConstructorInfo Constructor
    {
        get { return Reflect<Func<string, int, SimpleClass>, ConstructorInfo>((a1, a2) => new SimpleClass(a1, a2)); }
    }
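For reference, the pieces above can be put together into one compilable program. This is a minimal sketch: the Program class and the sample arguments "hello" and 3 are assumptions added for illustration, and the body of Reflect is condensed slightly without changing its behavior.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public class SimpleClass
{
    private readonly string theValue;

    public SimpleClass(string value, int ratio)
    {
        theValue = value;
    }

    public string Value
    {
        get { return theValue; }
    }

    // Extracts the member referenced by the body of the lambda expression.
    static TMember Reflect<TDelegate, TMember>(Expression<TDelegate> memberAccess)
        where TDelegate : class
        where TMember : MemberInfo
    {
        Expression body = memberAccess.Body;
        if (body is MemberExpression)
            return ((MemberExpression)body).Member as TMember;
        if (body is NewExpression)
            return ((NewExpression)body).Constructor as TMember;
        if (body is MethodCallExpression)
            return ((MethodCallExpression)body).Method as TMember;
        return null;
    }

    internal static PropertyInfo ValueProperty
    {
        get { return Reflect<Func<SimpleClass, string>, PropertyInfo>(v => v.Value); }
    }

    internal static FieldInfo ValueField
    {
        get { return Reflect<Func<SimpleClass, string>, FieldInfo>(v => v.theValue); }
    }

    internal static ConstructorInfo Constructor
    {
        get { return Reflect<Func<string, int, SimpleClass>, ConstructorInfo>((a1, a2) => new SimpleClass(a1, a2)); }
    }
}

// Hypothetical driver, added only to exercise the reflected members.
public static class Program
{
    public static void Main()
    {
        ConstructorInfo ctor = SimpleClass.Constructor;
        object instance = ctor.Invoke(new object[] { "hello", 3 });

        Console.WriteLine(SimpleClass.ValueProperty.Name);
        Console.WriteLine(SimpleClass.ValueField.Name);
        Console.WriteLine(ctor.GetParameters().Length);
        Console.WriteLine(SimpleClass.ValueProperty.GetValue(instance, null));
        Console.WriteLine(SimpleClass.ValueField.GetValue(instance));
    }
}
```

If the constructor parameters are reordered with the IDE, the lambda `(a1, a2) => new SimpleClass(a1, a2)` is part of the code being refactored, so a signature mismatch becomes a compile-time matter rather than a null at runtime.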

This approach uses neither string literals for member names nor type arrays for the constructor signature.
Why does this code keep working after an obfuscator has processed it? The answer lies in how the C# compiler generates code for expression trees. If you open the assembly in ILDASM, you can see that member metadata is loaded with the ldtoken instruction, which takes a numeric token pointing to the member's row in the corresponding metadata table inside the PE file.
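C# has no syntax that loads a method or field token directly, but the idea of a name-independent lookup can be approximated with the reflection API. The sketch below is an illustration only, not what the compiler emits: it assumes the standard MemberInfo.MetadataToken property and Module.ResolveMethod method, and the Sample class with its Ping method is hypothetical. It takes a method from an expression tree, reads its numeric token, and resolves the same method back from the token without ever mentioning its name:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical class used only for this illustration.
public class Sample
{
    public static int Ping(int x)
    {
        return x + 1;
    }
}

public static class Program
{
    public static void Main()
    {
        // The compiler builds this tree from metadata handles, not from the string "Ping".
        Expression<Func<int, int>> call = x => Sample.Ping(x);
        MethodInfo viaTree = ((MethodCallExpression)call.Body).Method;

        // The token is a number: the high byte selects the metadata table,
        // the remaining bytes are the row index inside that table.
        int token = viaTree.MetadataToken;

        // Resolve the method again from the number alone.
        MethodBase viaToken = typeof(Sample).Module.ResolveMethod(token);

        Console.WriteLine("0x{0:X8}", token);
        Console.WriteLine(viaToken.Name);
        Console.WriteLine(viaToken.MetadataToken == token);
    }
}
```

An obfuscator that renames Ping changes the name stored in the metadata row, but the code keeps referring to that row by token, so the lookup still resolves.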

There is always a "but"

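This technique only works for program elements that are accessible from the current lexical scope. For example, a private field of another class cannot be referenced in a lambda written outside that class.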
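A short sketch of this limitation, using a hypothetical Vault class with a private field: outside the declaring class the lambda form does not compile, so only the name-based lookup remains.

```csharp
using System;
using System.Reflection;

// Hypothetical class with a member that is not visible to outside code.
public class Vault
{
    private int secret = 42;

    public int Doubled()
    {
        return secret * 2;
    }
}

public static class Program
{
    public static void Main()
    {
        // Not possible from here: 'secret' is private to Vault, so a lambda such as
        //     v => v.secret
        // is rejected by the compiler (error CS0122, inaccessible due to its protection level).

        // Outside the declaring class, the string-based lookup is the only option left.
        FieldInfo field = typeof(Vault).GetField("secret", BindingFlags.NonPublic | BindingFlags.Instance);
        Console.WriteLine(field.GetValue(new Vault()));
    }
}
```

This is why the reflecting properties in the article are declared inside SimpleClass itself, where both the private field and the constructor are in scope.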

Source: https://habr.com/ru/post/125912/
