Code injection into .NET CLR: IL code change during program execution

Foreword

Changing the MSIL code of a .NET method while the application is running is very cool. It is so cool that it lets you hook functions, protect your software, and do other amazing things. That is why I had long wanted to do it, but there was one problem: the MSIL code is compiled into machine code by the JIT before we can do anything with it. And since the internals of the .NET CLR are undocumented and change from version to version, we will look for a stable and reliable approach that does not depend on exact memory addresses.

Finally, after a week of research, I did it. Here is a simple method:

    protected string CompareOneAndTwo()
    {
        int a = 1;
        int b = 2;
        if (a < b)
        {
            return "Number 1 is less than 2";
        }
        else
        {
            return "Number 1 is greater than 2 (O_o)";
        }
    }

As you can see, it returns "Number 1 is less than 2". Let's correct this misunderstanding and change the method so that it returns "Number 1 is greater than 2 (O_o)".
Looking at the MSIL code of this method, we can achieve our goal by replacing the Bge_S opcode with Blt_S.

If you then run the demo application, it shows the wrong result.

Below is the code that replaces the IL. I think there are enough comments to understand it.

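    // Requires: using System.Reflection; using System.Reflection.Emit;
    //           using System.Runtime.CompilerServices;

    // Get the target method first
    Type type = this.GetType();
    MethodInfo methodInfo = type.GetMethod("CompareOneAndTwo", BindingFlags.NonPublic | BindingFlags.Instance);

    // The following line is not actually necessary.
    // It is used here to force the method to be JIT-compiled,
    // so we can verify that this also works for JIT-compiled methods :)
    RuntimeHelpers.PrepareMethod(methodInfo.MethodHandle);

    // Get the original IL code
    byte[] ilCodes = methodInfo.GetMethodBody().GetILAsByteArray();

    // Searching for an opcode byte without parsing the IL is not a good approach,
    // but it is enough for this sample
    for (int i = 0; i < ilCodes.Length; i++)
    {
        if (ilCodes[i] == OpCodes.Bge_S.Value)
        {
            // Replace Bge_S with Blt_S
            ilCodes[i] = (byte)OpCodes.Blt_S.Value;
        }
    }

    // Update the IL code
    InjectionHelper.UpdateILCodes(methodInfo, ilCodes);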
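The byte-by-byte search above works for the sample, but a byte equal to the Bge_S opcode can also appear inside an operand, for example inside a string token. The C++ sketch below walks the IL instruction by instruction instead. It is only an illustration: the function names are hypothetical, and the operand-size table covers just the handful of opcodes that a method like CompareOneAndTwo would plausibly contain (opcode values are taken from ECMA-335). A real tool would need the full opcode table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-byte IL opcode values (ECMA-335, Partition III).
constexpr std::uint8_t kBrS   = 0x2B;  // br.s
constexpr std::uint8_t kBgeS  = 0x2F;  // bge.s
constexpr std::uint8_t kBltS  = 0x32;  // blt.s
constexpr std::uint8_t kLdstr = 0x72;  // ldstr

// Operand size in bytes for the few opcodes this sketch knows; -1 otherwise.
int OperandSize(std::uint8_t opcode) {
    switch (opcode) {
        case 0x00:              // nop
        case 0x06: case 0x07:   // ldloc.0, ldloc.1
        case 0x0A: case 0x0B:   // stloc.0, stloc.1
        case 0x17: case 0x18:   // ldc.i4.1, ldc.i4.2
        case 0x2A:              // ret
            return 0;
        case kBrS: case kBgeS: case kBltS:
            return 1;           // 8-bit branch offset
        case kLdstr:
            return 4;           // 32-bit metadata token
        default:
            return -1;          // not handled by this sketch
    }
}

// Replaces every bge.s with blt.s. Returns the number of replacements, or -1
// (leaving the buffer untouched) if an unknown or truncated instruction is met.
int ReplaceBgeWithBlt(std::vector<std::uint8_t>& il) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < il.size();) {
        const int operand = OperandSize(il[i]);
        if (operand < 0 || i + 1 + static_cast<std::size_t>(operand) > il.size())
            return -1;
        if (il[i] == kBgeS)
            hits.push_back(i);
        i += 1 + static_cast<std::size_t>(operand);
    }
    for (std::size_t pos : hits)
        il[pos] = kBltS;
    return static_cast<int>(hits.size());
}
```

For a buffer that starts with 17 0A 18 0B 06 07 2F 06 72 2F 00 00 70 2A, only the byte at offset 6 is patched; the 2F inside the ldstr token at offset 9 is left alone, which the naive loop would not guarantee.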

You can download the demo program and try it.

Supports .NET from 2.0 to 4.0
Supports many types of methods, including dynamic and generic methods.
Supports release versions of .NET applications
Supports x86 and x64

Using the code

Copy InjectionHelper.cs, which contains the necessary methods, into your project.

    public static class InjectionHelper
    {
        // Load the unmanaged injection.dll. Initialization happens in a background thread;
        // use GetStatus() to check whether it has completed.
        public static void Initialize()

        // Unload the unmanaged injection.dll.
        public static void Uninitialize()

        // Update the IL code of a method.
        public static void UpdateILCodes(MethodInfo method, byte[] ilCodes)

        // Block until initialization has completed.
        public static Status WaitForIntializationCompletion()

        // Query the current status of the unmanaged DLL; returns immediately.
        public static Status GetStatus()
    }

The InjectionHelper.Initialize method loads the unmanaged injection.dll from the directory in which the assembly resides, so all the files listed below must be in that directory. Alternatively, you can change the source to load them from elsewhere, as you prefer :)
List of files:

File name	Description
Injection32.dll	Unmanaged DLL that does the actual work (x86 version)
Injection64.dll	Unmanaged DLL that does the actual work (x64 version)
EasyHook32.dll	x86 EasyHook DLL (http://easyhook.codeplex.com/), used by Injection32.dll
EasyHook64.dll	x64 EasyHook DLL (http://easyhook.codeplex.com/), used by Injection64.dll
x86/*	Debugging Tools for Windows (x86)
x64/*	Debugging Tools for Windows (x64)
PDB_symbols/*	PDB files. They can be removed, but initialization will then be slower.

Behind the scenes

Let's first take a look at how the CLR and JIT work.

The JIT library (clrjit.dll for .NET 4.0 / mscorjit.dll for .NET 2.0+) exports the __stdcall function getJit, which returns the ICorJitCompiler interface.
The CLR library (clr.dll for .NET 4.0 / mscorwks.dll for .NET 2.0+) calls getJit to obtain the ICorJitCompiler interface and then calls its compileMethod method to compile MSIL code into native code:

 CorJitResult compileMethod(ICorJitInfo * pJitInfo, CORINFO_METHOD_INFO * pMethodInfo, UINT nFlags, LPBYTE * pEntryAddress, ULONG * pSizeOfCode);

This part is easy: you just need to find the address of compileMethod and hook it with EasyHook.

    // ICorJitCompiler interface from the JIT DLL
    class ICorJitCompiler
    {
    public:
        typedef CorJitResult (__stdcall ICorJitCompiler::*PFN_compileMethod)(ICorJitInfo * pJitInfo,
            CORINFO_METHOD_INFO * pMethodInfo, UINT nFlags, LPBYTE * pEntryAddress, ULONG * pSizeOfCode);

        CorJitResult compileMethod(ICorJitInfo * pJitInfo, CORINFO_METHOD_INFO * pMethodInfo,
            UINT nFlags, LPBYTE * pEntryAddress, ULONG * pSizeOfCode)
        {
            return (this->*s_pfnComplieMethod)(pJitInfo, pMethodInfo, nFlags, pEntryAddress, pSizeOfCode);
        }

    private:
        static PFN_compileMethod s_pfnComplieMethod;
    };

    // Save the address of the real compileMethod
    LPVOID pAddr = tPdbHelper.GetJitCompileMethodAddress();
    LPVOID* pDest = (LPVOID*)&ICorJitCompiler::s_pfnComplieMethod;
    *pDest = pAddr;

    // Our replacement for compileMethod
    CorJitResult __stdcall CInjection::compileMethod(ICorJitInfo * pJitInfo,
        CORINFO_METHOD_INFO * pCorMethodInfo, UINT nFlags, LPBYTE * pEntryAddress, ULONG * pSizeOfCode)
    {
        ICorJitCompiler * pCorJitCompiler = (ICorJitCompiler *)this;

        // TO DO: replace the IL code before calling the real compileMethod
        CorJitResult result = pCorJitCompiler->compileMethod(pJitInfo, pCorMethodInfo, nFlags, pEntryAddress, pSizeOfCode);
        return result;
    }

    // Hook the JIT's compileMethod and redirect it to our own
    NTSTATUS ntStatus = LhInstallHook((PVOID&)ICorJitCompiler::s_pfnComplieMethod,
        &(PVOID&)CInjection::compileMethod,
        NULL,
        &s_hHookCompileMethod);

Changing the IL code of JIT-compiled methods

The CLR will not call the compileMethod method described above for a method that has already been JIT-compiled. To solve this problem, I restore the CLR data structures to the state they were in before JIT compilation. Once compileMethod is called again, we can replace the IL.
This means we need to look a little at the implementation of the CLR. SSCLI (Shared Source Common Language Infrastructure) is a good source of information, but since it is rather outdated, we cannot use it directly in our code.

The SSCLI diagram of these structures is outdated, but the overall layout has been preserved. Each class in .NET has at least one MethodTable in memory, and each MethodTable is associated with an EEClass that stores runtime information for reflection and other purposes.
For each method there is at least one MethodDesc structure containing its flags, slot address, entry-point address, and so on.
Before a method is JIT-compiled, its slot points to a JMI thunk (prestub) that triggers JIT compilation. Once the IL code has been compiled, the slot is rewritten to point to the JMI, which jumps directly to the compiled native code.
To restore these structures, you must first clear the flags, then change the entry-point address back to the temporary one, and so on. During testing I did this by modifying memory directly. But this is dirty, if only because it depends on the addresses of the data structures, and the code differs between .NET versions.
I was looking for the right way and, fortunately, found the MethodDesc::Reset method in the SSCLI source code (vm/method.cpp).

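    void MethodDesc::Reset()
    {
        CONTRACTL
        {
            THROWS;
            GC_NOTRIGGER;
        }
        CONTRACTL_END

        // This method is not thread-safe, since different pieces of data are updated non-atomically.
        // Use it only if thread safety can be guaranteed in some other way.
        _ASSERTE(IsEnCMethod() ||  // the process is frozen by the debugger
                 IsDynamicMethod() ||
                 GetLoaderModule()->IsReflection());

        // Reset any flags relevant to the old code
        ClearFlagsOnUpdate();

        if (HasPrecode())
        {
            GetPrecode()->Reset();
        }
        else
        {
            // We should get here only for Reflection methods
            _ASSERTE(GetLoaderModule()->IsReflection());
            InterlockedUpdateFlags2(enum_flag2_HasStableEntryPoint | enum_flag2_HasPrecode, FALSE);
            *GetAddrOfSlotUnchecked() = GetTemporaryEntryPoint();
        }

        _ASSERTE(!HasNativeCode());
    }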

As you can see, this code does exactly what is needed. So I just need to call it on the MethodDesc before JIT compilation.
Strictly speaking, I cannot use the MethodDesc definition from SSCLI directly: MethodDesc is internal to Microsoft, and nobody outside knows how it may have changed.
Fortunately, the address of this internal method is present in the PDB from the Microsoft Symbol Server, and that solves my problem. The address of the Reset() method in the CLR DLL can be found simply by parsing the PDB!
One important parameter remains: the this pointer of the MethodDesc. Getting it is not difficult. In general, MethodBase.MethodHandle.Value == CORINFO_METHOD_HANDLE == address of the MethodDesc == the this pointer of the MethodDesc.

    class MethodDesc
    {
        typedef void (MethodDesc::*PFN_Reset)(void);
        typedef BOOL (MethodDesc::*PFN_IsGenericMethodDefinition)(void);
        typedef ULONG (MethodDesc::*PFN_GetNumGenericMethodArgs)(void);
        typedef MethodDesc * (MethodDesc::*PFN_StripMethodInstantiation)(void);
        typedef BOOL (MethodDesc::*PFN_HasClassOrMethodInstantiation)(void);
        typedef BOOL (MethodDesc::*PFN_ContainsGenericVariables)(void);
        typedef MethodDesc * (MethodDesc::*PFN_GetWrappedMethodDesc)(void);
        typedef AppDomain * (MethodDesc::*PFN_GetDomain)(void);
        typedef Module * (MethodDesc::*PFN_GetLoaderModule)(void);

    public:
        void Reset(void) { (this->*s_pfnReset)(); }
        BOOL IsGenericMethodDefinition(void) { return (this->*s_pfnIsGenericMethodDefinition)(); }
        ULONG GetNumGenericMethodArgs(void) { return (this->*s_pfnGetNumGenericMethodArgs)(); }
        MethodDesc * StripMethodInstantiation(void) { return (this->*s_pfnStripMethodInstantiation)(); }
        BOOL HasClassOrMethodInstantiation(void) { return (this->*s_pfnHasClassOrMethodInstantiation)(); }
        BOOL ContainsGenericVariables(void) { return (this->*s_pfnContainsGenericVariables)(); }
        MethodDesc * GetWrappedMethodDesc(void) { return (this->*s_pfnGetWrappedMethodDesc)(); }
        AppDomain * GetDomain(void) { return (this->*s_pfnGetDomain)(); }
        Module * GetLoaderModule(void) { return (this->*s_pfnGetLoaderModule)(); }

    private:
        static PFN_Reset s_pfnReset;
        static PFN_IsGenericMethodDefinition s_pfnIsGenericMethodDefinition;
        static PFN_GetNumGenericMethodArgs s_pfnGetNumGenericMethodArgs;
        static PFN_StripMethodInstantiation s_pfnStripMethodInstantiation;
        static PFN_HasClassOrMethodInstantiation s_pfnHasClassOrMethodInstantiation;
        static PFN_ContainsGenericVariables s_pfnContainsGenericVariables;
        static PFN_GetWrappedMethodDesc s_pfnGetWrappedMethodDesc;
        static PFN_GetDomain s_pfnGetDomain;
        static PFN_GetLoaderModule s_pfnGetLoaderModule;
    };

The static variables store the addresses of the internal MethodDesc methods; they are initialized when the unmanaged DLL is loaded. The public methods simply forward to those internal methods.
Now we can easily call Microsoft's internal methods:

    MethodDesc * pMethodDesc = (MethodDesc*)pMethodHandle;
    pMethodDesc->Reset();
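To see why calling Reset() is enough to bring compileMethod back into play, here is a toy model of the slot behaviour described earlier in this section. It is a sketch in portable C++, not CLR code: ToyMethod and all of its members are hypothetical names, and "compiling" is reduced to copying a string.

```cpp
#include <cassert>
#include <string>

// Toy model only; none of these names exist in the CLR.
struct ToyMethod {
    std::string il;              // stands in for the method's IL body
    std::string nativeCode;      // stands in for the JIT output
    bool hasNativeCode = false;  // false: the slot still points at the prestub
    int compileCount = 0;        // how often the toy "compileMethod" ran

    // Calling through the slot: compile on first use, then reuse the result.
    const std::string& Call() {
        if (!hasNativeCode) {
            nativeCode = "native(" + il + ")";  // toy "compileMethod"
            hasNativeCode = true;
            ++compileCount;
        }
        return nativeCode;
    }

    // Counterpart of MethodDesc::Reset(): forget the compiled code so that
    // the next call goes through the prestub again.
    void Reset() { hasNativeCode = false; }
};
```

In this model, changing il after the first Call() has no effect until Reset() is called. That mirrors the situation in the article: new IL is only picked up if the method is compiled again.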

Finding addresses of internal methods in the PDB file

When the unmanaged DLL loads, it checks the version of the CLR/JIT environment it has been loaded into and tries to get the addresses of the internal methods from the PDB file. If the PDB files cannot be found, it tries to run symchk.exe from the Debugging Tools for Windows to download the matching PDB files from the Microsoft Symbol Server. This takes quite a long time, from a few seconds to a few minutes. Perhaps this could be sped up by caching the addresses, keyed by a hash of the CLR/JIT libraries.
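The caching idea could look roughly like the sketch below. Everything here is an assumption made for illustration: the class SymbolAddressCache, its methods, and the choice of 64-bit FNV-1a as the hash are not part of the article's code. The sketch stores offsets relative to the module base rather than absolute addresses, since a DLL is not guaranteed to load at the same address on every run.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// 64-bit FNV-1a over the raw bytes of a module image.
std::uint64_t Fnv1a64(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (std::uint8_t b : bytes) {
        hash ^= b;
        hash *= 0x100000001b3ULL;                // FNV prime
    }
    return hash;
}

// Hypothetical cache: module hash -> (symbol name -> offset inside the module).
class SymbolAddressCache {
public:
    void Store(std::uint64_t moduleHash, const std::string& symbol, std::uint64_t rva) {
        entries_[moduleHash][symbol] = rva;
    }

    // Returns true and fills rva if the symbol was cached for this exact build.
    bool TryGet(std::uint64_t moduleHash, const std::string& symbol, std::uint64_t& rva) const {
        auto module = entries_.find(moduleHash);
        if (module == entries_.end())
            return false;
        auto entry = module->second.find(symbol);
        if (entry == module->second.end())
            return false;
        rva = entry->second;
        return true;
    }

private:
    std::map<std::uint64_t, std::map<std::string, std::uint64_t>> entries_;
};
```

A miss in TryGet would fall back to the slow PDB path, after which the result is stored for the next start. Persisting the map to disk is left out of the sketch.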

Resetting a method to its non-JIT-compiled state

Now everything is ready. The unmanaged library exports a method for managed code to call; it accepts the IL code and MethodBase.MethodHandle.Value from the managed side.

    // Structure that stores the replacement IL code
    typedef struct _ILCodeBuffer
    {
        LPBYTE pBuffer;
        DWORD dwSize;
        BOOL bIsGeneric;  // assigned in StartUpdateILCodes below
    } ILCodeBuffer, *LPILCodeBuffer;

    // Method to be called by managed code
    BOOL CInjection::StartUpdateILCodes(MethodTable * pMethodTable,
        CORINFO_METHOD_HANDLE pMethodHandle,
        mdMethodDef md,
        LPBYTE pBuffer,
        DWORD dwSize)
    {
        MethodDesc * pMethodDesc = (MethodDesc*)pMethodHandle;

        // Reset this MethodDesc
        pMethodDesc->Reset();

        ILCodeBuffer tILCodeBuffer;
        tILCodeBuffer.pBuffer = pBuffer;
        tILCodeBuffer.dwSize = dwSize;
        tILCodeBuffer.bIsGeneric = FALSE;

        // Save the IL code for the method
        s_mpILBuffers.insert(std::pair<CORINFO_METHOD_HANDLE, ILCodeBuffer>(pMethodHandle, tILCodeBuffer));
        return TRUE;
    }

This code simply calls Reset() and saves the IL code in a map, which compileMethod will consult when the method is compiled.
In compileMethod, we then just swap in the new IL code:

    CorJitResult __stdcall CInjection::compileMethod(ICorJitInfo * pJitInfo,
        CORINFO_METHOD_INFO * pCorMethodInfo,
        UINT nFlags,
        LPBYTE * pEntryAddress,
        ULONG * pSizeOfCode)
    {
        ICorJitCompiler * pCorJitCompiler = (ICorJitCompiler *)this;
        LPBYTE pOriginalILCode = pCorMethodInfo->ILCode;
        unsigned int nOriginalSize = pCorMethodInfo->ILCodeSize;
        ILCodeBuffer tILCodeBuffer = {0};
        MethodDesc * pMethodDesc = (MethodDesc*)pCorMethodInfo->ftn;

        // Find the method to be replaced
        std::map<CORINFO_METHOD_HANDLE, ILCodeBuffer>::iterator iter =
            s_mpILBuffers.find((CORINFO_METHOD_HANDLE)pMethodDesc);
        if (iter != s_mpILBuffers.end())
        {
            tILCodeBuffer = iter->second;
            pCorMethodInfo->ILCode = tILCodeBuffer.pBuffer;
            pCorMethodInfo->ILCodeSize = tILCodeBuffer.dwSize;
        }

        CorJitResult result = pCorJitCompiler->compileMethod(pJitInfo, pCorMethodInfo, nFlags, pEntryAddress, pSizeOfCode);
        return result;
    }
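The lookup-and-substitute step can be tried in isolation with the portable sketch below. All types and names in it (ToyMethodInfo, HookedCompile, RealCompile) are hypothetical stand-ins, and the toy "JIT" merely reports the size of the IL it was handed. One detail is a choice made in this sketch rather than something shown in the excerpt above: after the original compiler returns, the saved IL pointer and size are written back so that the caller's structure is left unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for CORINFO_METHOD_INFO.
struct ToyMethodInfo {
    const void* ftn;             // method handle, used as the lookup key
    const std::uint8_t* ilCode;  // IL bytes to compile
    std::size_t ilCodeSize;
};

using ReplacementMap = std::map<const void*, std::vector<std::uint8_t>>;
using ToyCompileFn = std::size_t (*)(const ToyMethodInfo&);

// Stand-in for the real JIT: "compiles" by reporting the IL size it received.
std::size_t RealCompile(const ToyMethodInfo& info) { return info.ilCodeSize; }

// Hook: substitute the IL if a replacement is registered, call the original
// compiler, then restore the fields that were overwritten.
std::size_t HookedCompile(ToyMethodInfo& info,
                          const ReplacementMap& replacements,
                          ToyCompileFn original) {
    const std::uint8_t* savedCode = info.ilCode;
    const std::size_t savedSize = info.ilCodeSize;

    auto it = replacements.find(info.ftn);
    if (it != replacements.end()) {
        info.ilCode = it->second.data();
        info.ilCodeSize = it->second.size();
    }

    const std::size_t result = original(info);

    info.ilCode = savedCode;
    info.ilCodeSize = savedSize;
    return result;
}
```

Methods without an entry in the map pass through untouched, which is what keeps the hook cheap for the vast majority of compilations.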

Generic methods

A generic method is represented in memory by a MethodDesc. But calling a generic method with different type arguments can make the CLR create separate instantiations of the same method.
The line below is a simple generic method from the demo program.

 string GenericMethodToBeReplaced<T, K>(T t, K k)

When GenericMethodToBeReplaced<string, int>("11", 2) is called for the first time, the CLR creates an object of type InstantiatedMethodDesc (a subclass of MethodDesc whose classification flag is mcInstantiated), which is stored in the InstMethodHashTable data structure.
When GenericMethodToBeReplaced<long, int>(1, 2) is called, the CLR creates another InstantiatedMethodDesc object.
Therefore, we need to find every InstantiatedMethodDesc of the generic method and reset each of them.
The SSCLI source code (vm/proftoeeinterfaceimpl.cpp) contains a LoadedMethodDescIterator class that we can use. It takes three parameters (an AppDomain, a Module and a method token) and searches for methods by method ID (MethodToken).

    LoadedMethodDescIterator MDIter(ADIter.GetDomain(), pModule, methodId);
    while (MDIter.Next())
    {
        MethodDesc * pMD = MDIter.Current();
        if (pMD)
        {
            _ASSERTE(pMD->IsIL());
            pMD->SetRVA(rva);
        }
    }

Note that we can get the addresses of the constructor and of the Next and Current methods from the PDB file.

It does not matter that we do not know the exact size of LoadedMethodDescIterator; we simply reserve a large block of memory for it.

    class LoadedMethodDescIterator
    {
    private:
        BYTE dummy[10240];
    };

I would also like to note that the signatures of the constructor and of the Next() method changed slightly between .NET 2.0 and .NET 4.5.

    // .NET 2.0 & 4.0
    LoadedMethodDescIterator(AppDomain * pAppDomain, Module *pModule, mdMethodDef md)
    BOOL LoadedMethodDescIterator::Next(void)

    // .NET 4.5
    LoadedMethodDescIterator(AppDomain * pAppDomain, Module *pModule, mdMethodDef md, enum AssemblyIterationMode mode)
    BOOL LoadedMethodDescIterator::Next(CollectibleAssemblyHolder<DomainAssembly *> *)

Therefore, we need to determine the current version of the .NET Framework in order to call the right variant.

    // Detect the version of the CLR (requires #include <vector>)
    BOOL DetermineDotNetVersion(void)
    {
        WCHAR wszPath[MAX_PATH] = {0};
        ::GetModuleFileNameW(g_hClrModule, wszPath, MAX_PATH);

        CStringW strPath(wszPath);
        int nIndex = strPath.ReverseFind(L'\\');
        if (nIndex <= 0)
            return FALSE;
        nIndex++;
        CStringW strFilename = strPath.Mid(nIndex, strPath.GetLength() - nIndex);

        // .NET 2.0: the CLR is mscorwks.dll
        if (strFilename.CompareNoCase(L"mscorwks.dll") == 0)
        {
            g_tDotNetVersion = DotNetVersion_20;
            return TRUE;
        }

        // .NET 4.0 and 4.5 both use clr.dll, so the file version decides
        if (strFilename.CompareNoCase(L"clr.dll") != 0)
            return FALSE;

        DWORD dwHandle = 0;
        DWORD dwSize = GetFileVersionInfoSizeW(wszPath, &dwHandle);
        if (dwSize == 0)
            return FALSE;

        // Size the buffer from dwSize instead of using a fixed 2048-byte array,
        // and call the W variant explicitly because wszPath is a wide string
        std::vector<BYTE> buffer(dwSize);
        LPVOID pData = &buffer[0];
        if (!GetFileVersionInfoW(wszPath, dwHandle, dwSize, pData))
            return FALSE;

        UINT nSize = 0;
        LPBYTE lpBuffer = NULL;
        if (!VerQueryValueW(pData, L"\\", (LPVOID*)&lpBuffer, &nSize) || nSize == 0)
            return FALSE;

        VS_FIXEDFILEINFO * pVerInfo = (VS_FIXEDFILEINFO *)lpBuffer;
        if (pVerInfo->dwSignature != 0xfeef04bd)
            return FALSE;

        int nMajor = HIWORD(pVerInfo->dwFileVersionMS);
        int nMinor = LOWORD(pVerInfo->dwFileVersionMS);
        int nBuildMajor = HIWORD(pVerInfo->dwFileVersionLS);
        int nBuildMinor = LOWORD(pVerInfo->dwFileVersionLS);

        if (nMajor == 4 && nMinor == 0 && nBuildMajor == 30319)
        {
            if (nBuildMinor < 10000)
                g_tDotNetVersion = DotNetVersion_40;
            else
                g_tDotNetVersion = DotNetVersion_45;
            return TRUE;
        }
        return FALSE;
    }
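The decision logic of this function can be separated from the Win32 calls, which makes it easy to check on its own. The following sketch does that in portable C++. The names DotNetVersion, ClassifyClr and ToLowerAscii are hypothetical; the two 32-bit parameters mirror the dwFileVersionMS and dwFileVersionLS fields of VS_FIXEDFILEINFO, and the thresholds are the ones used in the code above.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

enum class DotNetVersion { Unknown, V20, V40, V45 };

// Lower-cases ASCII so that file names compare case-insensitively.
std::string ToLowerAscii(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// fileVersionMS packs major.minor, fileVersionLS packs build.revision,
// each as (high word, low word).
DotNetVersion ClassifyClr(const std::string& fileName,
                          std::uint32_t fileVersionMS,
                          std::uint32_t fileVersionLS) {
    const std::string name = ToLowerAscii(fileName);
    if (name == "mscorwks.dll")
        return DotNetVersion::V20;
    if (name != "clr.dll")
        return DotNetVersion::Unknown;

    const std::uint32_t major    = fileVersionMS >> 16;
    const std::uint32_t minor    = fileVersionMS & 0xFFFF;
    const std::uint32_t build    = fileVersionLS >> 16;
    const std::uint32_t revision = fileVersionLS & 0xFFFF;

    if (major != 4 || minor != 0 || build != 30319)
        return DotNetVersion::Unknown;
    return revision < 10000 ? DotNetVersion::V40 : DotNetVersion::V45;
}
```

With this split, the Windows-specific part only has to fetch the file name and the two version fields, and the classification itself can be tested with plain numbers.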

Now we can declare our own LoadedMethodDescIterator, which will work with the CLR.

    enum AssemblyIterationMode { AssemblyIterationMode_Default = 0 };

    class LoadedMethodDescIterator
    {
        typedef void (LoadedMethodDescIterator::*PFN_LoadedMethodDescIteratorConstructor)(AppDomain * pAppDomain, Module *pModule, mdMethodDef md);
        typedef void (LoadedMethodDescIterator::*PFN_LoadedMethodDescIteratorConstructor_v45)(AppDomain * pAppDomain, Module *pModule, mdMethodDef md, AssemblyIterationMode mode);
        typedef void (LoadedMethodDescIterator::*PFN_Start)(AppDomain * pAppDomain, Module *pModule, mdMethodDef md);
        typedef BOOL (LoadedMethodDescIterator::*PFN_Next_v4)(LPVOID pParam);
        typedef BOOL (LoadedMethodDescIterator::*PFN_Next_v2)(void);
        typedef MethodDesc* (LoadedMethodDescIterator::*PFN_Current)(void);

    public:
        LoadedMethodDescIterator(AppDomain * pAppDomain, Module *pModule, mdMethodDef md)
        {
            memset(dummy, 0, sizeof(dummy));
            memset(dummy2, 0, sizeof(dummy2));
            if (s_pfnConstructor)
                (this->*s_pfnConstructor)(pAppDomain, pModule, md);
            if (s_pfnConstructor_v45)
                (this->*s_pfnConstructor_v45)(pAppDomain, pModule, md, AssemblyIterationMode_Default);
        }

        void Start(AppDomain * pAppDomain, Module *pModule, mdMethodDef md)
        {
            (this->*s_pfnStart)(pAppDomain, pModule, md);
        }

        BOOL Next()
        {
            if (s_pfnNext_v4)
                return (this->*s_pfnNext_v4)(dummy2);
            if (s_pfnNext_v2)
                return (this->*s_pfnNext_v2)();
            return FALSE;
        }

        MethodDesc* Current()
        {
            return (this->*s_pfnCurrent)();
        }

    private:
        // We don't know the exact size of LoadedMethodDescIterator, so reserve enough memory here
        BYTE dummy[10240];

        // class CollectibleAssemblyHolder<class DomainAssembly *> parameter for Next() in .NET 4.0 and above
        BYTE dummy2[10240];

        // Constructor for .NET 2.0 & .NET 4.0
        static PFN_LoadedMethodDescIteratorConstructor s_pfnConstructor;

        // Constructor for .NET 4.5
        static PFN_LoadedMethodDescIteratorConstructor_v45 s_pfnConstructor_v45;

        static PFN_Start s_pfnStart;
        static PFN_Next_v4 s_pfnNext_v4;
        static PFN_Next_v2 s_pfnNext_v2;
        static PFN_Current s_pfnCurrent;

    public:
        static void MatchAddress(PSYMBOL_INFOW pSymbolInfo)
        {
            LPVOID* pDest = NULL;
            if (wcscmp(L"LoadedMethodDescIterator::LoadedMethodDescIterator", pSymbolInfo->Name) == 0)
            {
                switch (g_tDotNetVersion)
                {
                case DotNetVersion_20:
                case DotNetVersion_40:
                    pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnConstructor);
                    break;
                case DotNetVersion_45:
                    pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnConstructor_v45);
                    break;
                default:
                    ATLASSERT(FALSE);
                    return;
                }
            }
            else if (wcscmp(L"LoadedMethodDescIterator::Next", pSymbolInfo->Name) == 0)
            {
                switch (g_tDotNetVersion)
                {
                case DotNetVersion_20:
                    pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnNext_v2);
                    break;
                case DotNetVersion_40:
                case DotNetVersion_45:
                    pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnNext_v4);
                    break;
                default:
                    ATLASSERT(FALSE);
                    return;
                }
            }
            else if (wcscmp(L"LoadedMethodDescIterator::Start", pSymbolInfo->Name) == 0)
                pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnStart);
            else if (wcscmp(L"LoadedMethodDescIterator::Current", pSymbolInfo->Name) == 0)
                pDest = (LPVOID*)&(LoadedMethodDescIterator::s_pfnCurrent);

            if (pDest)
                *pDest = (LPVOID)pSymbolInfo->Address;
        }
    };

Finally, we use the LoadedMethodDescIterator to call Reset() on every MethodDesc of a generic method.

    Module * pModule = pMethodDesc->GetLoaderModule();
    AppDomain * pAppDomain = pMethodDesc->GetDomain();
    if (pModule)
    {
        LoadedMethodDescIterator * pLoadedMethodDescIter = new LoadedMethodDescIterator(pAppDomain, pModule, md);
        while (pLoadedMethodDescIter->Next())
        {
            MethodDesc * pMD = pLoadedMethodDescIter->Current();
            if (pMD)
                pMD->Reset();
        }
        delete pLoadedMethodDescIter;
    }
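The reason a single Reset() is not enough for generic methods can be pictured with a small model: one method token owns several instantiations, and each of them has its own compiled state. The sketch below is portable C++ with hypothetical names (ToyInstantiation, InstantiationTable, ResetAllInstantiations); a std::multimap plays the role that the iterator over the CLR's tables plays in the real code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for an InstantiatedMethodDesc; not a CLR type.
struct ToyInstantiation {
    std::string typeArgs;       // for example "<string, int>"
    bool hasNativeCode = true;  // assume it has already been JIT-compiled
};

// One method token can own several instantiations.
using InstantiationTable = std::multimap<std::uint32_t, ToyInstantiation>;

// Resets every instantiation registered under the token; returns how many.
int ResetAllInstantiations(InstantiationTable& table, std::uint32_t methodToken) {
    int count = 0;
    auto range = table.equal_range(methodToken);
    for (auto it = range.first; it != range.second; ++it) {
        it->second.hasNativeCode = false;
        ++count;
    }
    return count;
}
```

Instantiations that belong to other tokens are not touched, just as the real iterator is restricted to one method ID.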

Compilation and optimization

I found that if a method is very small, only a few bytes of IL, the JIT compiles it inline. In that case MethodDesc::Reset() does not help, because at run time the call to the method never actually happens. A little more information can be found in CEEInfo::canInline (vm/jitinterface.cpp in SSCLI).

Dynamic methods

Be very careful when updating the IL code of dynamic methods. Invalid IL in other kinds of methods only causes an InvalidProgramException, but invalid IL in a dynamic method can crash the CLR and the whole process. The IL of a dynamic method also differs from that of other methods. The safest approach is to generate the IL from another dynamic method and then copy it over.

Source files

Source: https://habr.com/ru/post/154419/
