
Getting to know the internals of the .NET Framework: how the CLR creates objects.

We present to Habrahabr readers a translation of the article by Hanu Kommalapati and Tom Christian on the internal structure of .NET. An alternative translation is available on the Microsoft website.

The article deals with:


Technologies used: .NET Framework, C#

Content


  1. Domains created by the CLR loader
  2. System domain
  3. Shared domain
  4. Default domain
  5. Loader heaps
  6. Type basics
  7. Object instance
  8. Method table
  9. Base instance size
  10. Method slot table
  11. Method descriptor
  12. Interface vtable map and interface map
  13. Virtual dispatch
  14. Static variables
  15. EEClass
  16. Conclusion


The Common Language Runtime (CLR) is becoming (or has already become) the main infrastructure for building applications on Windows, so a deep understanding of its internals will help you build efficient, industrial-strength applications.

In this article we explore the internals of the CLR, including the object instance layout, the method table layout, method dispatch, interface-based dispatch, and various supporting data structures.

We will use very simple C# code fragments; any implicit reference to programming-language syntax means C#. Some of the data structures and algorithms discussed here will change in future versions of the Microsoft® .NET Framework, but the concepts should remain the same. We will use the Visual Studio® .NET 2003 debugger and the Son of Strike (SOS) debugger extension to examine the data structures discussed in this article. SOS understands the internal CLR data structures and lets you view and dump the information of interest. See the "Son of Strike" sidebar for how to load SOS.dll into the Visual Studio .NET 2003 debugger process.

Throughout the article we describe classes that have corresponding implementations in the Shared Source CLI (SSCLI).

The table in Figure 1 will help you find the relevant structures among the megabytes of SSCLI code.

Figure 1 SSCLI References

Component                    SSCLI path
AppDomain                    /sscli/clr/src/vm/appdomain.hpp
AppDomainStringLiteralMap    /sscli/clr/src/vm/stringliteralmap.h
BaseDomain                   /sscli/clr/src/vm/appdomain.hpp
ClassLoader                  /sscli/clr/src/vm/clsload.hpp
EEClass                      /sscli/clr/src/vm/class.h
FieldDescs                   /sscli/clr/src/vm/field.h
GCHeap                       /sscli/clr/src/vm/gc.h
GlobalStringLiteralMap       /sscli/clr/src/vm/stringliteralmap.h
HandleTable                  /sscli/clr/src/vm/handletable.h
InterfaceVTableMapMgr        /sscli/clr/src/vm/appdomain.hpp
Large Object Heap            /sscli/clr/src/vm/gc.h
LayoutKind                   /sscli/clr/src/bcl/system/runtime/interopservices/layoutkind.cs
LoaderHeaps                  /sscli/clr/src/inc/utilcode.h
MethodDescs                  /sscli/clr/src/vm/method.hpp
MethodTables                 /sscli/clr/src/vm/class.h
OBJECTREF                    /sscli/clr/src/vm/typehandle.h
SecurityContext              /sscli/clr/src/vm/security.h
SecurityDescriptor           /sscli/clr/src/vm/security.h
SharedDomain                 /sscli/clr/src/vm/appdomain.hpp
StructLayoutAttribute        /sscli/clr/src/bcl/system/runtime/interopservices/attributes.cs
SyncTableEntry               /sscli/clr/src/vm/syncblk.h
System namespace             /sscli/clr/src/bcl/system
SystemDomain                 /sscli/clr/src/vm/appdomain.hpp
TypeHandle                   /sscli/clr/src/vm/typehandle.h


One caveat before we go further: the information in this article is valid only for the .NET Framework 1.1 running on the x86 platform (it also largely applies to Shared Source CLI 1.0, with the most notable exceptions in some interop scenarios). These details changed in later versions of the .NET Framework, so do not build applications that depend on the exact form of these internal structures.

CLR-created domains


Before the CLR executes the first line of managed code, it creates three application domains. Two of them are opaque to managed code and are not even visible to CLR hosts. They can only be created by the CLR bootstrapping process, carried out by mscoree.dll and mscorwks.dll (or mscorsvr.dll on multiprocessor systems). As Figure 2 shows, these are the system domain and the shared domain, each of which exists as a single instance. The third is the default domain, the only one of the three that has a name. For a simple CLR host, such as a console application, the name of the default domain contains the name of the executable image. Additional domains can be created from managed code with the AppDomain.CreateDomain method, or from unmanaged hosting code through the ICorRuntimeHost interface.

Complex hosts, such as ASP.NET, create as many domains as there are applications running on the Web site they serve.


Figure 2. Domains created by the CLR loader
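
As a small illustration of the managed-code route, the sketch below reads the name of the current domain and creates one additional domain with AppDomain.CreateDomain. The class name DomainInfo and the domain name "Plugins" are made up for this example. Note two assumptions: IsDefaultAppDomain only appeared in .NET Framework 2.0, and CreateDomain is supported only on the .NET Framework (later .NET runtimes throw PlatformNotSupportedException).

```csharp
using System;

static class DomainInfo
{
    // Name of the domain this code runs in; for a console host it is
    // derived from the name of the executable.
    public static string CurrentName()
    {
        return AppDomain.CurrentDomain.FriendlyName;
    }

    // True when the code runs in the default domain created at CLR startup.
    public static bool InDefaultDomain()
    {
        return AppDomain.CurrentDomain.IsDefaultAppDomain();
    }

    // .NET Framework only: create an extra domain, read its name, unload it.
    // "Plugins" is an arbitrary name chosen for this sketch.
    public static string CreateAndUnload()
    {
        AppDomain extra = AppDomain.CreateDomain("Plugins");
        string name = extra.FriendlyName;
        AppDomain.Unload(extra);
        return name;
    }
}
```

Calling DomainInfo.CurrentName() from the Sample1.exe console application discussed later should return a name based on the executable, in line with the DumpDomain output shown in this article.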

System domain


The system domain creates and initializes the shared domain (SharedDomain) and the default domain (Default). It also loads the system library mscorlib.dll into the shared domain.

The system domain also holds the process-wide string literals, whether interned implicitly or explicitly.

String interning is an optimization that is somewhat heavy-handed in the .NET Framework 1.1, because the CLR gives assemblies no way to opt out of it. In return, memory holds only one instance of each string literal across all application domains.
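
The effect of interning can be observed from managed code by comparing references rather than contents. The following is a minimal sketch; the class InternDemo and its method names are invented for illustration, and the only library calls used are object.ReferenceEquals and string.Intern.

```csharp
using System;

static class InternDemo
{
    // Builds "MyString" at run time, so the result is a fresh heap object
    // rather than the interned literal.
    public static string BuildAtRuntime()
    {
        return new string(new[] { 'M', 'y', 'S', 't', 'r', 'i', 'n', 'g' });
    }

    // Two identical literals refer to one interned instance.
    public static bool LiteralsShareInstance()
    {
        string a = "MyString";
        string b = "MyString";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is equal in content but is a separate object.
    public static bool RuntimeStringIsSeparate()
    {
        return !object.ReferenceEquals("MyString", BuildAtRuntime());
    }

    // Explicit interning maps the run-time string back to the literal instance.
    public static bool InternReturnsLiteral()
    {
        return object.ReferenceEquals("MyString", string.Intern(BuildAtRuntime()));
    }
}
```

All three checks return true: literals are interned implicitly, while strings built at run time join the table only through an explicit string.Intern call.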

The system domain also generates process-wide interface IDs, which are used when creating the interface vtable maps (InterfaceVtableMaps) in each application domain (AppDomain).

Finally, the system domain keeps track of all the domains in the process and implements the functionality for loading and unloading application domains.

Shared domain (SharedDomain)


All domain-neutral code is loaded into the shared domain. Mscorlib, the system library, is needed by user code in every application domain (AppDomain), so it is loaded into the shared domain automatically. Fundamental types from the System namespace, such as Object, ValueType, Array, Enum, String, and Delegate, are preloaded into this domain during CLR bootstrapping. User code can also be loaded into this domain if the CLR host specifies LoaderOptimization settings when calling CorBindToRuntimeEx. A console application can load code into the shared domain by annotating its Main method with System.LoaderOptimizationAttribute.

The shared domain also manages an assembly map indexed by base address. The map acts as a lookup table for managing the shared dependencies of assemblies loaded into the default domain and into other application domains created in managed code. Non-shared user code is loaded into the default domain.
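
For a console application, the attribute route looks roughly like the sketch below. The class name SharedLoadingSample is invented for this example; LoaderOptimizationAttribute and the LoaderOptimization.MultiDomain value are part of the System namespace. Treat it as a hint to the .NET Framework loader only: as far as we know, later .NET runtimes accept the attribute but ignore it.

```csharp
using System;

static class SharedLoadingSample
{
    // Hint to the loader that several domains will use the same code,
    // so assemblies should be loaded domain-neutrally and shared.
    [LoaderOptimization(LoaderOptimization.MultiDomain)]
    static void Main()
    {
        Console.WriteLine(AppDomain.CurrentDomain.FriendlyName);
    }
}
```

The attribute has to sit on the entry point, because the loader reads it before any user code runs.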

Default Domain


The default domain is the AppDomain instance in which application code typically runs. Some applications need additional application domains at run time (for example, applications with a plug-in architecture or applications that generate a significant amount of code at run time), but most applications create a single domain during their lifetime. All code that executes in this domain is context-bound at the domain level. If an application has multiple application domains, any cross-domain access goes through .NET Remoting proxies. Additional intra-domain context boundaries can be created using types derived from System.ContextBoundObject.

Each application domain has its own SecurityDescriptor, SecurityContext, and DefaultContext, as well as its own loader heaps (High-Frequency Heap, Low-Frequency Heap, and Stub Heap), handle tables (Handle Table, Large Object Heap Handle Table), Interface Vtable Map Manager, and assembly cache.

Loader heaps


Loader heaps (LoaderHeaps) hold various CLR runtime artifacts and optimization artifacts that live for the lifetime of the domain. These heaps grow in predictable chunks to minimize fragmentation. Loader heaps differ from the garbage collector (GC) heap (or heaps, on symmetric multiprocessor systems) in that the GC heap hosts object instances, while the loader heaps hold the type system. Frequently accessed structures, such as method tables, method descriptors (MethodDescs), field descriptors (FieldDescs), and interface maps, are allocated on the HighFrequencyHeap. Less frequently accessed structures, such as EEClass and ClassLoader with their lookup tables, are allocated on the LowFrequencyHeap. The StubHeap hosts stubs that support code access security (CAS), COM wrapper calls, and P/Invoke.

Having looked at the domains and loader heaps at a high level, let's now examine their physical details in the context of the simple application in Figure 3. We stopped the program at "mc.Method1();" and dumped the domain information with the SOS debugger extension command DumpDomain. Here is the output:

    !DumpDomain
    System Domain: 793e9d58, LowFrequencyHeap: 793e9dbc,
    HighFrequencyHeap: 793e9e14, StubHeap: 793e9e6c,
    Assembly: 0015aa68 [mscorlib], ClassLoader: 0015ab40

    Shared Domain: 793eb278, LowFrequencyHeap: 793eb2dc,
    HighFrequencyHeap: 793eb334, StubHeap: 793eb38c,
    Assembly: 0015aa68 [mscorlib], ClassLoader: 0015ab40

    Domain 1: 149100, LowFrequencyHeap: 00149164,
    HighFrequencyHeap: 001491bc, StubHeap: 00149214,
    Name: Sample1.exe, Assembly: 00164938 [Sample1],
    ClassLoader: 00164a78

Figure 3 Sample1.exe
    using System;

    public interface MyInterface1
    {
        void Method1();
        void Method2();
    }

    public interface MyInterface2
    {
        void Method2();
        void Method3();
    }

    class MyClass : MyInterface1, MyInterface2
    {
        public static string str = "MyString";
        public static uint ui = 0xAAAAAAAA;

        public void Method1() { Console.WriteLine("Method1"); }
        public void Method2() { Console.WriteLine("Method2"); }
        public virtual void Method3() { Console.WriteLine("Method3"); }
    }

    class Program
    {
        static void Main()
        {
            MyClass mc = new MyClass();
            MyInterface1 mi1 = mc;
            MyInterface2 mi2 = mc;

            int i = MyClass.str.Length;
            uint j = MyClass.ui;

            mc.Method1();
            mi1.Method1();
            mi1.Method2();
            mi2.Method2();
            mi2.Method3();
            mc.Method3();
        }
    }


Our console application, Sample1.exe, is loaded into an AppDomain named "Sample1.exe". Mscorlib.dll is loaded into the shared domain (SharedDomain), but because it is the core system library it is also listed in the system domain (SystemDomain). A HighFrequencyHeap, a LowFrequencyHeap, and a StubHeap are allocated in each domain. The system domain and the shared domain use the same ClassLoader, while the default AppDomain uses its own.

The output does not show the reserved and committed sizes of the loader heaps. The HighFrequencyHeap initially reserves 32 KB and commits 4 KB.

The LowFrequencyHeap and the StubHeap each initially reserve 8 KB and commit 4 KB.

Also not shown is the InterfaceVtableMap heap (hereinafter IVMap). Each domain has an IVMap that is created on its own loader heap during domain initialization. The IVMap heap initially reserves 4 KB and commits 4 KB. We will discuss the significance of the IVMap when we explore type layout in the following sections.

Figure 2 shows the default process heap, the JIT code heap, the GC heap for small objects (SOH), and the large object heap (LOH, for objects of 85,000 bytes or more) to illustrate the semantic difference between them and the loader heaps. The just-in-time (JIT) compiler generates x86 instructions and stores them on the JIT code heap. The GC heap and the LOH are the garbage-collected heaps on which managed objects are instantiated.

Type Basics


The type is the fundamental unit of programming in .NET. In C#, a type can be declared with the class, struct, and interface keywords. Most types are created explicitly by the programmer; however, in special interop and remoting (.NET Remoting) scenarios, the CLR generates types implicitly. These generated types include COM and runtime callable wrappers and transparent proxies.

We will explore the .NET type fundamentals starting from a stack frame that contains references to an object (typically, the stack is one of the places where an object instance begins its life). The code in Figure 4 contains a simple program with a console entry point that calls a static method.

The Create method creates an instance of the SmallClass type, which contains a byte array used to demonstrate the creation of an object instance on the large object heap (LOH). The code is trivial, but it serves our discussion.

Figure 4 Large and small objects
    using System;

    class SmallClass
    {
        private byte[] _largeObj;

        public SmallClass(int size)
        {
            _largeObj = new byte[size];
            _largeObj[0] = 0xAA;
            _largeObj[1] = 0xBB;
            _largeObj[2] = 0xCC;
        }

        public byte[] LargeObj
        {
            get { return this._largeObj; }
        }
    }

    class SimpleProgram
    {
        static void Main(string[] args)
        {
            SmallClass smallObj = SimpleProgram.Create(84930, 10, 15, 20, 25);
            return;
        }

        static SmallClass Create(int size1, int size2, int size3,
                                 int size4, int size5)
        {
            int objSize = size1 + size2 + size3 + size4 + size5;
            SmallClass smallObj = new SmallClass(objSize);
            return smallObj;
        }
    }



Figure 5 shows a snapshot of a typical fastcall stack frame stopped at a breakpoint on the line "return smallObj;" in the Create method. (Fastcall is the .NET calling convention: arguments are passed to functions in registers where possible, the remaining arguments are passed on the stack from right to left, and the called function pops them from the stack.)
The value type local variable objSize is inlined in the stack frame. Reference type variables, such as smallObj, are stored on the stack with a fixed size (a 4-byte DWORD) and contain the address of an object instance allocated on the regular GC heap.

In traditional C++ this would be a pointer to an object; in the managed world it is a reference to an object. Either way, it contains the address of an object instance. We will use the term object instance (ObjectInstance) for the data structure located at the address held by the object reference.


Figure 5. SimpleProgram stack and heap

The smallObj instance on the regular GC heap contains a Byte[] reference, _largeObj, which points to an array of 85,000 bytes (note that the figure shows 85,016 bytes, which is the actual size of the allocation). The CLR treats objects of 85,000 bytes or more differently from smaller objects. Large objects are allocated on the large object heap (LOH), while smaller objects are created on the regular GC heap, which optimizes object allocation and garbage collection. The LOH is not compacted, whereas the regular GC heap is compacted whenever a garbage collection occurs. Moreover, the LOH is collected only during full garbage collections.
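
One way to see the threshold from managed code is to ask the GC which generation it reports for a freshly allocated array. This is only a sketch: the class LohDemo and its members are invented names, and it relies on the observation that the runtime reports LOH objects as belonging to the oldest generation, whereas a new small object starts in generation 0.

```csharp
using System;

static class LohDemo
{
    // The 85,000-byte threshold quoted in the article.
    public const int Threshold = 85000;

    public static byte[] Allocate(int size)
    {
        return new byte[size];
    }

    // LOH objects are reported as members of the oldest generation,
    // because they are only collected by a full garbage collection.
    public static bool IsReportedAsOldest(object o)
    {
        return GC.GetGeneration(o) == GC.MaxGeneration;
    }
}
```

With these helpers, LohDemo.IsReportedAsOldest(LohDemo.Allocate(LohDemo.Threshold)) is expected to be true, while the same check on a 100-byte array is expected to be false.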

The smallObj instance contains a type handle that points to the method table (MethodTable) of the corresponding type. There is one method table for each declared type, and all object instances of the same type point to the same method table. The method table also contains information about the kind of type (interface, abstract class, concrete class, COM wrapper, or proxy), the number of interfaces implemented, the interface map for method dispatch, the number of slots in the method table, and a table of slots that point to the implementations.

One important data structure that the method table points to is the EEClass. The CLR class loader creates the EEClass from metadata before the method table is laid out. In Figure 4, the SmallClass method table points to its EEClass. These structures point to their modules and assemblies. The method table and the EEClass are typically allocated on the domain-specific loader heaps. Byte[] is a special case: its method table and EEClass are allocated on the loader heaps of the shared domain. Loader heaps are domain-specific, and the data structures mentioned here, once loaded, do not go away until the domain is unloaded. Also, the default domain cannot be unloaded, so its code lives until the CLR is shut down.

Object instance


As we noted, all value type instances are either inlined on the thread stack or inlined on the GC heap. All reference types are created on the GC heap or on the large object heap (LOH). Figure 6 shows a typical object instance layout. An object can be referenced from a stack-based local variable, from handle tables in interop and P/Invoke scenarios, from registers (the this pointer and method arguments while a method executes), or from the finalizer queue for objects that have finalizer methods. The OBJECTREF does not point to the beginning of the object instance, but at a DWORD (4-byte) offset from the beginning.

That leading DWORD is called the object header and holds an index (a 1-based syncblk number) into the SyncTableEntry table. Because the link goes through an index, the CLR can move the table around in memory when it needs to grow. The SyncTableEntry maintains a weak reference back to the object so that the CLR can track SyncBlock ownership. Weak references let the garbage collector collect the object when no strong references remain. The SyncTableEntry also stores a pointer to a SyncBlock, which contains information that is useful but rarely needed by all instances of an object. This information includes the object's lock, its hash code, any thunking data, and its AppDomainIndex. For most object instances no storage is allocated for a SyncBlock, and the syncblk number is zero. This changes when the executing thread encounters a statement such as lock(obj) or obj.GetHashCode(), as shown here:

    SmallClass obj = new SmallClass(16);   // the size argument is arbitrary
    // Do some work here
    lock (obj) { /* Do some synchronized work here */ }
    obj.GetHashCode();


Figure 6. Object instance representation

In this code, obj initially has zero (no syncblk) as its number in the sync block entry table. The lock statement causes the CLR to create a syncblk entry and update the object header with the corresponding number. Because the C# lock keyword expands into a try-finally block that uses the Monitor class, a Monitor object is created in the SyncBlock for synchronization. Calling GetHashCode() populates the hash code field of the SyncBlock with the object's hash code.
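
The expansion of lock can be written out by hand. The sketch below shows the classic form emitted by C# compilers of that era (newer compilers use a Monitor.Enter overload with a lockTaken flag, but the shape is the same). The class LockExpansion, the Gate object, and the counter are invented for illustration.

```csharp
using System;
using System.Threading;

static class LockExpansion
{
    private static readonly object Gate = new object();
    private static int _counter;

    // What the programmer writes.
    public static int IncrementWithLock()
    {
        lock (Gate)
        {
            return ++_counter;
        }
    }

    // Roughly what the compiler generates for the method above.
    public static int IncrementExpanded()
    {
        Monitor.Enter(Gate);
        try
        {
            return ++_counter;
        }
        finally
        {
            Monitor.Exit(Gate);   // runs even if the body throws
        }
    }
}
```

Both methods take the same monitor on Gate, so they exclude each other exactly as two lock statements on the same object would.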

The SyncBlock contains other fields that are used in COM interop and for marshaling delegates to unmanaged code, but these are not relevant to typical object usage.

The type handle (TypeHandle) follows the syncblk number in the object instance. To keep the discussion continuous, we will discuss the type handle after covering instance variables. A variable list of instance fields follows the type handle. By default, the instance fields are packed so that memory is used efficiently and alignment padding is minimal. The code in Figure 7 shows a simple class, SimpleClass, with a set of instance variables of different sizes.

Figure 7 SimpleClass with Instance Variables
    class SimpleClass
    {
        private byte b1 = 1;               // 1 byte
        private byte b2 = 2;               // 1 byte
        private byte b3 = 3;               // 1 byte
        private byte b4 = 4;               // 1 byte
        private char c1 = 'A';             // 2 bytes
        private char c2 = 'B';             // 2 bytes
        private short s1 = 11;             // 2 bytes
        private short s2 = 12;             // 2 bytes
        private int i1 = 21;               // 4 bytes
        private long l1 = 31;              // 8 bytes
        private string str = "MyString";   // 4 bytes (only OBJECTREF)

        // Total instance variable size = 28 bytes

        static void Main()
        {
            SimpleClass simpleObj = new SimpleClass();
            return;
        }
    }


Figure 8 shows an example of a SimpleClass object instance displayed in the Visual Studio debugger memory window. We set a breakpoint on the return statement in Figure 7 and used the address of simpleObj, contained in the ECX register, to display the object instance in the memory window. The first 4-byte block is the syncblk number. Because we did not use the instance in any synchronizing code (and did not call its GetHashCode method), this field is 0. The object reference stored in the stack variable points to the 4 bytes located at offset 4. The byte variables b1, b2, b3, and b4 are packed side by side, as are the two short variables s1 and s2. The string variable str is a 4-byte OBJECTREF that points to the actual string instance located on the GC heap. String is a special type: all instances containing the same literal text point to the same instance in a global string table, which is set up during assembly loading. This process is called string interning and is designed to optimize memory usage. As noted earlier, in the .NET Framework 1.1 an assembly cannot opt out of interning; future versions of the CLR may provide that capability.


Figure 8. Debug window displaying an instance of an object in memory

Thus, the lexical order of member variables in the source code is not preserved in memory by default. In interop scenarios where the lexical order must be carried over into memory, you can use StructLayoutAttribute, which takes a LayoutKind enumeration value as its argument. LayoutKind.Sequential preserves the lexical order for marshaled data. In the .NET Framework 1.1 it does not affect the managed layout (in the .NET Framework 2.0 it does). In interop scenarios where you need extra padding and explicit control over the field order, LayoutKind.Explicit can be combined with the FieldOffset attribute at the field level.

Having looked at the raw memory contents, let's use the SOS debugger extension to view the object instance. One useful command is DumpHeap, which lists all the contents of the heap or all the instances of a particular type. Instead of relying on registers, we can use DumpHeap to get the address of the object we just created:

    !DumpHeap -type SimpleClass
    Loaded Son of Strike data table version 5 from
    "C:/WINDOWS/Microsoft.NET/Framework/v1.1.4322/mscorwks.dll"
     Address       MT     Size
    00a8197c 00955124       36
    Last good object: 00a819a0
    total 1 objects
    Statistics:
          MT    Count TotalSize Class Name
      955124        1        36 SimpleClass

The total size of the object is 36 bytes. No matter how long the string is, a SimpleClass instance contains only a DWORD OBJECTREF for it. The SimpleClass instance variables occupy only 28 bytes. The remaining 8 bytes are the type handle (4 bytes) and the syncblk number (4 bytes). Now that we have the address of the simpleObj instance, let's dump its contents with the DumpObj command, as shown here:

    !DumpObj 0x00a8197c
    Name: SimpleClass
    MethodTable 0x00955124
    EEClass 0x02ca33b0
    Size 36(0x24) bytes
    FieldDesc*: 00955064
          MT    Field   Offset           Type       Attr    Value Name
    00955124  400000a        4   System.Int64   instance       31 l1
    00955124  400000b        c          CLASS   instance 00a819a0 str
        << some fields omitted from the display for brevity >>
    00955124  4000003       1e    System.Byte   instance        3 b3
    00955124  4000004       1f    System.Byte   instance        4 b4

As noted, the default layout that the C# compiler generates for classes is LayoutKind.Auto (for structs, LayoutKind.Sequential is used); thus the class loader rearranges the instance fields to minimize padding. We can use ObjSize to get the size of the object graph, which includes the space occupied by the string instance str. Here is the output:

    !ObjSize 0x00a8197c
    sizeof(00a8197c) = 72 (0x48) bytes (SimpleClass)
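
The layout kinds mentioned above (Sequential and Explicit) can also be tried out from managed code through the marshaling view of a type. The following sketch uses two invented structs, SequentialSample and ExplicitSample, and reads their marshaled sizes and offsets with Marshal.SizeOf and Marshal.OffsetOf. The numbers in the comments assume the default packing on x86 or x64.

```csharp
using System;
using System.Runtime.InteropServices;

// Fields are marshaled in declaration order.
[StructLayout(LayoutKind.Sequential)]
struct SequentialSample
{
    public byte Tag;     // offset 0, followed by 3 bytes of padding
    public int Value;    // offset 4
}

// Every offset is stated explicitly; the two fields overlap like a C union.
[StructLayout(LayoutKind.Explicit)]
struct ExplicitSample
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public byte FirstByte;
}

static class LayoutDemo
{
    public static int SizeOfSequential()
    {
        return Marshal.SizeOf(typeof(SequentialSample));   // 8 with default packing
    }

    public static int OffsetOfValue()
    {
        return Marshal.OffsetOf(typeof(SequentialSample), "Value").ToInt32();
    }

    public static int SizeOfExplicit()
    {
        return Marshal.SizeOf(typeof(ExplicitSample));     // 4: the fields share storage
    }
}
```

The explicit layout trades safety for control: because AsInt and FirstByte share the same bytes, writing one changes what the other reads.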

Son of Strike
The SOS debugger extension is used to display the contents of CLR data structures in this article. It is part of the .NET Framework installation and is located at %windir%\Microsoft.NET\Framework\v1.1.4322. Before you load SOS into the process, enable managed debugging in the project properties in Visual Studio .NET. Add the directory containing SOS.dll to the PATH environment variable. To load SOS while stopped at a breakpoint, open Debug | Windows | Immediate. In the Immediate window, run .load sos.dll. Use !help to get a list of debugger commands. For more information on SOS, see the MSDN Bugslayer column.

If you subtract the size of the SimpleClass instance (36 bytes) from the total size of the object graph (72 bytes), you get the size of str, which is 36 bytes. Let's verify this by dumping the str instance. Here is the output:

    !DumpObj 0x00a819a0
    Name: System.String
    MethodTable 0x009742d8
    EEClass 0x02c4c6c4
    Size 36(0x24) bytes

If you add the size of the string instance str (36 bytes) to the size of the SimpleClass instance (36 bytes), you get a total of 72 bytes, which matches the output of the ObjSize command. Note that ObjSize does not include the memory taken up by the syncblk infrastructure. Also, in the .NET Framework 1.1 the CLR is not aware of the memory taken up by unmanaged resources such as GDI objects, COM objects, and file handles; therefore, they are not reflected by this command.

The type handle (TypeHandle), a pointer to the method table (MethodTable), is located right after the syncblk number. Before creating an object instance, the CLR looks up the loaded types, loads the type if it is not found, obtains the method table address, creates the object instance, and writes that address into the TypeHandle of the instance. JIT-compiled code uses the TypeHandle to locate the MethodTable for method dispatch. The CLR also uses the TypeHandle whenever it has to get from an instance to its loaded type through the MethodTable.

Method table (MethodTable)


Each class and interface, when loaded into an application domain, is represented in memory by a MethodTable data structure. It is the result of the class-loading activity that takes place before the very first instance of the object is created. While the object instance (ObjectInstance) holds the state, the MethodTable holds the behavior information. Through the EEClass, the MethodTable binds the object instance to the in-memory metadata structures generated by the language compiler. The information in the MethodTable and the data structures hanging off it can be accessed from managed code through System.Type. A pointer to the MethodTable can even be obtained in managed code through the Type.TypeHandle property, which returns a RuntimeTypeHandle. The type handle contained in the object instance points at an offset from the beginning of the MethodTable. This offset is 12 bytes by default, and the bytes before it hold GC information that we will not discuss here.
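
The claim that all instances of a type share one method table can be checked from managed code by comparing type handles. A minimal sketch follows; TypeHandleDemo and its methods are invented names, while Type.GetTypeHandle and RuntimeTypeHandle.Value are the framework members used. The value should be treated as an opaque pointer, since its exact meaning is an internal detail of the runtime version in use.

```csharp
using System;

static class TypeHandleDemo
{
    // Pointer-sized value identifying the runtime type of the object.
    public static IntPtr HandleOf(object o)
    {
        return Type.GetTypeHandle(o).Value;
    }

    // Two instances have the same handle exactly when they have the same type.
    public static bool SameType(object a, object b)
    {
        return HandleOf(a) == HandleOf(b);
    }
}
```

For example, TypeHandleDemo.SameType("a", "b") is true because both strings refer to the same type data, while TypeHandleDemo.SameType("a", 1) is false.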

Figure 9 shows the typical layout of a method table. We will describe some of the important fields of the type handle, but refer to the figure for a more complete list. Let's start with the base instance size, since it correlates directly with the run-time memory profile.


Figure 9 Representation of the method table

Base instance size


The base instance size is the size of the object as computed by the class loader from the field declarations in the code. As discussed earlier, the current GC implementation requires an object instance to be at least 12 bytes. If a class has no instance fields declared, it carries 4 bytes of overhead.

The remaining 8 bytes are taken up by the object header (which may contain a syncblk number) and the type handle (TypeHandle). Again, the size of the object can be influenced by StructLayoutAttribute.
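
The arithmetic behind the sizes reported in this article can be written down as a tiny helper. This is a rough model only: it assumes the 4-byte object header, the 4-byte type handle, and the 12-byte minimum described for the .NET Framework 1.1 on x86, and it ignores alignment padding. The class InstanceSizeModel is an invented name.

```csharp
static class InstanceSizeModel
{
    const int HeaderBytes = 4;       // object header (syncblk number)
    const int TypeHandleBytes = 4;   // pointer to the method table
    const int MinimumBytes = 12;     // smallest instance the GC accepts

    // Size of an instance whose declared fields occupy fieldBytes bytes.
    public static int BaseInstanceSize(int fieldBytes)
    {
        int size = HeaderBytes + TypeHandleBytes + fieldBytes;
        return size < MinimumBytes ? MinimumBytes : size;
    }
}
```

With 28 bytes of fields (SimpleClass in Figure 7) the model gives 36 bytes, and with no instance fields at all it gives the 12-byte minimum, which matches the Size column of the DumpHeap listings in this article.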

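Compare a memory snapshot (taken in the Visual Studio .NET 2003 memory window) of the method table for MyClass from Figure 3 (MyClass with two interfaces) with the output generated by SOS. In Figure 9, the object size is located at a 4-byte offset and its value is 12 (0x0000000C) bytes. The following is the DumpHeap output from SOS: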

    !DumpHeap -type MyClass
     Address       MT     Size
    00a819ac 009552a0       12
    total 1 objects
    Statistics:
          MT    Count TotalSize Class Name
      9552a0        1        12 MyClass


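Method slot table


Embedded within the method table is a table of slots that point to the respective method descriptors (MethodDesc), which provide the behavior of the type. The method slot table is built from a linearized list of implementation methods laid out in the following order: inherited virtual methods, introduced virtual methods, instance methods, and static methods. The class loader walks through the metadata of the current class, its parent class, and its interfaces, and creates the method table. During layout it replaces overridden virtual methods, replaces parent class methods that are being hidden, creates new slots, and duplicates slots as necessary. Slots are duplicated to create the illusion that each interface has its own mini vtable; the duplicated slots nevertheless point to the same physical implementation. MyClass has three instance methods, a class constructor (.cctor), and an object constructor (.ctor). The object constructor is generated automatically by the C# compiler for every class that defines no constructor explicitly. The class constructor is generated by the compiler because we have a static variable that is defined and initialized. Figure 10 shows the layout of the method table for MyClass. The layout shows 10 methods because the Method2 slot is duplicated for the IVMap, which is covered next. Figure 11 shows the SOS dump of the method table for MyClass.


Figure 10. MyClass method table layout

Figure 11. SOS dump of the MyClass method table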
    !DumpMT -MD 0x9552a0
       Entry  MethodDesc  Return Type  Name
    0097203b  00972040    String       System.Object.ToString()
    009720fb  00972100    Boolean      System.Object.Equals(Object)
    00972113  00972118    I4           System.Object.GetHashCode()
    0097207b  00972080    Void         System.Object.Finalize()
    00955253  00955258    Void         MyClass.Method1()
    00955263  00955268    Void         MyClass.Method2()
    00955263  00955268    Void         MyClass.Method2()
    00955273  00955278    Void         MyClass.Method3()
    00955283  00955288    Void         MyClass..cctor()
    00955293  00955298    Void         MyClass..ctor()


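The first four methods of any type are always ToString, Equals, GetHashCode, and Finalize. These are virtual methods inherited from System.Object. The Method2 slot is duplicated, but both slots point to the same method descriptor. The .cctor and .ctor are grouped with the static methods and the instance methods, respectively.

Method descriptor


The method descriptor (MethodDesc) encapsulates a method implementation as the CLR knows it. There are several kinds of method descriptors that support calls to various interop implementations in addition to managed implementations. In this article we only look at the managed MethodDesc in the context of the code in Figure 3. A MethodDesc is generated as part of class loading and initially points to Intermediate Language (IL). Each MethodDesc is padded with a PreJitStub, which is responsible for triggering JIT compilation. Figure 12 shows a typical layout. The method table slot entry actually points to the stub rather than to the MethodDesc data structure itself. The stub sits at a negative offset of 5 bytes from the MethodDesc and is part of the 8-byte padding that every method has. The 5 bytes contain the instruction that calls the PreJitStub routine. This 5-byte offset can be seen in the DumpMT output of SOS (for MyClass in Figure 11), where each MethodDesc is always 5 bytes past the location pointed to by the method slot table entry. On the first invocation, a call to the JIT routine is made. After compilation completes, the 5 bytes containing the call instruction are overwritten with an unconditional jump to the JIT-compiled x86 code.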


Figure 12 The method descriptor

Disassembling the code pointed to by the entry in the method slot table in Figure 12 will show the call to PreJitStub. Here is the abbreviated disassembly output before JIT compilation for Method2 method:

    !u 0x00955263
    Unmanaged code
    00955263 call 003C3538          ;call to the PreJitStub
    00955268 add  eax,68040000h     ;ignore this and the rest
                                    ;as !u thinks it is code

Now let's run the method and disassemble the same address:

    !u 0x00955263
    Unmanaged code
    00955263 jmp  02C633E8          ;jump to the jitted Method2()
    00955268 add  eax,0E8040000h    ;ignore this and the rest
                                    ;as !u thinks it is code

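Only the first 5 bytes at this address are code; the rest is data belonging to the MethodDesc of Method2. The "!u" command is not aware of this and produces gibberish, so you can ignore everything after the first 5 bytes.

Before JIT compilation, the CodeOrIL field contains the relative virtual address (RVA) of the method implementation in IL. The field is flagged to indicate that it holds IL. After on-demand compilation, the CLR updates this field with the address of the JIT-compiled code. Let's pick one of the listed methods and dump its MethodDesc with the DumpMD command before and after JIT compilation: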

    !DumpMD 0x00955268
    Method Name : [DEFAULT] [hasThis] Void MyClass.Method2()
    MethodTable 9552a0
    Module: 164008
    mdToken: 06000006
    Flags : 400
    IL RVA : 00002068

After compilation, the MethodDesc looks like this:

    !DumpMD 0x00955268
    Method Name : [DEFAULT] [hasThis] Void MyClass.Method2()
    MethodTable 9552a0
    Module: 164008
    mdToken: 06000006
    Flags : 400
    Method VA : 02c633e8

The flag field in the method descriptor is encoded to store information about the type of method, such as static, instance, interface method, or COM implementation.

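Let's look at another complicated aspect of the method table: interface implementation. The managed environment makes it look simple by absorbing all the complexity into the layout process. Next, we consider how interfaces are laid out and how interface-based method dispatch actually works.

Interface vtable map and interface map


At offset 12 in the method table is an important pointer, the IVMap. As shown in Figure 9, the IVMap points to an AppDomain-level mapping table that is indexed by a process-level interface ID. The interface ID is generated when the interface type is first loaded. Each interface implementation has an entry in the IVMap. If MyInterface1 is implemented by two classes, there will be two entries in the IVMap table. Each entry points back to the beginning of the sub-table embedded within the method table (MethodTable) of MyClass, as shown in Figure 9. This is the reference through which interface-based method dispatch occurs. The IVMap is created from the interface map information embedded within the method table. The interface map is created from the class metadata during method table layout. Once type loading is complete, only the IVMap is used for method dispatch. The interface map at offset 28 points to the InterfaceInfo entries embedded within the method table. In this case there are two entries, one for each of the two interfaces implemented by MyClass. The first 4 bytes of the first InterfaceInfo entry point to the type handle (TypeHandle) of MyInterface1 (see Figure 9 and Figure 10). The next WORD (2 bytes) holds the flags (0 means inherited from the parent, 1 means implemented in the current class). The WORD right after the flags is the start slot, which the class loader uses to lay out the interface implementation sub-table. For MyInterface1 the value is 4, which means that slots 5 and 6 point to the implementation. For MyInterface2 the value is 6, so slots 7 and 8 point to the implementation. The class loader duplicates slots where necessary to create the illusion that each interface gets its own implementation, while physically mapping them to the same method descriptor. In MyClass, MyInterface1.Method2 and MyInterface2.Method2 point to the same implementation.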

Interface-based method dispatch goes through the IVMap, while direct method dispatch goes through the MethodDesc address stored in the respective slot. As noted earlier, the .NET Framework uses the fastcall calling convention. The first two arguments are passed in the ECX and EDX registers when possible. The first argument of an instance method is always the "this" pointer, which is passed in the ECX register, as shown by the "mov ecx,esi" instruction:

    mi1.Method1();
    mov  ecx,edi                    ;move "this" pointer into ecx
    mov  eax,dword ptr [ecx]        ;move "TypeHandle" into eax
    mov  eax,dword ptr [eax+0Ch]    ;move IVMap address into eax at offset 12
    mov  eax,dword ptr [eax+30h]    ;move the ifc impl start slot into eax
    call dword ptr [eax]            ;call Method1

    mc.Method1();
    mov  ecx,esi                    ;move "this" pointer into ecx
    cmp  dword ptr [ecx],ecx        ;compare and set flags
    call dword ptr ds:[009552D8h]   ;directly call Method1

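These disassemblies show that a direct call to an instance method of MyClass does not use an offset. The JIT compiler writes the address of the MethodDesc directly into the code. Interface-based dispatch goes through the IVMap and needs a few more instructions than direct dispatch: one to fetch the address of the IVMap, and another to fetch the start slot of the interface implementation within the method slot table. Also, casting an object instance to an interface is a trivial matter of copying the this pointer into the target variable. In Figure 3, the statement "mi1 = mc;" uses a single instruction to copy the OBJECTREF in mc to mi1.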
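
To make the start-slot idea concrete, here is a toy C# model of interface dispatch through a slot table. The slot order is copied from the DumpMT output in Figure 11. The class name SlotTableModel, the dictionary that stands in for the interface map, and the 0-based start slots 4 and 6 (read off that dump) are assumptions of this sketch, not actual CLR data structures.

```csharp
using System;
using System.Collections.Generic;

static class SlotTableModel
{
    // Slot order as listed by DumpMT for MyClass (0-based index).
    public static readonly string[] Slots =
    {
        "Object.ToString", "Object.Equals", "Object.GetHashCode", "Object.Finalize",
        "MyClass.Method1", "MyClass.Method2", "MyClass.Method2", "MyClass.Method3",
        "MyClass..cctor", "MyClass..ctor"
    };

    // Hypothetical interface map: interface name -> start slot in Slots.
    public static readonly Dictionary<string, int> StartSlot =
        new Dictionary<string, int>
        {
            { "MyInterface1", 4 },
            { "MyInterface2", 6 }
        };

    // Resolve the n-th method of an interface to the slot that implements it.
    public static string Resolve(string interfaceName, int methodIndex)
    {
        return Slots[StartSlot[interfaceName] + methodIndex];
    }
}
```

In this model, Resolve("MyInterface1", 1) and Resolve("MyInterface2", 0) both land on a "MyClass.Method2" slot, which is why the Method2 entry appears twice in the dump.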


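Virtual dispatch


Let's now look at virtual dispatch and compare it with direct and interface-based dispatch. Here is the disassembly of the virtual method call to MyClass.Method3 from Figure 3: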

    mc.Method3();
    mov  ecx,esi                ;move "this" pointer into ecx
    mov  eax,dword ptr [ecx]    ;acquire the MethodTable address
    call dword ptr [eax+44h]    ;dispatch to the method at offset 0x44

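Virtual dispatch always goes through a fixed slot number, regardless of which method table in a given class (type) hierarchy the pointer refers to. During method table layout, the class loader replaces the parent implementation with the overriding child implementation. As a result, method calls coded against the parent type are dispatched to the implementation of the child object. The disassembly shows that the dispatch goes through slot number 8, both in the debugger memory window (as seen in Figure 10) and in the DumpMT output.

Static variables


Static variables are an important part of the method table data structure. They are allocated as part of the method table, right after the method slot array. All primitive static types are inlined, while static value objects such as structs, as well as reference types, are referred to through OBJECTREFs created in the handle tables. The OBJECTREF in the method table refers to an OBJECTREF in the AppDomain handle table, which in turn refers to the object instance created on the heap. Once created, the OBJECTREF in the handle table keeps the object instance on the heap alive until the AppDomain is unloaded. In Figure 9, the static string variable str points to an OBJECTREF in the handle table, which points to MyString on the GC heap.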

EEClass


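The EEClass comes to life before the method table is created and, combined with the method table, is the CLR version of a type declaration. In fact, the EEClass and the method table are logically one data structure (together they represent a single type) and were split according to frequency of use. Fields that are used often live in the method table, while fields that are used less often live in the EEClass. Thus the information needed to JIT-compile functions (such as names, fields, and offsets) ends up in the EEClass, whereas the information needed at run time (such as vtable slots and GC information) is in the method table.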

One EEClass is created for each type loaded into an AppDomain. This includes interfaces, classes, abstract classes, arrays, and structs. Each EEClass is a node of a tree tracked by the execution engine. The CLR uses this network to navigate the EEClass structures for purposes such as class loading, method table layout, type verification, and type casting. The child-to-parent relationship between EEClasses is established from the inheritance hierarchy, whereas the parent-to-child relationships are established from the combination of the inheritance hierarchy and the class loading sequence. New EEClass nodes are added, node relationships are patched, and new relationships are established as managed code executes. There are also horizontal links between sibling EEClasses in the network. EEClass has three fields for managing the node relationships between loaded types: ParentClass, SiblingChain, and ChildrenChain. See Figure 13 for a schematic view of the EEClass in the context of MyClass from Figure 3.

Figure 13 shows only a few of the fields relevant to this discussion. Because we omitted some fields from the layout, we did not show offsets in this figure. EEClass has a circular reference to the method table. EEClass also points to the method descriptor chunks allocated on the HighFrequencyHeap of the default AppDomain. A reference to a list of FieldDesc objects allocated on the process heap provides field layout information during method table construction. EEClass is allocated on the LowFrequencyHeap of the AppDomain so that the operating system can manage memory pages more effectively, which reduces the working set.


Figure 13 EEClass view
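
The three link fields can be pictured with a small managed model. This is purely illustrative: the real EEClass is a native structure, and the class EEClassNode, its AddChild method, and the choice to link new children at the head of the chain are assumptions made for this sketch. Only the field names ParentClass, SiblingChain, and ChildrenChain come from the article.

```csharp
using System.Collections.Generic;

// Hypothetical managed model of the EEClass link fields.
sealed class EEClassNode
{
    public readonly string Name;
    public EEClassNode ParentClass;     // child-to-parent link (inheritance)
    public EEClassNode SiblingChain;    // next type loaded under the same parent
    public EEClassNode ChildrenChain;   // head of the list of loaded children

    public EEClassNode(string name) { Name = name; }

    // Called when a derived type is loaded: link it in front of existing children.
    public void AddChild(EEClassNode child)
    {
        child.ParentClass = this;
        child.SiblingChain = ChildrenChain;
        ChildrenChain = child;
    }

    // Walk the children chain and collect names in chain order.
    public List<string> ChildNames()
    {
        var names = new List<string>();
        for (EEClassNode n = ChildrenChain; n != null; n = n.SiblingChain)
            names.Add(n.Name);
        return names;
    }
}
```

Loading Program and then MyClass under a System.Object node leaves MyClass with ParentClass set to System.Object and SiblingChain set to Program, which mirrors how parent links follow inheritance while sibling links follow load order.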

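To make this concrete, let's look at the physical memory behind Figure 13 for MyClass (from Figure 3) by dumping the EEClass with SOS. Run the program from Figure 3 after setting a breakpoint on the line with mc.Method1. First obtain the address of the EEClass for MyClass using the Name2EE command: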

    !Name2EE C:/Working/test/ClrInternals/Sample1.exe MyClass
    MethodTable: 009552a0
    EEClass: 02ca3508
    Name: MyClass

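The first argument to Name2EE is the module name, which can be obtained from the DumpDomain command. Now that we have the address of the EEClass, let's dump the EEClass itself: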

    !DumpClass 02ca3508
    Class Name : MyClass, mdToken : 02000004, Parent Class : 02c4c3e4
    ClassLoader : 00163ad8, Method Table : 009552a0, Vtable Slots : 8
    Total Method Slots : a, NumInstanceFields: 0,
    NumStaticFields: 2, FieldDesc*: 00955224
          MT    Field   Offset           Type     Attr    Value Name
    009552a0  4000001       2c          CLASS   static 00a8198c str
    009552a0  4000002       30  System.UInt32   static aaaaaaaa ui

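Figure 13 and the DumpClass output look essentially the same. The metadata token (mdToken) is the index of MyClass in the memory-mapped metadata tables of the module's PE file, and the parent class points to System.Object. The sibling chain (Figure 13) shows that MyClass was loaded as a result of loading the Program class.

MyClass has eight vtable slots (methods that can be dispatched virtually). Even though Method1 and Method2 are not virtual, they are treated as virtual methods when dispatched through interfaces and are added to the list. Add .cctor and .ctor to the list and you get 10 (0xA) methods in total. The class has two static fields, which are listed at the end. MyClass has no instance fields. The remaining fields are self-explanatory.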

Conclusion


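That concludes our tour of some of the most important internals of the CLR. Obviously there is much more to cover, and in more depth, but we hope this gives you a glimpse of how things work. Much of the information presented here will likely change in subsequent releases of the CLR and the .NET Framework. But although the CLR data structures covered in this article may change, the concepts should remain the same.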

Source: https://habr.com/ru/post/263935/

