Note
The version of this chapter published on Habr is not updated and may already be somewhat outdated. For the most recent text, please refer to the original:
CLR Book: GitHub, table of contents
CLR Book: GitHub, chapter
Release 0.5.2 of the book, PDF: GitHub Release
`object` is the base type, and its layout forms the structure of all reference types:

    ----------------------------------------------
    |  SyncBlkIndx |    VMTPtr    |     Data     |
    ----------------------------------------------
    |    4 / 8     |    4 / 8     |    4 / 8     |
    ----------------------------------------------
    |  0xFFF..FFF  |  0xXXX..XXX  |      0       |
    ----------------------------------------------
                     ^
                     | Object references point here, i.e. not at the start of the object but at the VMT pointer

    Sum size = 12 (x86) .. 24 (x64)
The `VMTPtr` pointer is the most important one for the entire type system: inheritance, interface implementation, type conversion, and many other things all work through it. This pointer is a reference into the .NET CLR type system.

The fields below are taken from CoreCLR. In the .NET Framework the same structure exists, but the fields are laid out differently.
    // Low WORD is component size for array and string types (HasComponentSize() returns true).
    // Used for flags otherwise.
    DWORD m_dwFlags;

    // Base size of instance of this class when allocated on the heap
    DWORD m_BaseSize;

    WORD  m_wFlags2;

    // Class token if it fits into 16-bits. If this is (WORD)-1, the class token is stored in the TokenOverflow optional member.
    WORD  m_wToken;

    // <NICE> In the normal cases we shouldn't need a full word for each of these </NICE>
    WORD  m_wNumVirtuals;
    WORD  m_wNumInterfaces;
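    using System;
    using System.Runtime.InteropServices;

    class Program
    {
        public static unsafe void Main()
        {
            Union x = new Union();
            x.Reference.Value = "Hello!";

            // The first field of the holder stores a reference to the object,
            // and that reference points at the object's VMT pointer:
            // - (IntPtr*)x.Value.Value          - reinterpret the number as a pointer (changes the type for the compiler)
            // - *(IntPtr*)x.Value.Value         - read the data at the object's address, i.e. the VMT address
            // - (void *)*(IntPtr*)x.Value.Value - convert that address to a pointer
            void *vmt = (void *)*(IntPtr*)x.Value.Value;

            // Print the VMT address
            Console.WriteLine((ulong)vmt);
        }

        [StructLayout(LayoutKind.Explicit)]
        public class Union
        {
            public Union()
            {
                Value = new Holder<IntPtr>();
                Reference = new Holder<object>();
            }

            [FieldOffset(0)]
            public Holder<IntPtr> Value;

            [FieldOffset(0)]
            public Holder<object> Reference;
        }

        public class Holder<T>
        {
            public T Value;
        }
    }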
var vmt = typeof(string).TypeHandle.Value;
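As a small illustration of the line above (a sketch, not part of the original chapter): if the handle value identifies the type's method table, as the chapter uses it, then every instance of a type should report the same value and instances of different types should report different ones. The `TypeHandleDemo` class and its method names are hypothetical; only `GetType()` and `TypeHandle.Value` come from the runtime.

```csharp
using System;

public static class TypeHandleDemo
{
    // Returns the handle value the runtime reports for the object's exact type.
    public static IntPtr HandleOf(object obj) => obj.GetType().TypeHandle.Value;

    // True when both objects have the same exact type and therefore share
    // a single type handle (one VMT for all instances of the type).
    public static bool SameMethodTable(object a, object b) => HandleOf(a) == HandleOf(b);
}
```

With this helper, `TypeHandleDemo.SameMethodTable("a", "b")` holds for two strings, while comparing a string with a boxed `int` does not, and `HandleOf("a")` equals `typeof(string).TypeHandle.Value`.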
Why is sizeof available for value types but not for reference types?

In fact, the question is open, because nothing prevents calculating the size of a reference type. The only stumbling block is that two reference types, `Array` and `String`, do not have a fixed size, and neither does the `Generic` group, whose size depends entirely on the concrete type arguments. In other words, the `sizeof(..)` operator alone would not be enough: you have to work with specific instances. Still, nothing prevents adding a method like `static int System.Object.SizeOf(object obj)`, which would simply return what we need. So why didn't Microsoft implement it? One idea is that, in their view, .NET is not a platform where the developer should worry much about individual bytes: if memory runs short, you can simply add another memory module to the motherboard. Moreover, most of the data types we implement do not take up that much space. And those who really need the numbers will calculate all the sizes themselves. That last point, of course, is debatable.
    unsafe int SizeOf(Type type)
    {
        MethodTable *pvmt = (MethodTable *)type.TypeHandle.Value.ToPointer();
        return pvmt->Size;
    }

    [StructLayout(LayoutKind.Explicit)]
    public struct MethodTable
    {
        [FieldOffset(4)]
        public int Size;
    }

    class Sample
    {
        int x;
    }

    class GenericSample<T>
    {
        T fld;
    }

    // ...

    Console.WriteLine(SizeOf(typeof(Sample)));
| Type or its definition | Size | Comment |
|---|---|---|
| `Object` | 12 | SyncBlk + VMT + empty field |
| `Int16` | 12 | Boxed Int16: SyncBlk + VMT + data (aligned to 4 bytes on x86) |
| `Int32` | 12 | Boxed Int32: SyncBlk + VMT + data |
| `Int64` | 16 | Boxed Int64: SyncBlk + VMT + data |
| `Char` | 12 | Boxed Char: SyncBlk + VMT + data (aligned to 4 bytes on x86) |
| `Double` | 16 | Boxed Double: SyncBlk + VMT + data |
| `IEnumerable` | 0 | An interface has no size: you must take `obj.GetType()` |
| `List<T>` | 24 | It does not matter how many items the `List<T>` holds: it always takes the same space, because it stores its data in an array, which is not counted |
| `GenericSample<int>` | 12 | As you can see, generics are computed correctly. The size has not changed, because the data sits in the same place as in a boxed int. Total: SyncBlk + VMT + data = 12 bytes (x86) |
| `GenericSample<Int64>` | 16 | Similarly |
| `GenericSample<IEnumerable>` | 12 | Similarly |
| `GenericSample<DateTime>` | 16 | Similarly |
| `string` | 14 | This value is returned for any string, because the real size has to be computed dynamically. It is, however, suitable as the size of an empty string. Note that the size is not aligned to the word size: in essence, this field is not meant to be used for strings |
| `int[] {1}` | 24554 | For arrays this field holds entirely different data, and their size is not fixed, so it must be computed separately |
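The fixed-size rows of the table can be reproduced with a simplified model: a header of two words (SyncBlk and VMT pointer) followed by the field data padded to the word size, with at least one word of data even when there are no fields. This is only a sketch under that assumption; `LayoutModel` and its methods are hypothetical names, and the real layout is decided by the CLR.

```csharp
using System;

public static class LayoutModel
{
    // Rounds value up to the nearest multiple of alignment.
    public static int RoundUp(int value, int alignment) =>
        (value + alignment - 1) / alignment * alignment;

    // Simplified model (an assumption, not the CLR algorithm):
    // SyncBlk + VMT pointer, then the field data padded to the word size,
    // with at least one word reserved for data even for an empty object.
    public static int EstimateInstanceSize(int fieldBytes, int wordSize)
    {
        int header = 2 * wordSize;
        int data = Math.Max(wordSize, RoundUp(fieldBytes, wordSize));
        return header + data;
    }
}
```

For example, with a 4-byte word this gives 12 for an empty object or a boxed `Int16`, and 16 for a boxed `Int64`; with an 8-byte word an empty object comes out at 24, matching the 12 (x86) .. 24 (x64) range given for `object`.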
    // .NET Framework 4
    ----------------------------------------------------------
    | SyncBlkIndx |   VMTPtr   | Length | char | char | Term |
    ----------------------------------------------------------
    |    4 / 8    |   4 / 8    |   4    |  2   |  2   |  2   |
    ----------------------------------------------------------
    |     -1      | 0xXXXXXXXX |   2    |  a   |  b   | nil  |
    ----------------------------------------------------------

    Term - null terminator
    Sum size = (12 (24) + 2 + (Len*2)) -> rounded up to the word size (20 bytes in this example)

    // .NET Framework 3.5
    ------------------------------------------------------------------------
    | SyncBlkIndx |   VMTPtr   | ArrayLength | Length | char | char | Term |
    ------------------------------------------------------------------------
    |    4 / 8    |   4 / 8    |      4      |   4    |  2   |  2   |  2   |
    ------------------------------------------------------------------------
    |     -1      | 0xXXXXXXXX |      3      |   2    |  a   |  b   | nil  |
    ------------------------------------------------------------------------

    Term - null terminator
    Sum size = (16 (32) + 2 + (Len*2)) -> rounded up to the word size (24 bytes in this example)
    unsafe int SizeOf(object obj)
    {
        var majorNetVersion = Environment.Version.Major;
        var type = obj.GetType();
        var href = Union.GetRef(obj).ToInt64();
        var DWORD = sizeof(IntPtr);
        var baseSize = 3 * DWORD;

        if (type == typeof(string))
        {
            if (majorNetVersion >= 4)
            {
                var length = (int)*(int*)(href + DWORD /* skip vmt */);
                return DWORD * ((baseSize + 2 + 2 * length + (DWORD - 1)) / DWORD);
            }
            else
            {
                // on 1.0 -> 3.5 strings have an additional RealLength field
                var arrlength = *(int*)(href + DWORD /* skip vmt */);
                var length = *(int*)(href + DWORD /* skip vmt */ + 4 /* skip length */);
                return DWORD * ((baseSize + 2 + 2 * length + (DWORD - 1)) / DWORD);
            }
        }
        else if (type.BaseType == typeof(Array) || type == typeof(Array))
        {
            return ((ArrayInfo*)href)->SizeOf();
        }

        return SizeOf(type);
    }
    Action<string> stringWriter = (arg) =>
    {
        Console.WriteLine($"Length of `{arg}` string: {SizeOf(arg)}");
    };

    stringWriter("a");
    stringWriter("ab");
    stringWriter("abc");
    stringWriter("abcd");
    stringWriter("abcde");
    stringWriter("abcdef");

    -----

    Length of `a` string: 16
    Length of `ab` string: 20
    Length of `abc` string: 20
    Length of `abcd` string: 24
    Length of `abcde` string: 24
    Length of `abcdef` string: 28
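The printed sizes follow directly from the .NET Framework 4 formula: three words of header, a 2-byte terminator and 2 bytes per character, rounded up to the word size. The sketch below restates only that arithmetic without touching memory; `StringSizeModel` is a hypothetical name, and the word size is passed in explicitly so the x86 numbers can be checked on any machine.

```csharp
public static class StringSizeModel
{
    // Size of a string of the given length according to the chapter's formula
    // for .NET Framework 4: (3 words + 2-byte terminator + 2 bytes per char),
    // rounded up to a multiple of the word size.
    public static int SizeOfString(int length, int wordSize)
    {
        int raw = 3 * wordSize + 2 + 2 * length;
        return (raw + wordSize - 1) / wordSize * wordSize;
    }
}
```

With a 4-byte word, lengths 1 through 6 give 16, 20, 20, 24, 24 and 28, the same values as in the output above.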
    -----------------------------------------------------------------------------
    |  SBI  | VMTPtr | Total | Len_1 | Len_2 | .. | Len_N |  Term   | VMT_Child |
    -------------------opt-------------opt----------opt------opt--------opt------
    | 4 / 8 |  4 / 8 |   4   |   4   |   4   |    |   4   |    4    |   4 / 8   |
    -----------------------------------------------------------------------------
    |0xFF.FF|0xXX.XX |   ?   |   ?   |   ?   |    |   ?   | 0x00.00 | 0xXX..XX  |
    -----------------------------------------------------------------------------

    - opt: optional field
    - SBI: Sync Block Index
    - VMT_Child: present only if the array stores elements of a reference type
    - Total: present as an optimization; the total number of elements across all dimensions
    - Len_2..Len_N + Term: present only if the array has more than 1 dimension (controlled by bits in VMT->Flags)
    public int SizeOf()
    {
        var total = 0;
        int elementsize;

        fixed (void* entity = &MethodTable)
        {
            var arr = Union.GetObj<Array>((IntPtr)entity);
            var elementType = arr.GetType().GetElementType();

            if (elementType.IsValueType)
            {
                var typecode = Type.GetTypeCode(elementType);

                switch (typecode)
                {
                    case TypeCode.Byte:
                    case TypeCode.SByte:
                    case TypeCode.Boolean:
                        elementsize = 1;
                        break;
                    case TypeCode.Int16:
                    case TypeCode.UInt16:
                    case TypeCode.Char:
                        elementsize = 2;
                        break;
                    case TypeCode.Int32:
                    case TypeCode.UInt32:
                    case TypeCode.Single:
                        elementsize = 4;
                        break;
                    case TypeCode.Int64:
                    case TypeCode.UInt64:
                    case TypeCode.Double:
                        elementsize = 8;
                        break;
                    case TypeCode.Decimal:
                        elementsize = 16; // sizeof(decimal) is 16 bytes
                        break;
                    default:
                        // other structs: take the boxed size from the VMT and drop the header
                        var info = (MethodTable*)elementType.TypeHandle.Value;
                        elementsize = info->Size - 2 * sizeof(IntPtr); // sync blk + vmt ptr
                        break;
                }
            }
            else
            {
                elementsize = IntPtr.Size;
            }

            // Header
            total += 3 * sizeof(IntPtr); // sync blk + vmt ptr + total length
            total += elementType.IsValueType ? 0 : sizeof(IntPtr); // MethodsTable for refTypes
            total += IsMultidimentional ? Dimensions * sizeof(int) : 0;
        }

        // Contents
        total += (int)TotalLength * elementsize;

        // align size to IntPtr
        if ((total % sizeof(IntPtr)) != 0)
        {
            total += sizeof(IntPtr) - total % (sizeof(IntPtr));
        }

        return total;
    }
    Console.WriteLine($"size of int[]{{1,2}}: {SizeOf(new int[2])}");
    Console.WriteLine($"size of int[2,1]{{1,2}}: {SizeOf(new int[1,2])}");
    Console.WriteLine($"size of int[2,3,4,5]{{...}}: {SizeOf(new int[2, 3, 4, 5])}");

    ---

    size of int[]{1,2}: 20
    size of int[2,1]{1,2}: 32
    size of int[2,3,4,5]{...}: 512
    // Low WORD is component size for array and string types (HasComponentSize() returns true).
    // Used for flags otherwise.
    DWORD m_dwFlags;

    // Base size of instance of this class when allocated on the heap
    DWORD m_BaseSize;

    WORD  m_wFlags2;

    // Class token if it fits into 16-bits. If this is (WORD)-1, the class token is stored in the TokenOverflow optional member.
    WORD  m_wToken;

    // <NICE> In the normal cases we shouldn't need a full word for each of these </NICE>
    WORD  m_wNumVirtuals;
    WORD  m_wNumInterfaces;
    public class Sample
    {
        public int _x;

        public void ChangeTo(int newValue)
        {
            _x = newValue;
        }

        public virtual int GetValue()
        {
            return _x;
        }
    }

    public class OverridedSample : Sample
    {
        public override int GetValue()
        {
            return 666;
        }
    }
    public class Sample
    {
        public int _x;

        public static void ChangeTo(Sample self, int newValue)
        {
            self._x = newValue;
        }

        // ...
    }

    // struct
    public struct Sample
    {
        public int _x;

        public static void ChangeTo(ref Sample self, int newValue)
        {
            self._x = newValue;
        }

        // ...
    }
Let me explain up front why all the explanations of inheritance are built around examples with static methods: in fact, all methods are static, instance methods included. There is no separate copy of the compiled method for each class instance, since that would take a huge amount of memory. It is simpler to pass the same method, on every call, a reference to the struct or class instance it works with.
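One way to observe this from ordinary C# is an open instance delegate, where the instance becomes an explicit first parameter of a single shared method. The sketch below is an illustration only: `Counter` and `OpenDelegateDemo` are hypothetical names, while `Delegate.CreateDelegate(Type, MethodInfo)` is the standard reflection call, which can bind an instance method when the delegate type takes the instance as its first argument.

```csharp
using System;
using System.Reflection;

public class Counter
{
    public int Value;

    // An ordinary instance method: 'this' is passed implicitly.
    public void ChangeTo(int newValue) { Value = newValue; }
}

public static class OpenDelegateDemo
{
    // Binds Counter.ChangeTo as an open instance delegate: the target object
    // is not captured, so it has to be supplied as the first argument on
    // every call, just like 'self' in the static rewrite.
    public static Action<Counter, int> BindChangeTo()
    {
        MethodInfo method = typeof(Counter).GetMethod(nameof(Counter.ChangeTo));
        return (Action<Counter, int>)Delegate.CreateDelegate(
            typeof(Action<Counter, int>), method);
    }
}
```

A single delegate obtained from `BindChangeTo()` can then be applied to any number of `Counter` objects, for example `changeTo(first, 1); changeTo(second, 2);`, with the same compiled method serving both.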
    void Main()
    {
        var sample = new Sample();
        var overrided = new OverridedSample();

        Console.WriteLine(sample.Virtuals[Sample.GetValuePosition].DynamicInvoke(sample));
        Console.WriteLine(overrided.Virtuals[Sample.GetValuePosition].DynamicInvoke(sample));
    }

    public class Sample
    {
        public const int GetValuePosition = 0;

        public Delegate[] Virtuals;

        public int _x;

        public Sample()
        {
            Virtuals = new Delegate[1] {
                new Func<Sample, int>(GetValue)
            };
        }

        public static void ChangeTo(Sample self, int newValue)
        {
            self._x = newValue;
        }

        public static int GetValue(Sample self)
        {
            return self._x;
        }
    }

    public class OverridedSample : Sample
    {
        public OverridedSample() : base()
        {
            Virtuals[0] = new Func<Sample, int>(GetValue);
        }

        public static new int GetValue(Sample self)
        {
            return 666;
        }
    }
Link to the whole book
CLR Book: GitHub
Release 0.5.0 of the book, PDF: GitHub Release
Source: https://habr.com/ru/post/344556/