Almost everyone who has worked with C# is surely familiar with a construct like this:
int[] ints = new int[3] { 1,2,3 };
It would be quite logical to expect the compiler to turn this construct into something like:
int[] ints = new int[3]; ints[0] = 1; ints[1] = 2; ints[2] = 3;
Alas, things are trickier than they seem at first glance, and there are some subtleties that will be pointed out later. Until then, put on your worn "IL freak" T-shirt (if you have one) and dive into the depths of the implementation.
In the end, the compiler turns the first construct into this squiggle:

What on earth is this? What is this <PrivateImplementationDetails> gibberish? Before I explain what is what, let's take a look at Q::Main, where I have annotated each instruction with the contents of the evaluation stack after it executes:
.method private hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size 20 (0x14)
  .maxstack 3
  .locals init (int32[] V_0)
  IL_0000: nop // {}
  IL_0001: ldc.i4.3 // {3}
  IL_0002: newarr [mscorlib]System.Int32 // {&int[3]}
  IL_0007: dup // {&int[3], &int[3]}
  IL_0008: ldtoken field valuetype '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'/'__StaticArrayInitTypeSize=12' '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'::'$$method0x6000001-1' // {&int[3], &int[3], #'$$method0x6000001-1'}
  IL_000d: call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle) // {&int[3]}
  IL_0012: stloc.0 // {}
  IL_0013: ret
} // end of method Q::Main
Now let's go through it line by line.
IL_0001 and IL_0002 create a new array of System.Int32 with length 3.
At IL_0007 we come across the first surprise: the array reference is duplicated. Why? Take it on faith for now that the array is initialized at IL_0008 and IL_000d (we will return to this very soon). Now look at IL_0012, where the value at the top of the stack, which is again the array, is stored into the local variable with index 0, that is, ints. What if we assigned to ints right at IL_0007 instead? This is what would happen:
ldc.i4.3
newarr [mscorlib]System.Int32
stloc.0 // changed: ints is assigned before the array is filled
ldloc.0 // changed: the reference is loaded back for InitializeArray
ldtoken field valuetype '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'/'__StaticArrayInitTypeSize=12' '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'::'$$method0x6000001-1'
call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
The assignment would no longer be atomic: an outside observer could see the array in an uninitialized state, with its elements not yet filled in. The filling itself is what lines IL_0008 and IL_000d do. So the code given at the very beginning is not equivalent to:
int[] ints = new int[3]; ints[0] = 1; ints[1] = 2; ints[2] = 3;
But rather is something like this:
int[] t = new int[3]; t[0] = 1; t[1] = 2; t[2] = 3; int[] ints = t;
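The difference can be observed from plain C#. In the minimal sketch below (the class and method names are made up for illustration), the initializer reads the old contents of ints while the new array is being built. This only works because the variable is assigned after all elements have been stored; with the naive expansion the result would be all zeros. The elements here are not constants, so the compiler is expected to store them one by one rather than use a data blob, but the assign-last order is the same.

```csharp
using System;

static class InitializerOrderDemo
{
    // Reverses a three-element array by re-initializing the same variable.
    public static int[] Reverse3(int[] ints)
    {
        // The elements on the right are read from the OLD array: the new
        // array is built completely first and only then assigned to 'ints'.
        ints = new int[3] { ints[2], ints[1], ints[0] };
        return ints;
    }

    static void Main()
    {
        int[] result = Reverse3(new int[3] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", result)); // prints "3, 2, 1"
    }
}
```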
Note, though, that the actual implementation avoids creating a second local variable: it duplicates the reference on the stack instead. This brings us to the two cryptic lines of code:
IL_0008: ldtoken field valuetype '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'/'__StaticArrayInitTypeSize=12' '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'::'$$method0x6000001-1'
IL_000d: call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
But there is nothing difficult or scary about them. In essence, we are looking at a call to RuntimeHelpers.InitializeArray, which fills the array referenced at the top of the stack after IL_0007 with the data of the field whose token is pushed onto the stack at IL_0008. The token refers to the field shown in the picture below:

In fact, the highlighted line is a static field in a private, obviously compiler-generated class with a deliberately unpronounceable name. A couple of points deserve attention. First, this class has a nested type called __StaticArrayInitTypeSize=12. It stands for the array data, whose actual size is 12 bytes (3 System.Int32 elements of 4 bytes each, 12 in total). Second, note that the type inherits from System.ValueType (I seriously hope that readers know what happens to value type instances once they are created on the stack, so let's not dwell on it - author's note). But how does the type get those 12 bytes? Obviously, just putting the size into the name is not enough for the CLR to allocate the necessary amount of memory, so if you look at the implementation in ILDASM you will see this:
.class private auto ansi '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}' extends [mscorlib]System.Object
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
  .class explicit ansi sealed nested private '__StaticArrayInitTypeSize=12' extends [mscorlib]System.ValueType
  {
    .pack 1
    .size 12
  } // end of class '__StaticArrayInitTypeSize=12'
  .field static assembly valuetype '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'/'__StaticArrayInitTypeSize=12' '$$method0x6000001-1' at I_00002050
} // end of class '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'
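For readers more at home in C# than in IL, a roughly comparable type can be declared by hand. This is only a sketch: Blob12 and BlobSizeDemo are hypothetical names, and it relies on the assumption that StructLayoutAttribute with LayoutKind.Explicit, Size and Pack is what the C# compiler translates into the explicit, .size and .pack pieces of the listing above. Marshal.SizeOf should then report the declared 12 bytes even though the struct has no fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical hand-written counterpart of '__StaticArrayInitTypeSize=12':
// no fields at all, only an explicit layout with a fixed size and packing.
[StructLayout(LayoutKind.Explicit, Size = 12, Pack = 1)]
struct Blob12
{
}

static class BlobSizeDemo
{
    // Returns the unmanaged size of Blob12 in bytes.
    public static int SizeOfBlob()
    {
        return Marshal.SizeOf(typeof(Blob12));
    }

    static void Main()
    {
        Console.WriteLine(SizeOfBlob());
    }
}
```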
The .size directive tells both us and the CLR that a 12-byte block of memory must be allocated whenever an instance of this type is created. If you are curious about the .pack directive, the idea is simple: it specifies the packing (alignment) of the type's fields as a power of two (values from 1 to 128 are supported; a value of 1, as here, means byte alignment, that is, no padding - translator's note). It is needed for COM compatibility. Let's return to the field:
.field static assembly valuetype '<PrivateImplementationDetails>{8C802ECE-B24C-4A20-AE34-9303FE2DD066}'/'__StaticArrayInitTypeSize=12' '$$method0x6000001-1' at I_00002050
The type is rather simple, even though its name is long because of the type nesting. In our case, $$method0x6000001-1 is the name of the field. But the fun begins after the "at" keyword. What follows it is a so-called data label, which in turn denotes a piece of data somewhere in the PE file at a given offset. In ILDASM you will see something like this:
.data cil I_00002050 = bytearray ( 01 00 00 00 02 00 00 00 03 00 00 00)
This is the data label declaration, which, as you can see, holds the byte sequence of the final array in little-endian order. Now we need to understand how InitializeArray works:
call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
It is passed the array instance (which we have already created with instructions IL_0001 and IL_0002) and a handle to the field whose data sits at the label given after the "at" keyword. From these the runtime can work out how many bytes to read at that address and copy them into the array. The value I_00002050, in turn, is no mystery at all: it is a perfectly ordinary RVA. You can verify this using dumpbin:

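Returning to InitializeArray for a moment: its effect can be approximated in C# with Buffer.BlockCopy. This is only an illustration, not the runtime's actual implementation; BlobCopyDemo and FillFromBlob are hypothetical names, and the byte values are the ones from the data label declaration above. The result depends on the machine being little-endian.

```csharp
using System;

static class BlobCopyDemo
{
    // Creates an int[] of the given length and copies the raw bytes into it,
    // which is roughly what InitializeArray does with the field data.
    public static int[] FillFromBlob(byte[] blob, int length)
    {
        int[] result = new int[length];
        Buffer.BlockCopy(blob, 0, result, 0, blob.Length);
        return result;
    }

    static void Main()
    {
        // Same bytes as in the '.data cil I_00002050' declaration.
        byte[] blob = { 0x01, 0, 0, 0, 0x02, 0, 0, 0, 0x03, 0, 0, 0 };
        int[] ints = FillFromBlob(blob, 3);
        Console.WriteLine(string.Join(", ", ints)); // prints "1, 2, 3" on a little-endian machine
    }
}
```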
But there is another, equally interesting detail: the compiler reuses the __StaticArrayInitTypeSize type when arrays take up the same amount of memory. So this listing:
int[] ints = { 1, 2, 3, 4, 5, 6, 7, 8 }; long[] longs = { 1, 2, 3, 4 }; byte[] bytes = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 };
makes the compiler use the same type for all three fields, because the data of each array occupies 32 bytes:
.field static assembly valuetype '<PrivateImplementationDetails>{AA6C9D77-5FAD-47E0-8B55-1D8739074F1F}'/'__StaticArrayInitTypeSize=32' '$$method0x6000001-1' at I_00002050
.field static assembly valuetype '<PrivateImplementationDetails>{AA6C9D77-5FAD-47E0-8B55-1D8739074F1F}'/'__StaticArrayInitTypeSize=32' '$$method0x6000001-2' at I_00002070
.field static assembly valuetype '<PrivateImplementationDetails>{AA6C9D77-5FAD-47E0-8B55-1D8739074F1F}'/'__StaticArrayInitTypeSize=32' '$$method0x6000001-3' at I_00002090
.data cil I_00002050 = bytearray ( 01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00 06 00 00 00 07 00 00 00 08 00 00 00)
.data cil I_00002070 = bytearray ( 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00)
.data cil I_00002090 = bytearray ( 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20)
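The 32-byte figure is easy to check from C#. The sketch below (BlobSizeCheck and DataSize are hypothetical names) uses Buffer.ByteLength, which returns the number of bytes occupied by the elements of a primitive array, on the three arrays from the listing.

```csharp
using System;

static class BlobSizeCheck
{
    // Size in bytes of the element data of a primitive array.
    public static int DataSize(Array array)
    {
        return Buffer.ByteLength(array);
    }

    static void Main()
    {
        int[] ints = { 1, 2, 3, 4, 5, 6, 7, 8 };
        long[] longs = { 1, 2, 3, 4 };
        byte[] bytes = new byte[32];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = (byte)(i + 1); // 1..32, as in the listing
        }

        Console.WriteLine(DataSize(ints));  // prints "32" (8 elements * 4 bytes)
        Console.WriteLine(DataSize(longs)); // prints "32" (4 elements * 8 bytes)
        Console.WriteLine(DataSize(bytes)); // prints "32" (32 elements * 1 byte)
    }
}
```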
Also, for arrays of 1 or 2 elements, the following IL is generated instead:
.method private hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size 19 (0x13)
  .maxstack 3
  .locals init (int32[] V_0, int32[] V_1)
  IL_0000: nop
  IL_0001: ldc.i4.2
  IL_0002: newarr [mscorlib]System.Int32
  IL_0007: stloc.1
  // V_1[0] = 1
  IL_0008: ldloc.1
  IL_0009: ldc.i4.0
  IL_000a: ldc.i4.1
  IL_000b: stelem.i4
  // V_1[1] = 2
  IL_000c: ldloc.1
  IL_000d: ldc.i4.1
  IL_000e: ldc.i4.2
  IL_000f: stelem.i4
  // V_0 = V_1
  IL_0010: ldloc.1
  IL_0011: stloc.0
  IL_0012: ret
} // end of method Q::Main
And here we actually see that very trick with two local variables: one of them is a temporary that receives the values as the array is filled, after which the array reference is copied to the main variable. The reasons for the other approach (with a separate method that fills the array) are obvious: in a naive implementation we would have 4 instructions per element, so the array-building code would grow linearly with the size of the array, whereas with InitializeArray the code size is constant.
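The arithmetic behind that argument can be written down as a small sketch. The counts are read off the two listings in this article (4 instructions per element for the store-by-store path; dup, ldtoken and call for the InitializeArray path) and are assumptions for illustration, as are the names CodeSizeSketch, NaiveInstructions and BlobInstructions. The sketch counts instructions only: the data blob itself still grows with the array, but as raw bytes rather than as code.

```csharp
using System;

static class CodeSizeSketch
{
    // Store-by-store path: ldloc, ldc.i4 (index), ldc.i4 (value), stelem.i4
    // for every element.
    public static int NaiveInstructions(int elementCount)
    {
        return 4 * elementCount;
    }

    // Blob path: dup, ldtoken, call InitializeArray, whatever the array length.
    public static int BlobInstructions(int elementCount)
    {
        return 3;
    }

    static void Main()
    {
        foreach (int n in new int[] { 2, 3, 8, 32 })
        {
            Console.WriteLine("{0} elements: naive = {1}, blob = {2}",
                n, NaiveInstructions(n), BlobInstructions(n));
        }
    }
}
```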
P.S. This article describes the behavior of Microsoft's C# 2.0 and 3.0 compilers. Code generated by other compiler versions or by third-party compilers (for example, Mono) may differ from what is shown here.