Features of the CLR in the .NET framework

When I started learning C# and the .NET Framework, I couldn't understand how the CLR works. I found either huge articles that could not be digested in one evening, or descriptions that were too short and rather confusing (as in H. Schildt's book).
Some time ago I decided it would be useful to collect in one place the knowledge I had gathered from books, along with various "features" and frequently used techniques. New information settles in the head quickly but is forgotten just as quickly, and a few weeks later you have to dig through hundreds or thousands of lines of text again to answer a question. While reading each new programming book, I took brief notes on what seemed most important to me. Sometimes I described a process in terms I could understand, with an invented example. I do not claim that the material is absolutely correct. It is only my understanding of the process, with my own examples and the information I considered key. Having worked through some of this material, I decided to keep it for anyone who might find it useful. Those who already know the subject can simply refresh their memory.

Note that a "type" here is roughly what C# calls a class. Because .NET supports other languages besides C#, the term "type" is used rather than the familiar "class". This article also assumes that the reader already knows the basics of .NET, and it focuses on the details of specific mechanisms and processes.

As an example, here is the source code of a program that displays the age of an object:

using System;

namespace ConsoleApplication_Test_Csharp
{
    public class SomeClass
    {
        int age;

        public int GetAge()
        {
            age = 22;
            return age;
        }
    }

    public sealed class Program
    {
        public static void Main()
        {
            System.Console.Write("My age is ");
            SomeClass me = new SomeClass();
            int myAge;
            myAge = me.GetAge();
            System.Console.WriteLine(myAge);
            Console.ReadLine();
        }
    }
}

So, let's get started.
What is CLR?

CLR (Common Language Runtime) is a runtime shared by all .NET languages. It provides language integration: thanks to a standard set of types and metadata, objects created in one language are "equal citizens" in code written in another.

In other words, the CLR is the mechanism that makes the program execute in the order we need, calling functions and managing data, and it does this for different languages (C#, Visual Basic, Fortran). The CLR really does control the execution of instructions (machine code, if you like) and decides, while the program is running, which piece of code (function) to take from where and where to substitute it. The compilation process is shown in the figure:

IL (Intermediate Language) - code in a special language that resembles assembly but is written for .NET. Code in high-level languages (C#, Visual Basic) is translated into it. This is the point at which the dependence on the chosen language disappears, because everything is converted to IL (with some caveats covered by the Common Language Specification, CLS, which is outside the scope of this article).
Here is what it looks like for the SomeClass::GetAge() method:
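As a rough complement to a disassembler view, the following minimal sketch reads the IL of GetAge() at run time through reflection. It assumes the SomeClass type from the example above (declared here without a namespace for brevity); the IlDump class and its GetAgeIlHex helper are names invented for this illustration. The exact bytes depend on the compiler settings, so nothing is claimed beyond the fact that the method body ends with the ret opcode (0x2A).

```csharp
using System;
using System.Reflection;

public class SomeClass
{
    int age;
    public int GetAge() { age = 22; return age; }
}

// Hypothetical helper class, not part of the original article.
public static class IlDump
{
    // Returns the raw IL bytes of SomeClass.GetAge() as a hex string.
    public static string GetAgeIlHex()
    {
        MethodInfo method = typeof(SomeClass).GetMethod("GetAge");
        MethodBody body = method.GetMethodBody();
        byte[] il = body.GetILAsByteArray();
        return BitConverter.ToString(il);
    }

    public static void Main()
    {
        // Prints bytes such as "02-1F-16-..."; the last byte is 2A (ret).
        Console.WriteLine(GetAgeIlHex());
    }
}
```

The bytes are the same instructions a disassembler such as ILdasm.exe shows as text, only in encoded form.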

In addition to the IL code, the compiler emits complete metadata.

Metadata is a set of data tables that describe what is defined in the module. There are also tables that describe what the managed module refers to (for example, imported types and their members). Metadata extends the capabilities of older technologies such as type libraries and Interface Definition Language (IDL) files. Metadata is always associated with the file that contains the IL code; in fact, it is embedded in the same *.exe or *.dll.
In other words, metadata is a set of tables whose fields record that a particular method is located in a particular file and belongs to a particular type (class).
Here is what the metadata for my example looks like (the metadata tables are simply converted into readable form by the ILdasm.exe disassembler; in fact, they are part of the program's *.exe file):

TypeDef - there is one such entry for each type defined in the module.
For example, TypeDef #1 describes the class SomeClass and lists Field #1 with Field Name: age, the method MethodName: GetAge, and the constructor MethodName: .ctor. The TypeDef #2 entry describes the Program class.
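The same information can be read back from a running program, since reflection is built on top of these metadata tables. The sketch below assumes the SomeClass type from the example; MetadataDump and DescribeSomeClass are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class SomeClass
{
    int age;
    public int GetAge() { age = 22; return age; }
}

// Hypothetical helper class, not part of the original article.
public static class MetadataDump
{
    // Lists only the members that SomeClass itself declares,
    // which is roughly what its TypeDef entry points to.
    public static List<string> DescribeSomeClass()
    {
        const BindingFlags declared =
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.DeclaredOnly;

        Type type = typeof(SomeClass);
        List<string> lines = new List<string>();
        foreach (FieldInfo f in type.GetFields(declared))
            lines.Add("Field: " + f.Name);
        foreach (MethodInfo m in type.GetMethods(declared))
            lines.Add("Method: " + m.Name);
        foreach (ConstructorInfo c in type.GetConstructors(declared))
            lines.Add("Method: " + c.Name);   // the constructor is named ".ctor"
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeSomeClass())
            Console.WriteLine(line);
    }
}
```

The compiler-generated default constructor shows up as .ctor even though the source code never declares it.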

Now that the basic concepts are clear, let's see what a managed module consists of (in our case, simply the ConsoleApplication_Test_Csharp.exe file that prints the object's age):

PE32 or PE32+ header - shows which type of processor the program will run on: PE32 (runs on 32-bit and 64-bit OS) or PE32+ (64-bit OS only).
CLR header - contains the information that makes this module a managed one (flags, the CLR version, the Main() entry point).
Metadata - two kinds of metadata tables:
1) types and members defined in the source code;
2) types and members referenced by the source code.
IL code - the code the compiler generates from the C# source. At run time the CLR (more precisely, the JIT compiler) converts IL into processor instructions (0001 0011 1101 ...).
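Part of this header information is visible from code. The sketch below uses Module.GetPEKind and Assembly.ImageRuntimeVersion to print the PE kind, the target machine, and the CLR version recorded in the module; PeInfo and Describe are hypothetical names, and the values printed depend on how the program was built.

```csharp
using System;
using System.Reflection;

// Hypothetical helper class, not part of the original article.
public static class PeInfo
{
    // Describes the module that contains this class.
    public static string Describe()
    {
        Assembly assembly = typeof(PeInfo).Assembly;
        PortableExecutableKinds peKind;
        ImageFileMachine machine;
        assembly.ManifestModule.GetPEKind(out peKind, out machine);

        return "PE kind: " + peKind + Environment.NewLine +
               "Machine: " + machine + Environment.NewLine +
               "CLR version in metadata: " + assembly.ImageRuntimeVersion;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

A PE kind that includes PE32Plus corresponds to the 64-bit-only case described above.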

How JIT works

So, what happens when the program starts for the first time?
First, the header is analyzed to find out which kind of process to start (32-bit or 64-bit). Then the matching version of MSCorEE.dll is loaded (C:\Windows\System32\MSCorEE.dll on 32-bit systems).
After that, a method defined in MSCorEE.dll is called. It initializes the CLR, loads the assembly, and calls the Main() entry point of our program.

static void Main()
{
    System.Console.WriteLine("Hello ");
    System.Console.WriteLine("Goodbye");
}

To execute any method, for example System.Console.WriteLine("Hello "), its IL must be converted into machine instructions (those very zeros and ones). This is done by the JIT (just-in-time) compiler.

First, before Main() runs, the CLR finds all the types referenced in its code (here, the Console type).
For each such type it creates an internal data "structure" with one record for every method the type defines.
Each record holds the address at which the method's implementation can be found. Initially every record points to the JIT compiler itself.

On the first call to WriteLine, the JIT compiler is invoked.
The JIT compiler knows which method is being called and which type defines it.
It looks up the method's IL (the implementation of WriteLine(string str)) in the metadata of the corresponding assembly.
It then verifies the IL and compiles it into machine code (native instructions), storing the result in dynamically allocated memory.
Next, the JIT compiler goes back to the internal "structure" of the type (Console) and replaces the address in the called method's record with the address of the memory block that contains the native instructions.
Finally, the compiled code runs and control returns to Main(). When Main() calls WriteLine(string str) a second time, the code is already compiled, so the call bypasses the JIT compiler. After WriteLine(string str) executes, control returns to Main() again.

It follows that a method runs "slowly" only on its first call, when the JIT compiler translates its IL into processor instructions. On every later call the native code is already in memory, compiled and optimized for the current processor. However, the compiled code lives only in the memory of the process, so if the program is started again or another process uses the same method, the JIT compiler is invoked for it again. For applications running in an x86 environment the JIT compiler generates 32-bit instructions, and in x64 or IA64 environments it generates 64-bit instructions.
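The first-call cost can be observed with a simple, non-rigorous measurement. In the sketch below, JitTiming, SumTo and TimeCallTicks are hypothetical names; the NoInlining attribute is an assumption added so that SumTo is compiled as a separate method on its first call. The tick counts vary between machines and runtimes, so no particular numbers are claimed.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical helper class, not part of the original article.
public static class JitTiming
{
    // Kept out of line so that its first call triggers JIT compilation.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // Times a single call to SumTo and returns the elapsed stopwatch ticks.
    public static long TimeCallTicks(int n, out int result)
    {
        Stopwatch watch = Stopwatch.StartNew();
        result = SumTo(n);
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        int result;
        long first = TimeCallTicks(10, out result);   // includes JIT compilation of SumTo
        long second = TimeCallTicks(10, out result);  // reuses the native code
        Console.WriteLine("First call:  " + first + " ticks");
        Console.WriteLine("Second call: " + second + " ticks");
    }
}
```

On a typical run the first call takes noticeably more ticks than the second, although a single measurement like this is only indicative.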

Code optimization. Managed and unmanaged code

IL can be optimized: for example, NOP (no-operation) instructions are removed from it. To enable this, pass the corresponding switches to the compiler:

The debug version is built with the switches: /optimize-, /debug:full
The release version is built with the switches: /optimize+, /debug:pdbonly
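One way to check how an assembly was built is to look at its DebuggableAttribute, which the C# compiler typically emits with optimizations marked as disabled in debug builds. The sketch below is an assumption-laden illustration: BuildInfo and IsJitOptimizerDisabled are hypothetical names, and an assembly without the attribute is treated as optimized.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper class, not part of the original article.
public static class BuildInfo
{
    // True when the assembly asks the JIT compiler not to optimize its code.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        DebuggableAttribute attribute = (DebuggableAttribute)
            Attribute.GetCustomAttribute(assembly, typeof(DebuggableAttribute));
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly self = typeof(BuildInfo).Assembly;
        Console.WriteLine("JIT optimizer disabled for this program: " +
                          IsJitOptimizerDisabled(self));
    }
}
```

Compiling the same file once with /optimize- and once with /optimize+ should flip the printed value.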

What is the difference between managed code and unmanaged?

Unmanaged code is compiled for a specific processor and simply executed when called.

In a managed environment, the compilation is done in 2 steps:

1) the compiler translates C # code to IL
2) at run time the IL code has to be translated into the processor's machine code, which costs additional dynamic memory and time (this is exactly the JIT compiler's job).

Interaction with unmanaged code:

- managed code can call an unmanaged function from a DLL via P/Invoke (for example, CreateSemaphore from Kernel32.dll).
- managed code can use an existing COM component (server).
- unmanaged code can use a managed type (server). COM components can be implemented in a managed environment, in which case there is no need to do interface reference counting by hand.
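The first case can be sketched with the CreateSemaphore function mentioned above. This is a Windows-only illustration: the NativeSemaphore class and TryCreateAndClose method are hypothetical names, and the parameter types are one common way of declaring these Win32 functions in C#, not the only one.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class, not part of the original article.
public static class NativeSemaphore
{
    // Unmanaged function exported by Kernel32.dll (resolved to CreateSemaphoreW).
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr CreateSemaphore(
        IntPtr lpSemaphoreAttributes, int lInitialCount, int lMaximumCount, string lpName);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool CloseHandle(IntPtr hObject);

    // Creates an unnamed semaphore and closes it again.
    // Returns false on non-Windows systems, where Kernel32.dll is not available.
    public static bool TryCreateAndClose()
    {
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
            return false;

        IntPtr handle = CreateSemaphore(IntPtr.Zero, 0, 1, null);
        if (handle == IntPtr.Zero)
            return false;

        return CloseHandle(handle);
    }

    public static void Main()
    {
        Console.WriteLine("Semaphore created and closed: " + TryCreateAndClose());
    }
}
```

The DllImport declarations are bound lazily, so the program still starts on other platforms as long as the native functions are never called.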

The /clr switch lets you compile Visual C++ code into managed IL methods, except for methods that contain inline assembly (__asm), take a variable number of arguments, or use intrinsic routines (__enable, _ReturnAddress). Methods that cannot be compiled to IL are compiled into standard x86 instructions. Even when the code is compiled to IL, the data of ordinary C++ types is not managed: no metadata is created for it and it is not tracked by the garbage collector.

Type system

In addition, I want to talk about the CTS, the type system adopted by Microsoft.

CTS (Common Type System) is the common type system of the CLR (a type is roughly the analogue of a C# class). It is a standard recognized by ECMA that describes how types are defined and how they behave. It also specifies the rules for inheritance, virtual methods, and object lifetime. The standard registered with ECMA is called the CLI (Common Language Infrastructure).

- CTS supports only single inheritance (as opposed to C ++)
- All types are inherited from System.Object (Object - the name of the type, the root of all other types, System - the namespace)
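Both rules can be seen by walking the BaseType chain of any type. In this sketch Animal, Dog, TypeChain and BaseChain are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Animal { }
public class Dog : Animal { }   // a class has exactly one base class

// Hypothetical helper class, not part of the original article.
public static class TypeChain
{
    // Follows BaseType links from the given type up to the root.
    public static List<string> BaseChain(Type type)
    {
        List<string> chain = new List<string>();
        for (Type t = type; t != null; t = t.BaseType)
            chain.Add(t.FullName);
        return chain;
    }

    public static void Main()
    {
        // Prints: Dog -> Animal -> System.Object
        Console.WriteLine(string.Join(" -> ", BaseChain(typeof(Dog)).ToArray()));
    }
}
```

Because each type has a single BaseType, the chain is always a straight line that ends at System.Object.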

According to the CTS specification, any type contains 0 or more members.

Core members:

Field - a variable that is part of the object's state. It is identified by its name and type.
Method - a function that performs an action on an object. It has a name, a signature (the number of parameters, their order and types, and the return type), and modifiers.
Property - looks like a method (get/set) to the implementer and like a field (=) to the caller. Properties let the type that implements them validate the input values and the state of the object.
Event - provides a notification mechanism between objects.
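A small sketch with all four kinds of members in one type may help. The Person class and its members are hypothetical names chosen for illustration; the property setter shows the validation role mentioned above.

```csharp
using System;

public class Person
{
    private int age;                        // field: part of the object's state

    public event EventHandler AgeChanged;   // event: notifies subscribers

    public int Age                          // property: get/set methods, used like a field
    {
        get { return age; }
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException("value");
            age = value;
            if (AgeChanged != null)
                AgeChanged(this, EventArgs.Empty);
        }
    }

    public bool IsAdult()                   // method: acts on the object's state
    {
        return age >= 18;
    }
}

public static class MembersDemo
{
    public static void Main()
    {
        Person me = new Person();
        me.AgeChanged += delegate { Console.WriteLine("Age changed"); };
        me.Age = 22;                        // looks like a field assignment to the caller
        Console.WriteLine("Adult: " + me.IsAdult());
    }
}
```

Assigning a negative value to Age throws ArgumentOutOfRangeException, which a plain public field could not do.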

Access modifiers:

Public - the method is available to any code from any assembly.
Private - the method can be called only from inside the type.
Family (protected) - the method can be called by derived types, regardless of the assembly.
Assembly (internal) - the method can be called by any code from the same assembly.
Family or Assembly (protected internal) - the method can be called by derived types from any assembly and by any types from the same assembly.
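The mapping between the C# keywords and the CLR terms can be checked with reflection, since MethodInfo exposes the CLR-level accessibility directly. In this sketch Sample, AccessDemo and ClrAccess are hypothetical names.

```csharp
using System;
using System.Reflection;

public class Sample
{
    public void A() { }
    private void B() { }
    protected void C() { }
    internal void D() { }
    protected internal void E() { }
}

// Hypothetical helper class, not part of the original article.
public static class AccessDemo
{
    // Returns the CLR accessibility name of a method declared in Sample.
    public static string ClrAccess(string methodName)
    {
        MethodInfo m = typeof(Sample).GetMethod(methodName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        if (m.IsPublic) return "Public";
        if (m.IsPrivate) return "Private";
        if (m.IsFamily) return "Family";
        if (m.IsAssembly) return "Assembly";
        if (m.IsFamilyOrAssembly) return "Family or Assembly";
        return "Other";
    }

    public static void Main()
    {
        foreach (string name in new string[] { "A", "B", "C", "D", "E" })
            Console.WriteLine(name + ": " + ClrAccess(name));
    }
}
```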

CLS (Common Language Specification) - a specification released by Microsoft. It describes the minimum set of features that compiler vendors must implement for their products to work with the CLR. The CLR/CTS supports more features than the CLS defines. IL assembler supports the full range of CLR/CTS features. Languages such as C# and Visual Basic support a subset of the CLR/CTS features (including at least the CLS).
An example is shown in the picture.

CLS Compliance Example

The [assembly: CLSCompliant(true)] attribute makes the compiler report any externally accessible types that contain constructs not allowed in other languages.

using System;

[assembly: CLSCompliant(true)]

namespace SomeLibrary
{
    // warnings are reported because the type is public
    public sealed class SomeLibraryType
    {
        // the return type is not CLS-compliant
        public UInt32 Abc() { return 0; }

        // the identifier abc() differs from the previous one only by case,
        // which is not CLS-compliant
        public void abc() { }

        // no warning: the method is private
        private UInt32 ABC() { return 0; }
    }
}

First warning: UInt32 Abc() returns an unsigned integer. Visual Basic, for example, does not work with such values.
Second warning: the two public methods Abc() and abc() differ only in letter case and return type. Visual Basic cannot call both methods.

If you remove public and leave only sealed class SomeLibraryType, both warnings disappear, because SomeLibraryType then becomes internal by default and is no longer visible from outside the assembly.

P.S. The article is based on material from J. Richter's book "CLR via C#. Programming on the Microsoft .NET Framework 2.0 in C#".

Source: https://habr.com/ru/post/90426/
