How I added support for Code Contracts for VS2015

In the past few weeks, I have been actively working on refining Code Contracts , fixing some nasty bugs and adding support for VS2015. And since VS2015 just saw the light, such support would be very helpful. Now about everything in order, and with a number of technical details.

So the first thing to know about Code Contracts is that this thing is alive. The code is publicly available on the githaba (https://github.com/Microsoft/CodeContracts) and there are a number of people who are actively involved in bringing order there. I am the owner of the repository, but I do it in my free time. In addition to me, there are a few other people who are cleaning up the Code Contracts Editor Extensions ( @sharwell ) and in some other areas.

Code Contracts can be divided into several components:

ccrewrite is a tool that deals with “rewriting” IL, removing statements (Contract.Requires / Ensures / Assert / Assume / if-throw) and replacing them with necessary calls to contract methods, depending on the configuration.
cccheck is a tool that deals with static analysis and formal proof at compile time that the program is correct.
Code Contracts Editor Extensions is an extension to VS that allows you to “see” contracts directly in the IDE.

There are a number of other things, for example, for generating documentation, as well as a plugin for ReSharper , which simplifies the addition of preconditions / postconditions and shows ccrewrite errors directly in the IDE.
')
I deal with two components - ccrewrite and a plugin, but now I want to focus on ccrewrite and on the difficulties that I face when adding support for VS2015.

Breaking changes in VS2015

The C # / VB compiler team did an amazing job developing new compilers from scratch. They added a bunch of expansion points and now they don’t need a PhD to write a fairly functional analyzer for the studio. But not without breaking changes.

For normal operation, ccrewrite must clearly know how the C # / VB language compiler works, and what a particular code translates into. Especially delivering blocks of iterators, asynchronous methods and closures, for which the C # / VB compilers do all sorts of different tricks. It becomes especially sad when the behavior of the compilers begins to change and the generated code becomes somewhat different.

The developers of the C # 6.0 compiler (aka Roslyn) introduced a number of optimizations to the generated IL code, which led to breakdowns of decompilers and ccrewrite.

Caching lambda expressions

You may have noticed strange static fields in the decompiled code that start with CS $ <> 9__ . These ~~souls are~~ caches of lambda expressions that do not capture the external context (lambda expressions that capture the external context lead to the generation of closures , and classes of the form <> c__DisplayClass1 are generated for them).

static void Foo() { Action action = () => Console.WriteLine("Hello, lambda!"); action(); }

In this case, the “old” compiler will generate the CS $ <> 9__CachedAnonymousMethodDelegatef field and initialize it in a lazy way:

 private static void <Foo>b__e() { Console.WriteLine("Hello, lambda!"); } private static Action CS$<>9__CachedAnonymousMethodDelegatef; static void Foo() { if (CS$<>9__CachedAnonymousMethodDelegatef == null) { CS$<>9__CachedAnonymousMethodDelegatef = new Action(<Foo>b__e); } Action CS$<>9__CachedAnonymousMethodDelegatef = CS$<>9__CachedAnonymousMethodDelegatef; CS$<>9__CachedAnonymousMethodDelegatef(); }

The C # 6.0 compiler takes a different approach. Experimental OS developers - Midori found out that invoking an instance method through a delegate is more efficient than invoking a static method. Therefore, the Roslyn compiler for the same lambda expression generates other code:

 private sealed class StaticClosure { public static readonly StaticClosure Instance = new StaticClosure(); public static Action CachedDelegate; //      internal void FooAnonymousMethodBody() { Console.WriteLine("Hello, lambda!"); } } static void Foo() { Action actionTmp; if ((actionTmp = StaticClosure.CachedDelegate) == null) { StaticClosure.CachedDelegate = new Action( StaticClosure.Instance.FooAnonymousMethodBody) actionTmp = StaticClosure.CachedDelegate; } Action action = actionTmp; action(); }

Now a “closure” is created - the StaticClosure class (real name <> c ) with a static field for storing the delegate - CachedDelegate ( <> 9__8_0 ) and a “singleton”. But now, the body of the anonymous method is in the instance method FooAnonymousMethodBody ( <Foo> b__8_0 ).

A simple test showed that calling the delegate through an instance method is indeed 10 percent faster, although in absolute units the difference is very, very small.

Now let's see when this change leads to problems in ccrewrite.
Assertions in Code Contracts are defined as calls to the Contract class methods, which somewhat complicates the assignment of contracts for interfaces and abstract classes. To circumvent this limitation, you need to create a special contract class marked with the ContractClassFor attribute. But this causes a number of additional difficulties.

 [ContractClass(typeof (IFooContract))] interface IFoo { void Boo(int[] data); } [ExcludeFromCodeCoverage,ContractClassFor(typeof (IFoo))] abstract class IFooContract : IFoo { void IFoo.Boo(int[] data) { Contract.Requires(Contract.ForAll(data, n => n == 42)); } } class Foo : IFoo { public void Boo(int[] data) { Console.WriteLine("Foo.Boo was called!"); } }

In this case, the Foo.Boo method contains no preconditions at all, and ccrewrite must first find the contract class ( IFooContracts ), “tear out” the contract from the IFooContracts.Boo method and transfer it to the Foo.Boo method. In the case of simple preconditions, it is not difficult to do this, but with closures, everything becomes more interesting.

Now, you need to find the internal IFooContracts. <> C class, copy it to the Foo class, copy the Contract.Requires call from the IFooContracts.Foo method and update IL so that it works with the new copy, and not with the original closure. In some cases, everything is even more fun: the presence of nested closures (several scopes, each of which has an exciting anonymous method) will require updating the nested classes in the correct order - from the most nested to the topmost one (that is why this logic is located here ).

Asynchronous method with two awaits

Another change in the new compiler is related to the generated code for asynchronous methods. The old compiler generated different code for the asynchronous method with one await operator and several await operators. The new compiler has a new optimization for asynchronous methods with two awaits , which also caused a lot of trouble.

Let's consider the following simple example:

 public async Task<int> FooAsync(string str) { Contract.Ensures(str != null); await Task.Delay(42); return 42; }

The C # compiler (pre-Roslyn) converts this code as follows:

A structure is created that implements I AsyncStateMachine and the entire logic of the method moves to the MoveNext method.
In the FooAsync method, the “ front ” logic remained: creating an instance of AsyncTaskMethodBuilder and initializing an instance of the state machine.

Here is what the generated code looks like:

 private struct FooAsync_StatemMachine : IAsyncStateMachine { //   FooAsync(string str) public string str; //    public int l__state; //      . //   TaskCompletionSource. public AsyncTaskMethodBuilder<int> t__builder; // ""    private TaskAwaiter u__taskAwaiter; public void MoveNext() { int num = this.l__state; int result; try { TaskAwaiter taskAwaiter = default(TaskAwaiter); if (num != 0) { //   //      Contract.Requires(this.str != null); taskAwaiter = Task.Delay(42).GetAwaiter(); //  :     //      "":  , //      if (!taskAwaiter.IsCompleted) { // l__state  0 ,    //     . this.l__state = 0; //  this AsyncTaskBuilder-,    //   ,      // t__bulder.AwaitUnsafeOnCompleted(..., this); return; } } //       ,  //   ,  . //  GetResult    ,  //     taskAwaiter.GetResult(); //    result = 42; } catch (Exception exception) { //     this.l__state = -2; this.t__builder.SetException(exception); return; } //    this.l__state = -2; this.t__builder.SetResult(result); } } public Task<int> FooAsync(string str) { var stateMachine = new FooAsync_StatemMachine { l__state = -1, t__builder = AsyncTaskMethodBuilder<int>.Create(), str = str, }; stateMachine.t__builder.Start(ref stateMachine); return stateMachine.t__builder.Task; }

There are quite a few letters, but the basic idea is this:

The precondition of the asynchronous method is inside the state machine. That is why ccrewrite should pull it out and transfer it to the FooAsync method. Otherwise, violation of the precondition will lead to the faulted task, and not to a “synchronous exception”.
There is a certain pattern as ccrewrite determines where the precondition is located. In the case of an asynchronous method with a single await operator, the original beginning of the method, and therefore the preconditions are immediately inside the if (num! = 0) condition. It is important!
The generated code depends on the number of await statements within the asynchronous method. With two or more await statements, the old compiler generates a finite state machine based on a switch, and ccrewrite handled this pattern correctly.

The C # 6.0 compiler generates the same code for the asynchronous method with one await, but a completely different code if there are two awaits.

NOTE
Another change to the C # 6.0 compiler: in Debug mode, the state machine generates a class, not a structure. This is done to support Edit and Continue.

If the fooAsync method change as follows:

 public async Task<int> FooAsyncOrig(string str) { Contract.Ensures(str != null); await Task.Delay(42); await Task.Delay(43); return 42; }

That C # 6.0 compiler, instead of generating a switch that is understandable to any decompiler and ccrewrite, will generate code that is very similar to code with one await operator, but with minor modifications:

 //   MoveNext if (num != 0) { // ccrewrite ,    ! if (num == 1) { taskAwaiter = this.u__taskAwaiter; this.u__taskAwaiter = default(TaskAwaiter); this.l__state = -1; goto OperationCompleted; } //    ! Contract.Requires(this.str != null); taskAwaiter = Task.Delay(42).GetAwaiter();

Since this is a new pattern, ccrewrite naively searched for contracts right inside the if condition (num! = 0) and considered the nested if statement as preconditions / postconditions. I had to teach him new tricks to handle this option correctly.

As a conclusion

Work at the IL-level is walking on thin ice. The search for patterns is rather complicated, the modification of the IL code is not intuitive, and even a simple task, like checking the postconditions in asynchronous methods, may require a lot of effort. In addition, many things are implementation details of the compiler and can vary from version to version. Here we have reviewed only a few examples, but this is not all the changes from the C # 6.0 compiler. At least , the IL code generated by using expression trees has changed a little, which also broke several test cases.

There are still a couple of nasty bugs that are being worked on. There is a problem with the Error List in VS2015 , and the postconditions in asynchronous methods apparently never worked normally. But, most importantly, the project is alive and, most likely, will be developed. So if you have any wishes, especially in the field of ccrewrite, write about it or get bugs on github !

Links

Source: https://habr.com/ru/post/263235/

All Articles