The problem of cyclic dependencies during type initialization

Readers who have run into the problem named in the title have probably stayed late at work and spent many hours in the debugger. For everyone else, the title may be little more than jargon. So let's set the jargon aside and define the terms:

Type initialization: the code that runs to initialize all the static variables of a class and to execute its static constructor;
Cyclic dependency: two pieces of code that depend on each other. In our case, these are two classes, each of whose type initialization requires the other class to be initialized already.

Well, a small example to show what is at stake:

    using System;

    class Test
    {
        static void Main()
        {
            Console.WriteLine(First.Beta);
        }
    }

    class First
    {
        public static readonly int Alpha = 5;
        public static readonly int Beta = Second.Gamma;
    }

    class Second
    {
        public static readonly int Gamma = First.Alpha;
    }

On my machine, this code prints 0.

Of course, without checking the specification, any expectation about how this will behave is no more than a guess. So let's look at the specification (section 10.5.5.1 of the C# 4 version):

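The static field variable initializers of a class correspond to a sequence of assignments that are executed in the textual order in which they appear in the class declaration. If a static constructor (§10.12) exists in the class, execution of the static field initializers occurs immediately prior to executing that static constructor. Otherwise, the static field initializers are executed at an implementation-dependent time prior to the first use of a static field of that class.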

In other words:

The static fields of a class are initialized in the order in which they appear in the source text of the class. If the class has a static constructor, the static field initializers run immediately before that static constructor. Otherwise, if there is no static constructor, the static field initializers run at an implementation-dependent time, at some point before the first use of a static field of the class.

Besides the language specification, one could quote the CLI specification, which goes into more detail on type initialization, especially on cyclic dependencies and multithreading. I won't do that here; I'll just summarize a couple of points:

Type initialization is guaranteed to be thread-safe;
If the CLI notices that type A needs to be initialized while it is already being initialized on the same thread, the CLI continues as if type A had already been initialized.

Here is what you might expect to happen:

1. Initialize Test: no further action required
2. Start running Main
3. Start initializing First (since we need First.Beta)
4. Set First.Alpha to 5
5. Start initializing Second (since we need Second.Gamma)
6. Set Second.Gamma to First.Alpha (5)
7. Finish initializing Second
8. Set First.Beta to Second.Gamma (5)
9. Finish initializing First
10. Print 5

And here is what really happens on my computer, with the .NET Framework 4.5 beta installed. (I know that type initialization changed in .NET 4. I am not aware of any changes in .NET 4.5, but I won't claim that there were none.)

1. Initialize Test: no further action required
2. Start running Main
3. Start initializing First (since we need First.Beta)
4. Start initializing Second (we will need Second.Gamma)
5. Set Second.Gamma to First.Alpha (0)
6. Finish initializing Second
7. Set First.Alpha to 5
8. Set First.Beta to Second.Gamma (0)
9. Finish initializing First
10. Print 0

Step 5 is the interesting one. We know that First needs to be initialized before we can read First.Alpha. However, this thread is already initializing First, so the initialization is skipped in the hope that everything is in order. At that point, though, the field initializer for First.Alpha has not run yet, so the field still holds its default value of 0. Oops...

(There is one subtlety that would avoid the whole problem here: declaring the values with the const keyword.)
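To illustrate the const remark, here is a minimal sketch of the same example with Alpha declared as a constant. This is my own variation, not code from the original article. Because the compiler embeds the value of a const at every use site, reading First.Alpha from Second no longer involves First's type initializer, so the cycle disappears:

```csharp
using System;

class Test
{
    static void Main()
    {
        Console.WriteLine(First.Beta); // prints 5
    }
}

class First
{
    // A compile-time constant: its value is baked into every use site.
    public const int Alpha = 5;
    public static readonly int Beta = Second.Gamma;
}

class Second
{
    // Compiled as "Gamma = 5"; First does not need to be initialized here.
    public static readonly int Gamma = First.Alpha;
}
```

This only helps for values the language allows as constants, such as numbers and strings; it is not an option for arbitrary objects.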

Back to the real world

I hope this example has made it clear why cyclic dependencies in type initialization can make your life miserable. Such problems are very hard to track down and debug; in fact, this is a classic Heisenbug. In our example, it is important to understand that if the program happens to initialize Second first (for example, to access a different variable), we get a completely different result. In practice, you can end up in a situation where running all the unit tests together makes them fail, while running them individually makes them pass (possibly all but one).
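To make the order sensitivity concrete, here is a minimal sketch based on the earlier example. The empty static constructors are my addition: they pin initialization to the moment of first access, so the outcome does not depend on implementation-specific timing. In this version, touching Second first leaves First.Beta at 0:

```csharp
using System;

class Test
{
    static void Main()
    {
        // Second is initialized first; while it runs, First is initialized,
        // and First.Beta reads Second.Gamma before Gamma has been assigned.
        Console.WriteLine(Second.Gamma); // prints 5
        Console.WriteLine(First.Beta);   // prints 0
    }
}

class First
{
    public static readonly int Alpha = 5;
    public static readonly int Beta = Second.Gamma;

    // Empty static constructor (an assumption for this sketch): the type is
    // then initialized exactly when one of its members is first accessed.
    static First() { }
}

class Second
{
    public static readonly int Gamma = First.Alpha;

    static Second() { }
}
```

If the two lines in Main are swapped so that First.Beta is read first, both values come out as 5. The class declarations are identical in both cases; only the order of first access differs.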

One way to avoid such situations is to avoid type initializers altogether, and in many cases that is exactly the right thing to do. However, we routinely rely on well-known values such as Encoding.UTF8 or TimeZoneInfo.Utc. Note that both of these are static properties, although I would expect them to be backed by static fields. At first glance, a public static readonly field and a public static get-only property look equivalent; however, as we will see later, using properties has its advantages.

My library, Noda Time, has quite a few values of this kind, largely because many of its types are immutable. It makes sense to have a single shared UTC time zone or ISO calendar system. Moreover, in addition to the publicly visible values, the library uses many static variables internally (mainly for caching). All of this makes the library more complicated and harder to test, but the performance benefits in this case are very significant.

Unfortunately, a large number of these fields and properties have cyclic dependencies. As I mentioned earlier, adding a new static field can break the program in all sorts of ways. I can fix the immediate cause, but that leaves me uneasy about the integrity of the code: having eliminated one problem gives no guarantee that there are no others.

Type Initialization Testing

One of the main difficulties with type initialization is its sensitivity to initialization order, combined with the guarantee that a type is initialized only once within an AppDomain. As I showed earlier, one initialization order may produce an error while another produces none.

When developing Noda Time, I want to be absolutely sure that cyclic dependencies will not cause me any problems. So I want to verify that type initialization forms no cycles, regardless of the order in which the types are initialized. Logically, a cycle can be detected by starting at any one of the types in that cycle. But I am very wary of missing edge cases, so I want to go through every possible option and leave nothing out. That is why I used brute force: an exhaustive search.

Here is the rough plan:

Start with an empty list of dependencies;
For each type in the target assembly:
- Create a new AppDomain
- Load the assembly into it
- Initialize the type (perform an action on it that triggers initialization)
- Examine the stack trace at the start of each type's initialization and record all the dependencies
Look for cycles in the final list of dependencies

Note that we never detect a cycle within a single AppDomain. To find one, we have to go through all the types first and then look for cycles in the combined results.

Describing how the code works would take far more space than the code itself, and the code is actually quite easy to follow, so I have put it at the end of the article.

This solution is far from ideal, for several reasons:

Creating a new AppDomain and loading assemblies into it from a unit test runner is not as simple as it could be. My code does not work correctly with NCrunch, and I am sure that even if I fix that, other unit test runners will still break it;

It relies on every type initializer containing the following line of code:

    private static readonly int TypeInitializationChecking = NodaTime.Utility.TypeInitializationChecker.RecordInitializationStart();

This is bad not only because the line has to be added to every type we care about, but also because it runs every time such a type is initialized. It also takes up at least 4 bytes per type, which is wasteful when the program is not running under test. Of course, I could use preprocessor directives to remove this code from non-test builds, but that would make the code even uglier;
This method only finds cyclic dependencies for the versions of .NET on which the tests were run. Given that the .NET Framework versions differ, I would not be confident that the tests cover 100% of situations. Similarly, if the current CultureInfo or some other seemingly constant aspect of the environment changes, the tests may behave quite differently;
This implementation also ignores multithreaded code. Again, I am not sure it would work correctly in such situations.
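The preprocessor option mentioned in the list above might look roughly like the sketch below. The symbol TYPE_INIT_CHECKS and the stub class InitializationChecker are hypothetical names used only for illustration; in Noda Time the call would go to TypeInitializationChecker.RecordInitializationStart instead.

```csharp
// Define the (hypothetical) symbol only in test builds, for example in the
// project settings, or by uncommenting the next line.
// #define TYPE_INIT_CHECKS
using System;

// Stand-in stub for the real checker, so that this sketch is self-contained.
static class InitializationChecker
{
    internal static int Calls;

    internal static int RecordInitializationStart()
    {
        Calls++;
        return 0;
    }
}

class Sample
{
#if TYPE_INIT_CHECKS
    // Compiled only when TYPE_INIT_CHECKS is defined, so ordinary builds
    // pay neither for the call nor for the extra field.
    private static readonly int TypeInitializationChecking =
        InitializationChecker.RecordInitializationStart();
#endif

    public static readonly int Value = 42;
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Sample.Value);
        // 0 when the symbol is not defined, 1 when it is.
        Console.WriteLine(InitializationChecker.Calls);
    }
}
```

The cost is visible here: every instrumented type gains a three-line #if block, which is the clutter the list item complains about.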

Given all these caveats, is it worth using? Definitely, yes. This technique has helped me find many bugs, which have since been fixed.

Fixing cyclic dependencies

In the past, I "fixed" type initialization order simply by moving field declarations around. The cycles still existed, but I had worked out how to make them harmless. I can say that this approach does not scale and takes far more effort than it seems. The code becomes hard to work with, and once a cycle involves more than two types, working out how to make it safe becomes a real brain-teaser. These days I use a very simple technique: lazy initialization of static variables.

So instead of a public static readonly field that creates a cyclic dependency, you expose a public static get-only property that returns an internal static readonly field declared in a nested, private static class. We still get thread-safe initialization with the once-only guarantee, but the nested type is not initialized until it is actually needed.
So instead of:

    // Requires Bar to be initialized - if Bar also requires Foo to be
    // initialized, we have a problem...
    public static readonly Foo SimpleFoo = new Foo(Bar.Zero);

We will write:

    // Note: a property cannot be declared readonly; a getter with no setter is enough.
    public static Foo SimpleFoo { get { return Constants.SimpleFoo; } }

    private static class Constants
    {
        private static readonly int TypeInitializationChecking = NodaTime.Utility.TypeInitializationChecker.RecordInitializationStart();

        // This requires both Foo and Bar to be initialized, but that's okay
        // so long as neither of them require Foo.Constants to be initialized.
        // (The unit test would spot that.)
        internal static readonly Foo SimpleFoo = new Foo(Bar.Zero);
    }

At the moment I am undecided whether to add static constructors to these nested classes in order to guarantee lazy initialization. If the type initializer for Foo triggered the type initializer for Foo.Constants, we would be back where we started. But adding a static constructor to every nested class sounds awful.
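For illustration, here is a minimal, self-contained sketch of the nested-class pattern with an explicit static constructor. The bodies of Foo and Bar are hypothetical stand-ins (the real types are not shown in this article), and the Console.WriteLine in the static constructor is there only to make the moment of initialization visible:

```csharp
using System;

class Bar
{
    // Hypothetical stand-in: the real type of Bar.Zero is not shown in the article.
    public static readonly int Zero = 0;
}

class Foo
{
    public int Value { get; private set; }

    public Foo(int value)
    {
        Value = value;
    }

    public static Foo SimpleFoo { get { return Constants.SimpleFoo; } }

    private static class Constants
    {
        internal static readonly Foo SimpleFoo = new Foo(Bar.Zero);

        // The explicit static constructor means that Constants is initialized
        // exactly at the first access to one of its members, not earlier.
        static Constants()
        {
            Console.WriteLine("Foo.Constants initialized");
        }
    }
}

class Program
{
    static void Main()
    {
        Foo plain = new Foo(1);
        Console.WriteLine("Foo used: " + plain.Value);          // Constants not touched yet
        Console.WriteLine("SimpleFoo: " + Foo.SimpleFoo.Value); // triggers Constants
    }
}
```

Running this prints "Foo used: 1", then "Foo.Constants initialized", then "SimpleFoo: 0", which shows that using Foo itself does not force the nested class to initialize.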

Conclusion

I have to say that part of me really dislikes writing code just for testing, or resorting to workarounds and hacks. It is definitely worth considering whether type initialization (or at least part of it) can be eliminated altogether by not keeping these values in static fields. It would be very nice to be able to find all these dependencies without running the program or the unit tests, that is, with a static analyzer. When I get the chance, I will try to find out whether NDepend can help me with this.

Still, although this approach feels like a hack, it is better than the alternative: code full of bugs. And... I am ashamed to say it, but I do not think I have found all the cyclic dependencies in Noda Time. It is worth trying this on your own code to see where problems may be hiding.

Appendix: Testing Code

TypeInitializationChecker

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Reflection;
    using System.Runtime.CompilerServices;

    internal sealed class TypeInitializationChecker : MarshalByRefObject
    {
        // Stays null outside the test, in which case RecordInitializationStart does nothing.
        private static List<Dependency> dependencies = null;

        private static readonly MethodInfo EntryMethod = typeof(TypeInitializationChecker).GetMethod("FindDependencies");

        internal static int RecordInitializationStart()
        {
            if (dependencies == null)
            {
                return 0;
            }
            Type previousType = null;
            // Walk from the innermost frame outwards: each type initializer found
            // depends on the previous (more deeply nested) one.
            foreach (var frame in new StackTrace().GetFrames())
            {
                var method = frame.GetMethod();
                if (method == EntryMethod)
                {
                    break;
                }
                var declaringType = method.DeclaringType;
                if (method == declaringType.TypeInitializer)
                {
                    if (previousType != null)
                    {
                        dependencies.Add(new Dependency(declaringType, previousType));
                    }
                    previousType = declaringType;
                }
            }
            return 0;
        }

        /// <summary>
        /// Invoked from the unit tests, this finds the dependency chain for a single type
        /// by invoking its type initializer.
        /// </summary>
        public Dependency[] FindDependencies(string name)
        {
            dependencies = new List<Dependency>();
            Type type = typeof(TypeInitializationChecker).Assembly.GetType(name, true);
            RuntimeHelpers.RunClassConstructor(type.TypeHandle);
            return dependencies.ToArray();
        }

        /// <summary>
        /// A simple from/to tuple, which can be marshaled across AppDomains.
        /// </summary>
        internal sealed class Dependency : MarshalByRefObject
        {
            public string From { get; private set; }
            public string To { get; private set; }

            internal Dependency(Type from, Type to)
            {
                From = from.FullName;
                To = to.FullName;
            }
        }
    }

TypeInitializationTest

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Reflection;
    using NUnit.Framework;

    [TestFixture]
    public class TypeInitializationTest
    {
        [Test]
        public void BuildInitializerLoops()
        {
            Assembly assembly = typeof(TypeInitializationChecker).Assembly;
            var dependencies = new List<TypeInitializationChecker.Dependency>();
            // Test each type in a new AppDomain - we want to see what happens where each type is initialized first.
            // Note: Namespace prefix check is present to get this to survive in test runners which
            // inject extra types. (Seen with JetBrains.Profiler.Core.Instrumentation.DataOnStack.)
            foreach (var type in assembly.GetTypes().Where(t => t.FullName.StartsWith("NodaTime")))
            {
                // Note: this won't be enough to load the assembly in all test runners. In particular, it fails in
                // NCrunch at the moment.
                AppDomainSetup setup = new AppDomainSetup { ApplicationBase = AppDomain.CurrentDomain.BaseDirectory };
                AppDomain domain = AppDomain.CreateDomain("InitializationTest" + type.Name, AppDomain.CurrentDomain.Evidence, setup);
                var helper = (TypeInitializationChecker)domain.CreateInstanceAndUnwrap(assembly.FullName,
                    typeof(TypeInitializationChecker).FullName);
                dependencies.AddRange(helper.FindDependencies(type.FullName));
            }
            var lookup = dependencies.ToLookup(d => d.From, d => d.To);
            // This is less efficient than it might be, but I'm aiming for simplicity: starting at each type
            // which has a dependency, can we make a cycle?
            // See Tarjan's Algorithm in Wikipedia for ways this could be made more efficient.
            // http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm
            foreach (var group in lookup)
            {
                Stack<string> path = new Stack<string>();
                CheckForCycles(group.Key, path, lookup);
            }
        }

        private static void CheckForCycles(string next, Stack<string> path, ILookup<string, string> dependencyLookup)
        {
            if (path.Contains(next))
            {
                Assert.Fail("Type initializer cycle: {0}-{1}", string.Join("-", path.Reverse().ToArray()), next);
            }
            path.Push(next);
            foreach (var candidate in dependencyLookup[next].Distinct())
            {
                CheckForCycles(candidate, path, dependencyLookup);
            }
            path.Pop();
        }
    }

Source: https://habr.com/ru/post/143936/
