Multithreading is one of the most difficult topics in programming, and it is a constant source of problems. Without a clear understanding of the internal mechanisms, it is very difficult to predict the behavior of an application that uses multiple threads. We will not duplicate the mass of theoretical information that is widely available online and in good books. Instead, we will focus on the specific and most important issues that deserve special attention and should be kept in mind during development.
Threads
As everyone probably knows, a thread in the .NET Framework is represented by the Thread class. Developers can create new threads, give them meaningful names, change their priority, start them, wait for them to finish, or abort them.
Threads are divided into background and foreground threads. The main difference between them is that foreground threads keep the program alive: as soon as all foreground threads have stopped, the system automatically stops all background threads and terminates the application. To determine whether a thread is a background thread, check the following property of the current thread:
Thread.CurrentThread.IsBackground
By default, a thread created with the Thread class is a foreground thread. To turn it into a background thread, set its IsBackground property to true.
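A minimal sketch of both points (the thread body here is purely illustrative):

```csharp
using System;
using System.Threading;

class BackgroundThreadDemo
{
    static void Main()
    {
        var worker = new Thread(() =>
        {
            Thread.Sleep(5000);        // simulate long-running work
            Console.WriteLine("done"); // may never be printed
        });

        // A thread is foreground by default; make it background so it
        // does not keep the process alive.
        worker.IsBackground = true;
        worker.Start();

        Console.WriteLine(worker.IsBackground); // True
        // The main (foreground) thread exits here, so the runtime stops
        // the background thread and the process terminates immediately.
    }
}
```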
In applications that have a user interface (UI), there is always at least one main (GUI) thread that is responsible for the state of the interface components. It is important to know that only this so-called “UI thread”, which is usually created in a single instance per application (although not always), may change the state of the view.
It is also worth mentioning exceptions that occur in child threads. In such a situation the application terminates abruptly with an Unhandled Exception, even if we wrap the thread startup code in a try / catch block. Error handling must therefore be placed inside the code of the child thread itself, where it can react to the specific exception.
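A sketch of both situations, assuming the .NET Framework behavior described above (names are illustrative):

```csharp
using System;
using System.Threading;

class ChildThreadExceptions
{
    static void Main()
    {
        // This try/catch does NOT protect against exceptions thrown inside
        // the child thread; if the line below is uncommented, the process
        // dies with an Unhandled Exception despite the catch block.
        try
        {
            // new Thread(() => throw new InvalidOperationException("boom")).Start();
        }
        catch (InvalidOperationException) { /* never reached */ }

        // Correct approach: handle the error inside the child thread itself.
        var worker = new Thread(() =>
        {
            try
            {
                throw new InvalidOperationException("boom");
            }
            catch (InvalidOperationException ex)
            {
                Console.WriteLine("Handled inside the thread: " + ex.Message);
            }
        });
        worker.Start();
        worker.Join();
    }
}
```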
When using global exception handling (Application_Error in ASP.NET, Application.DispatcherUnhandledException in WPF, Application.ThreadException in WinForms, etc.), it is important to remember that this approach only lets us “catch” exceptions that occurred on the UI thread; exceptions from additional background threads are not caught. We can also subscribe to AppDomain.CurrentDomain.UnhandledException and hook into the handling of all unhandled exceptions within the application domain, but even there we cannot prevent the application from terminating.
Threads are expensive objects: they take up memory, can hold various system resources, and pass through different states. Creating them takes time. Compared to processes they are less resource-intensive, but still carry fairly large creation and destruction costs. Moreover, the developer is responsible for releasing the resources held by a specific thread. For example, it is not efficient to start multiple threads to perform a mass of small tasks, since the cost of starting them may exceed the benefit. To reuse already running threads and avoid the creation costs, the so-called thread pool (ThreadPool) was introduced.
ThreadPool
Within each process, the CLR creates one additional abstraction called the thread pool. It is a set of threads that sit in a waiting state, ready to perform any useful work. When the application starts, the pool launches a minimum number of threads that wait for new tasks. If there are not enough active threads in the pool to perform tasks efficiently, it launches new ones and reuses them according to the same principle. The pool is quite smart: it determines the effective number of threads required, stopping unnecessary threads or launching additional ones. You can set the maximum and minimum number of threads, but in practice this is rarely done.
The threads inside the pool are divided into two groups: worker threads and I/O threads. Worker threads are meant for CPU-bound work, while I/O threads are meant for working with I/O devices: the file system, a network card, and others. Performing an I/O operation on a worker (CPU-bound) thread wastes resources, since the thread will simply sit waiting for the I/O operation to complete; separate I/O threads are intended for such tasks. When using the thread pool, all of this is hidden from the developer. You can get the number of available threads of each kind in the pool using the code:
ThreadPool.GetAvailableThreads(out int workerThreads, out int completionPortThreads);
In order to determine whether the current thread was taken from the pool or created manually, use the following construction:
Thread.CurrentThread.IsThreadPoolThread
You can run a task on a pool thread using:
- ThreadPool directly: ThreadPool.QueueUserWorkItem
- asynchronous delegates (the BeginInvoke() / EndInvoke() method pair)
- the BackgroundWorker class
- TPL (Task Parallel Library, which we'll talk about later)
A number of other framework constructs also use the thread pool implicitly, which is important to know and remember. It is useful to keep in mind the following points (a short sketch follows the list):
- Threads from the pool cannot be assigned a name.
- Threads from the pool are always background threads.
- Blocking pool threads can lead to the launch of additional threads and to performance degradation.
- You can change the priority of a pool thread, but it will be reset to the default value (Normal) after the thread returns to the pool.
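The sketch below ties these points together; the final delay is only there to keep the process alive long enough for the pool thread to run:

```csharp
using System;
using System.Threading;

class ThreadPoolDemo
{
    static void Main()
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            // Both properties are true for a pool thread.
            Console.WriteLine(Thread.CurrentThread.IsThreadPoolThread); // True
            Console.WriteLine(Thread.CurrentThread.IsBackground);       // True
        });

        ThreadPool.GetAvailableThreads(out int workerThreads, out int completionPortThreads);
        Console.WriteLine($"Workers: {workerThreads}, I/O: {completionPortThreads}");

        Thread.Sleep(500); // give the pool thread time to run before the process exits
    }
}
```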
Synchronization
When building a multi-threaded application, it is necessary to ensure that any piece of shared data is protected from being changed by multiple threads at once. Given that the managed heap is one of the resources shared by threads, and all threads in an AppDomain have concurrent access to the application's shared data, it is obvious that access to such shared data must be synchronized. This ensures that only one thread (or a specified number of threads, in the case of a Semaphore) gets access to a specific block of code at a time. Thus we can guarantee the integrity of the data, as well as its consistency at any moment. Let's look at the possible synchronization options and frequent problems. Speaking of synchronization, there are usually four types:
- Blocking calling code
- Constructions restricting access to pieces of code
- Signal Constructions
- Non-blocking synchronization
Blocking
Blocking means that one thread waits for another to complete, or simply waits for some period of time. It is usually implemented using the Thread class methods Sleep() and Join(), the EndInvoke() method of asynchronous delegates, or tasks (Task) and their waiting mechanisms. The following constructs are examples of poor waiting approaches:
```csharp
while (!proceed);
while (DateTime.Now < nextStartTime);
```
Such constructs consume a lot of processor resources without doing any useful work. At the same time, the OS and the CLR think that our thread is busy performing important calculations and allocate the necessary resources to it. This approach should always be avoided.
A similar example would be the following construction:
```csharp
while (!proceed) Thread.Sleep(10);
```
Here the calling thread periodically falls asleep for a short time, which is nevertheless sufficient for the system to switch contexts and perform other tasks in parallel. This approach is much better than the previous one, but still not perfect. The main problem arises when the proceed flag has to be changed from different threads. Such a construction is an acceptable solution only if we expect the loop condition to be satisfied very quickly, with a small number of iterations. If there are many iterations, the system will constantly have to switch this thread's context and spend additional resources on it.
Locking
Exclusive locking is used to ensure that only one thread at a time executes a particular piece of code. This is necessary to keep the data consistent at all times. The .NET Framework offers quite a few mechanisms for locking access to parts of the code, but we will consider only the most popular ones, and along the way analyze the most frequent errors associated with their use.
The most popular locking mechanisms are Mutex, Semaphore, and the lock construct. With a Mutex, inter-process locking can be implemented (not just locking between several threads of the same process). A Semaphore differs from a Mutex in that it lets you specify the number of threads or processes that can get simultaneous access to a particular section of code. The lock construct, which compiles down to a pair of method calls, Monitor.Enter() and Monitor.Exit(), is used very often, so we will consider possible problems and recommendations for its use.
Static members of classes, which developers often operate on, are not thread safe by themselves, and access to such data must be synchronized. The only exception is the static constructor, since the CLR blocks all calls from other threads to the static members of the class until the static constructor completes its work.
When using locking with the lock keyword, keep in mind the following rules (a combined sketch follows the list):
- Avoid locking on types:
lock(typeof(object)) {…}
Each type object is stored in a single instance within a domain, so this approach may lead to deadlocks and should be avoided.
- Avoid locking on this:
lock(this) {…}
This approach can also lead to a deadlock.
- Use a dedicated field of the class as the synchronization object:
lock(this.lockObject) {…}
- Use the Monitor.TryEnter(this.lockObject, 3000) construction when you are in doubt and the thread might be blocked for too long. It lets you give up the lock attempt after the specified time interval.
- Use the Interlocked class for atomic operations instead of constructions like:
lock (this.lockObject) { this.counter++; }
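A sketch combining these rules (the Account class and its members are purely illustrative):

```csharp
using System.Threading;

class Account
{
    // A dedicated private object used only for locking.
    private readonly object lockObject = new object();
    private decimal balance;

    public void Deposit(decimal amount)
    {
        lock (this.lockObject)
        {
            this.balance += amount;
        }
    }

    public bool TryWithdraw(decimal amount)
    {
        // If there is a risk of waiting too long, bound the wait.
        if (!Monitor.TryEnter(this.lockObject, 3000))
        {
            return false; // could not take the lock within 3 seconds
        }
        try
        {
            if (this.balance < amount) return false;
            this.balance -= amount;
            return true;
        }
        finally
        {
            Monitor.Exit(this.lockObject);
        }
    }
}
```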
Signaling
This mechanism allows a thread to pause and wait until it receives a notification from another thread that it may continue working.
The most common signaling constructs are AutoResetEvent, ManualResetEvent(Slim), CountdownEvent, and the Monitor.Wait()/Pulse() pair. Using this approach is often more effective than the previous ones.
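For example, instead of polling a flag, a waiting thread can be woken up with ManualResetEventSlim (a minimal sketch):

```csharp
using System;
using System.Threading;

class SignalingDemo
{
    // The event starts in the non-signaled state.
    private static readonly ManualResetEventSlim ready = new ManualResetEventSlim(false);

    static void Main()
    {
        var worker = new Thread(() =>
        {
            Console.WriteLine("Waiting for a signal...");
            ready.Wait();               // blocks without burning CPU
            Console.WriteLine("Got the signal, continuing.");
        });
        worker.Start();

        Thread.Sleep(1000);             // simulate some preparation work
        ready.Set();                    // wake the waiting thread
        worker.Join();
    }
}
```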
Non-blocking synchronization
In addition to the mechanisms listed above, the .NET Framework provides constructs that can perform simple operations without blocking, stopping, or waiting for other threads. Due to the absence of locks and context switching, such code runs faster, but it is very easy to make a mistake, and that is fraught with hard-to-diagnose problems. Ultimately, your code may become even slower than if you had applied the common approach using lock. One variant of such synchronization is the use of so-called memory barriers (Thread.MemoryBarrier()), which prevent certain optimizations, the caching of values in CPU registers, and the reordering of program instructions.
Another approach is the volatile keyword, which marks the required class fields. It causes the compiler to generate memory barriers on every read and write of a variable marked volatile. This approach works well when you have a single thread, or when some threads only read while others only write. If you need to read and modify the value as a single operation, you should use the lock operator.
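Applied to the proceed flag from the earlier example, a minimal sketch:

```csharp
using System;
using System.Threading;

class VolatileFlagDemo
{
    // One thread writes the flag, another only reads it; volatile
    // guarantees the reader sees the new value without a lock.
    private static volatile bool proceed;

    static void Main()
    {
        var reader = new Thread(() =>
        {
            while (!proceed) Thread.Sleep(10);
            Console.WriteLine("Flag observed, exiting the loop.");
        });
        reader.Start();

        Thread.Sleep(500);
        proceed = true; // the write is immediately visible to the reader
        reader.Join();
    }
}
```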
Both of these facilities are quite difficult to understand; they require solid knowledge of memory models and optimizations at different levels, and are therefore used quite rarely. They should be applied very carefully and only when you understand what you are doing and why.
The simplest and recommended approach for atomic operations is the Interlocked class, which was mentioned above. Behind the scenes it also generates memory barriers, and we do not need to worry about additional locks. The class has quite a few methods for atomic operations such as increment, decrement, exchange, compare-and-exchange, and so on.
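A minimal sketch of atomic operations with Interlocked:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class InterlockedDemo
{
    private static int counter;

    static void Main()
    {
        // Ten tasks increment the counter 1000 times each, atomically
        // and without any explicit lock.
        var tasks = new Task[10];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                for (int j = 0; j < 1000; j++)
                {
                    Interlocked.Increment(ref counter);
                }
            });
        }
        Task.WaitAll(tasks);

        Console.WriteLine(counter); // always 10000

        // Compare-and-swap: reset counter to 0 only if it is currently 10000.
        Interlocked.CompareExchange(ref counter, 0, 10000);
    }
}
```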
Collections
It is useful to know that the System.Collections.Concurrent namespace defines quite a few thread-safe collections for different tasks. The most common are:
- BlockingCollection<T>
- ConcurrentBag<T>
- ConcurrentDictionary<TKey, TValue>
- ConcurrentQueue<T>
- ConcurrentStack<T>
In most cases, there is no point in implementing such a collection yourself; it is much easier and wiser to use the ready-made, tested classes.
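For example, counting words from several threads at once with ConcurrentDictionary (a minimal sketch):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentCollectionsDemo
{
    static void Main()
    {
        var wordCounts = new ConcurrentDictionary<string, int>();
        var words = new[] { "a", "b", "a", "c", "b", "a" };

        // Multiple threads can safely update the same dictionary.
        Parallel.ForEach(words, word =>
        {
            wordCounts.AddOrUpdate(word, 1, (key, count) => count + 1);
        });

        foreach (var pair in wordCounts)
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}"); // a: 3, b: 2, c: 1
        }
    }
}
```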
Asynchrony
Separately, I would like to highlight so-called asynchrony, which on the one hand is directly related to launching additional threads, and on the other raises additional questions and theory that should also be discussed.
Let us demonstrate the difference between the synchronous and asynchronous approaches with a simple example.
Suppose you want to eat pizza at the office and you have two options:
The first, synchronous option: walk to the pizzeria, choose the pizza you want, place an order, wait until it is ready, then carry it back to the office or have lunch right at the pizzeria, after which you return and continue working. While walking and waiting for the order, you are in a waiting state and cannot do other useful work (for simplicity, assume here that it is the office work that earns money and that it cannot be done away from the workplace).
The second, asynchronous option: order the pizza by phone. After placing the order you are not blocked and can do useful work at your desk while the order is being prepared and delivered to the office.
Evolution
As the .NET Framework developed, many new approaches for running asynchronous operations appeared. The first solution was a pattern called APM (Asynchronous Programming Model). It is based on pairs of methods named BeginOperationName and EndOperationName, which respectively begin and end an asynchronous operation OperationName. After calling the BeginOperationName method, the application can continue executing instructions on the calling thread while the asynchronous operation runs on another. For each call to BeginOperationName, the application must also call EndOperationName to get the results of the operation.
This approach can be found in a variety of technologies and classes, but it is fraught with complex and redundant code.
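A sketch of APM using FileStream.BeginRead()/EndRead(); the file name is illustrative:

```csharp
using System;
using System.IO;
using System.Text;

class ApmDemo
{
    static void Main()
    {
        // "data.txt" is an illustrative file name.
        var stream = new FileStream("data.txt", FileMode.Open,
            FileAccess.Read, FileShare.Read, 4096, useAsync: true);
        var buffer = new byte[1024];

        // BeginRead starts the asynchronous operation...
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            var fs = (FileStream)ar.AsyncState;
            // ...and EndRead must be called to obtain the result.
            int bytesRead = fs.EndRead(ar);
            Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, bytesRead));
            fs.Dispose();
        }, stream);

        Console.WriteLine("The calling thread keeps working...");
        Console.ReadLine(); // keep the process alive until the callback fires
    }
}
```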
In version 2.0, a new model called the Event-based Asynchronous Pattern (EAP) was introduced. A class supporting the event-based asynchronous model contains one or more MethodNameAsync methods, possibly mirroring synchronous versions that perform the same action on the current thread. The class may also contain a MethodNameCompleted event and a MethodNameAsyncCancel (or simply CancelAsync) method for cancelling the operation. This approach is common when working with services: in Silverlight it is used to access the server side, and Ajax is essentially an implementation of this approach. Beware of long chains of related event callbacks, where the completion of one long-running operation triggers the next, then the next, and so on; this is fraught with deadlocks and unexpected results. The exception and the result of an asynchronous operation are available only in the event handler, through the corresponding parameter properties: Error and Result.
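A classic EAP example with WebClient (the URL is illustrative):

```csharp
using System;
using System.Net;

class EapDemo
{
    static void Main()
    {
        var client = new WebClient();

        // Subscribe to the MethodNameCompleted event first...
        client.DownloadStringCompleted += (sender, e) =>
        {
            // Errors and results are only available via the event args.
            if (e.Error != null)
                Console.WriteLine($"Failed: {e.Error.Message}");
            else
                Console.WriteLine($"Downloaded {e.Result.Length} characters");
        };

        // ...then start the asynchronous operation.
        client.DownloadStringAsync(new Uri("https://example.com"));

        Console.WriteLine("The calling thread is not blocked...");
        Console.ReadLine();
    }
}
```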
In .NET Framework 4.0, a more advanced model called the Task-based Asynchronous Pattern (TAP) was introduced, which is task-based. TPL and PLINQ are also built on tasks, but we'll talk about them in detail next time. This implementation of the asynchronous model is based on the Task and Task<TResult> classes in the System.Threading.Tasks namespace, which are used to represent arbitrary asynchronous operations. TAP is the recommended asynchronous pattern for developing new components. It is very important to understand the difference between a thread (Thread) and a task (Task):
A Thread is an encapsulation of a thread of execution, while a Task is a job (or simply an asynchronous operation) that can be executed in parallel. To execute a task, a free thread is taken from the thread pool; upon completion, the thread is returned to the pool, and the consumer receives the result of the task. If you need to start a lengthy operation and do not want to block one of the pool threads for a long time, you can do so using the TaskCreationOptions.LongRunning parameter. Tasks can be created and run in different ways, and it is often unclear which one to choose; the difference is mostly in convenience and in the number of settings available with each approach.
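A sketch of both variants:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class TaskDemo
{
    static void Main()
    {
        // The usual way: a short task on a thread-pool thread.
        Task<int> sum = Task.Run(() => 2 + 2);
        Console.WriteLine(sum.Result); // 4

        // A lengthy operation: hint the scheduler to use a dedicated
        // thread instead of occupying a pool thread for a long time.
        Task longRunning = Task.Factory.StartNew(() =>
        {
            Thread.Sleep(5000); // simulate long work
        }, TaskCreationOptions.LongRunning);

        longRunning.Wait();
    }
}
```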
In the latest versions of the framework, new features based on the same tasks simplify writing asynchronous code and make it more readable and understandable. For this purpose the new keywords async and await were introduced, which mark asynchronous methods and their invocations. Asynchronous code becomes very similar to synchronous code: we simply call the desired operation, and all the code that follows the call is automatically wrapped in a kind of callback invoked after the asynchronous operation completes. This approach also allows us to handle exceptions in a synchronous manner, to wait explicitly for the completion of an operation, and to define actions to be performed under particular conditions. For example, we can add code that will execute only if an exception was thrown in the asynchronous operation. But not everything is as simple as it looks, despite the mass of information on this topic.
async / await
Let's consider the basic guidelines for using these keywords, as well as some interesting examples. Most often it is recommended to use asynchrony “all the way down”: use only one approach within a particular call chain or function block, and do not mix synchronous calls with asynchronous ones. A classic example of this problem:
```csharp
public static class DeadlockDemo
{
    private static async Task DelayAsync()
    {
        await Task.Delay(1000);
    }

    public static void Test()
    {
        var delayTask = DelayAsync();
        delayTask.Wait();
    }
}
```
This code works fine in a console application, but calling DeadlockDemo.Test() from a GUI thread causes a deadlock. This is related to how await handles contexts. By default, when an incomplete Task is awaited, the current context is captured and used to resume the method when the task completes. The context is the current SynchronizationContext, unless it is null, as is the case in console applications, where it is the current TaskScheduler (the thread pool context). GUI and ASP.NET applications have a SynchronizationContext that allows only one piece of code to execute at a time. When the await expression completes, it tries to execute the remainder of the async method within the captured context. But that context already has a thread that is (synchronously) waiting for the async method to complete. Each of them waits for the other, causing a deadlock.
It is also recommended to avoid constructions of the form async void (an asynchronous method that returns nothing). Async methods can return Task, Task<TResult>, or void. The last option was left for backward compatibility and makes it possible to write asynchronous event handlers. But it is worth remembering some specific peculiarities of such methods, namely (a sketch follows the list):
- Exceptions cannot be caught by standard means.
- Such methods cannot be composed or awaited, since, unlike methods returning a Task, they give the caller nothing to wait on.
- It is difficult to determine when such a method has completed.
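A sketch of the first two problems (the method name is illustrative):

```csharp
using System;
using System.Threading.Tasks;

class AsyncVoidDemo
{
    // async void: nothing is returned to the caller to await or compose.
    private static async void ThrowAsync()
    {
        await Task.Delay(100);
        throw new InvalidOperationException("boom");
    }

    static void Main()
    {
        try
        {
            ThrowAsync(); // returns immediately; no Task to observe
        }
        catch (InvalidOperationException)
        {
            // Never reached: the exception does not propagate to the caller;
            // it escapes to the synchronization context (or the thread pool)
            // and may crash the process.
        }

        Console.ReadLine(); // without a Task we cannot even tell when ThrowAsync finished
    }
}
```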
Also, try to configure the context whenever possible. As already mentioned, the code inside an asynchronous method after an await will require the synchronization context in which it was called. This is a very useful feature, especially in GUI applications, but sometimes it is unnecessary, for example when the code does not need to access user interface elements. The previous deadlock example can easily be corrected by changing just one line:
await Task.Delay(1000).ConfigureAwait(false);
This recommendation is very relevant when developing any libraries that do not know anything about the GUI.
Consider a few more examples of the use of new keywords, as well as some features of their use:
1)
```csharp
private static async Task Test()
{
    Thread.Sleep(1000);
    Console.Write("work");
    await Task.Delay(1000);
}

private static void Demo()
{
    var child = Test();
    Console.Write("started");
    child.Wait();
    Console.Write("finished");
}
```
“work” will appear on the screen first, then “started”, and only then “finished”. At first glance it seems that “started” should be displayed first. Also remember that this code has the deadlock problem we considered above. The behavior is explained by the fact that a method marked with the async keyword does not start any additional threads and executes synchronously until it encounters the await keyword; only then is a Task object created and the deferred work scheduled. To correct the behavior in the example above, it is enough to replace the Thread.Sleep(...) line with await Task.Delay(...).
2)
```csharp
async Task Demo()
{
    Console.WriteLine("Before");
    Task.Delay(1000);
    Console.WriteLine("After");
}
```
We might expect a one-second pause before the second line is printed, but that is not so: both messages are displayed without delay. This is because the Task.Delay() method, like many other asynchronous methods, returns an object of type Task, but here we ignore that task. We do not await its completion in any of the possible ways, so both messages appear immediately.
3)
Console.WriteLine("Before"); await Task.Factory.StartNew(async () => { await Task.Delay(1000); }); Console.WriteLine("After");
As in the previous example, the output will not be paused for one second. This is because the StartNew() method accepts a delegate and returns a Task<TResult>, where TResult is the type returned by the delegate. In the example our delegate returns a Task, so we get a Task<Task>. The await “awaits” only the completion of the outer task, which immediately returns the inner Task created in the delegate, and that inner task is then ignored. You can fix this problem by rewriting the code as follows:
```csharp
await Task.Run(async () =>
{
    await Task.Delay(1000);
});
```
4)
```csharp
async Task TestAsync()
{
    await Task.Delay(1000);
}

void Handler()
{
    TestAsync().Wait();
}
```
Despite the use of the keywords, this code is not asynchronous and runs synchronously: we create a task and then explicitly block, waiting for its completion. The calling thread is blocked until the running task finishes.
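A genuinely asynchronous variant would await the task instead of blocking on it (HandlerAsync is an illustrative name):

```csharp
// The calling thread is released at the await instead of being
// blocked by Wait().
async Task HandlerAsync()
{
    await TestAsync();
}
```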
Conclusion
As you can see, developers have quite a few options for working with multi-threaded applications. It is important not only to know the theory, but also to be able to apply effective approaches to specific problems. For example, direct use of the Thread class almost always indicates legacy code in the project; the likelihood that you actually need it is very small. In normal situations, using the pool is justified, for obvious reasons.
Using multithreading in applications with a GUI usually entails additional restrictions; do not forget about them!
It is also worth remembering other ready-made implementations, such as the thread-safe collections: they spare you from writing additional code and prevent possible implementation errors. And do not forget the peculiarities of the new keywords.