
This book will teach you how to get the most performance out of managed code, ideally without sacrificing any of the benefits of the .NET environment, and at worst sacrificing as few of them as possible. You will learn sound programming techniques, learn what to avoid and, perhaps most importantly, how to use freely available tools to measure performance without much difficulty. The material contains a minimum of filler: the book tells you exactly what you need to know, and it is relevant and concise, with nothing superfluous. Most chapters begin with general information and background, followed by specific, recipe-like advice, and end with a section on step-by-step measurement and debugging for a variety of scenarios.
Along the way, Ben Watson dives into specific components of the .NET environment, in particular the Common Language Runtime (CLR) that underpins it, and shows how your machine's memory is managed, how code is generated, how multithreaded execution works, and much more. You will see how the .NET architecture both constrains your software and gives it additional capabilities, and how your choice of programming approaches can significantly affect the overall performance of an application. As a bonus, the author shares stories from nine years of experience building very large, complex, high-performance .NET systems at Microsoft.
Excerpt. Choose the appropriate thread pool size
Over time, the thread pool tunes itself, but at the very beginning it has no history and starts in its initial state. If your software product is extremely asynchronous and uses the CPU heavily, it can suffer from prohibitively high initial startup costs while it waits for more threads to be created and become available. Tuning the startup parameters helps you reach a steady state faster, so that from the moment the application launches you have a certain number of ready threads at your disposal:
// Start the pool with at least 25 worker and 25 I/O threads
// instead of letting it grow them gradually on demand.
const int MinWorkerThreads = 25;
const int MinIoThreads = 25;
ThreadPool.SetMinThreads(MinWorkerThreads, MinIoThreads);
Proceed with caution here. When Task objects are used, they are dispatched based on the number of available threads. If there are too many threads, Task objects can be over-dispatched, which at a minimum reduces CPU efficiency because of more frequent context switching. If the workload turns out to be lighter than expected, the thread pool's tuning algorithm may reduce the number of threads, possibly below the value you specified.
You can also set the maximum number of threads with the SetMaxThreads method, but that technique carries similar risks.
To find out how many threads you actually need, leave these settings alone and analyze your application in a steady state, using the ThreadPool.GetMaxThreads and ThreadPool.GetMinThreads methods or the performance counters that show how many threads the process is using.
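For illustration, here is one way to observe those numbers at runtime (a minimal sketch; the output format is arbitrary):

// Query the configured bounds and the remaining headroom.
ThreadPool.GetMinThreads(out int minWorker, out int minIo);
ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
ThreadPool.GetAvailableThreads(out int availWorker, out int availIo);

// Threads currently in use = maximum minus available.
Console.WriteLine($"Min: {minWorker}/{minIo}, " +
                  $"busy workers: {maxWorker - availWorker}, " +
                  $"busy I/O: {maxIo - availIo}");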
Do not abort threads
Terminating a thread without coordinating with the other threads is a rather dangerous procedure. Threads must be able to clean up after themselves, and calling Abort on a thread does not let it shut down without negative consequences. When a thread is destroyed, parts of the application are left in an undefined state. It would be better for the program to exit abnormally at that point; ideally you want a clean restart.
To shut down a thread safely, use some kind of shared state, and have the thread's own processing function check that state to determine when it should stop. Safety is achieved through cooperation.
In general, you should always be using Task objects anyway, and no API is provided for aborting a Task. To terminate work in an orderly way, use a CancellationToken, as noted earlier.
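As a sketch of that cooperative pattern (DoUnitOfWork is a hypothetical placeholder for the real work):

var cts = new CancellationTokenSource();
CancellationToken token = cts.Token;

Task worker = Task.Run(() =>
{
    // The loop checks the shared token and exits cleanly when asked to.
    while (!token.IsCancellationRequested)
    {
        DoUnitOfWork(); // hypothetical placeholder for the actual processing
    }
});

cts.Cancel();   // request a graceful stop...
worker.Wait();  // ...and wait for the task to finish on its own terms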
Do not change thread priorities
In general, changing thread priorities is an extremely bad idea. In Windows, threads are scheduled according to their priority levels. If high-priority threads are always ready to run, low-priority threads will be starved and rarely get a chance to execute. By raising a thread's priority you are saying that its work must take precedence over everything else, including other processes. That is unsafe for a stable system.
Lowering a thread's priority is more acceptable if the thread is running something that can wait until tasks of normal priority have finished. One compelling reason to lower a thread's priority is discovering a runaway thread stuck in an infinite loop. There is no safe way to stop a thread, so the only way to reclaim that thread and its processor resources is to restart the process. Until it becomes possible to shut the thread down cleanly, lowering the runaway thread's priority is a reasonable way to minimize the damage. Note that even lower-priority threads are still guaranteed to run eventually: the longer they are starved, the higher the dynamic priority Windows assigns them. The exception is the idle priority THREAD_PRIORITY_IDLE, at which the operating system schedules the thread only when it literally has nothing else to run.
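As a minimal sketch, assuming you hold a reference to the misbehaving thread (runawayThread is hypothetical here; note that the managed ThreadPriority enum bottoms out at Lowest, and THREAD_PRIORITY_IDLE itself is only reachable through the Win32 API):

// Demote the runaway thread so it yields to all normal-priority work.
runawayThread.Priority = ThreadPriority.Lowest;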
There can be well-justified reasons to raise a thread's priority, for example the need to react quickly to rare situations, but such techniques should be used very cautiously. Thread scheduling in Windows is done independently of the processes the threads belong to, so a high-priority thread from your process will run at the expense not only of your other threads, but of all threads from other applications running on the system.
If a thread pool is used, any priority changes are reset each time a thread returns to the pool. If you use the Task Parallel Library while still managing the underlying threads, keep in mind that several tasks can run on the same thread before it returns to the pool.
Thread synchronization and locks
As soon as multiple threads are involved, they need to be synchronized. Synchronization means allowing only one thread at a time to access shared state, for example a class field. Threads are usually synchronized with synchronization objects such as Monitor, Semaphore, ManualResetEvent and so on. These are sometimes informally called locks, and the act of synchronizing in a particular thread is called locking.
One of the fundamental truths about locks is this: they never improve performance. At best, with a well-implemented synchronization primitive and no contention, a lock can be neutral. A lock stops other threads from doing useful work, wastes CPU time, increases context-switching overhead and causes other negative consequences. This has to be tolerated because correctness is far more important than raw performance. It does not matter how quickly you compute the wrong result!
Before tackling how to use locks, let us consider the most fundamental principles.
Do I need to worry about performance at all?
First, justify the need for the improvement. This brings us back to the principles discussed in Chapter 1. Performance is not equally important for all of your application's code, and not all code has to be optimized to the nth degree. As a rule, it starts with the "inner loop", the code executed most often or most critical to performance, and spreads outward in all directions until the cost exceeds the benefit. Many areas of the code matter far less for performance; in those, if you need a lock, take it without agonizing.
That said, be prudent. If your non-critical code runs on a thread from the thread pool and you block that thread for a long time, the pool may start injecting more threads to cope with other requests. If one or two threads do this occasionally, it is no big deal. But if many threads do such things, a problem can arise, because resources that should be doing real work are wasted. Careless blocking in a program under significant constant load can hurt the whole system, even from the parts where high performance does not matter, through unnecessary context switches or unwarranted thread pool growth. As in all other cases, measure to assess the situation.
Do you really need a lock?
The most efficient lock is the one that does not exist. If you can eliminate the need for thread synchronization altogether, that is the best route to high performance. This is an ideal that is not so easy to achieve. It usually means making sure there is no mutable shared state: every request passing through your application can be processed independently of any other request and of any central, mutable (read-write) data. That scenario is the best opportunity for high performance.
Still, be careful. With restructuring it is easy to overdo things and turn the code into an incomprehensible mess that no one, including you, can figure out. Do not go too far unless high performance is truly critical and cannot be achieved otherwise. Make the code asynchronous and independent, but keep it understandable.
If multiple threads merely read a variable (and no thread ever writes to it), no synchronization is needed; all threads can have unrestricted access. This applies automatically to immutable objects such as strings and values of immutable types, but it can apply to any object whose value is guaranteed not to change while multiple threads are reading it.
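For example, a lookup table that is fully built before the threads start (the data here is made up) can be read concurrently with no lock at all:

// Built once, never modified afterwards: safe for unlimited concurrent readers.
IReadOnlyDictionary<string, int> lookup =
    new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 };

Parallel.For(0, 100, i =>
{
    int value = lookup["a"]; // read-only access needs no synchronization
});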
If multiple threads do write to a shared variable, see whether you can eliminate the synchronized access by switching to a local variable. If you can create a temporary copy to work on, the need for synchronization disappears. This is especially important for repeated synchronized access: convert repeated access to a shared variable into repeated access to a local variable followed by a single access to the shared variable, as in the following simple example of multiple threads adding items to a shared collection.
object syncObj = new object();
var masterList = new List<long>();
const int NumTasks = 8;
Task[] tasks = new Task[NumTasks];
for (int i = 0; i < NumTasks; i++)
{
    tasks[i] = Task.Run(() =>
    {
        for (int j = 0; j < 5000000; j++)
        {
            lock (syncObj)
            {
                masterList.Add(j);
            }
        }
    });
}
Task.WaitAll(tasks);
This code can be converted as follows:
object syncObj = new object();
var masterList = new List<long>();
const int NumTasks = 8;
Task[] tasks = new Task[NumTasks];
for (int i = 0; i < NumTasks; i++)
{
    tasks[i] = Task.Run(() =>
    {
        var localList = new List<long>();
        for (int j = 0; j < 5000000; j++)
        {
            localList.Add(j);
        }
        lock (syncObj)
        {
            masterList.AddRange(localList);
        }
    });
}
Task.WaitAll(tasks);
On my machine, the second version of the code runs more than twice as fast as the first.
Ultimately, mutable shared state is the principal enemy of performance. It requires synchronization to keep the data safe, which degrades performance. If your design has even the slightest opportunity to avoid locking, you are close to an ideal multithreaded system.
Preferred order of synchronization mechanisms
When deciding whether any kind of synchronization is needed, understand that the mechanisms do not all have the same performance or behavior characteristics. In most situations you just need to use a lock, and that should usually be your starting option. Using anything other than a lock requires intensive measurement to justify the added complexity. In general, consider the synchronization mechanisms in the following order.
1. The lock statement / the Monitor class: keeps the code simple and understandable and provides a good balance of performance.
2. No synchronization at all: eliminate shared mutable state, restructure and optimize. This is harder, but when it works it generally performs better than a lock (barring mistakes or a degraded architecture).
3. Simple Interlocked methods: in some scenarios these may be more appropriate, but as soon as the situation starts getting complicated, move to a lock.
And finally, if you can really prove their benefit, use more sophisticated, complex locks (keep in mind: they are rarely as useful as you expect):
- asynchronous locks (discussed later in this chapter);
- everything else.
Specific circumstances may dictate or preclude the use of some of these techniques. For example, combining several Interlocked methods is unlikely to outperform a single lock statement.
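A rough illustration of that trade-off (a sketch; the class and field names are made up): a single independent update is a natural fit for Interlocked, but once two fields must change together, one lock statement is simpler and usually no slower than a chain of Interlocked calls.

class Stats
{
    private readonly object _syncObj = new object();
    private long _count;
    private long _total;

    // Fine: one independent atomic update, no lock needed.
    public void IncrementCount() => Interlocked.Increment(ref _count);

    // The two fields must stay consistent, so take a single lock
    // rather than trying to combine several Interlocked operations.
    public void Add(long value)
    {
        lock (_syncObj)
        {
            _count++;
            _total += value;
        }
    }
}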
More information about the book is available on the publisher's site: Table of Contents | Excerpt.
For Habr readers, a 25% discount coupon: .NET
When you pay for the paper version of the book, the e-book is sent to your e-mail.