Threads Are the Goto of Parallel Programming

Let me state the idea behind the title right away. Using threads, and the means of manipulating them directly (creation, destruction, synchronization), to write parallel applications has the same detrimental effect on algorithm complexity, code quality, and debugging speed that the Goto operator had in sequential programs.
Just as programmers once abandoned unstructured jumps, we need to abandon the direct use of threads, now and in the future. And just as each of us uses structured blocks instead of Goto, we should use structures built on top of threads instead of the threads themselves. Fortunately, all the tools for this have appeared in quite traditional languages.

Photo: Rainer Zenz

First, a little history and references to discussions that have already taken place.

Goto considered harmful

Probably the most authoritative nail in the coffin of the unfortunate operator was driven by Edsger Dijkstra in his five-page 1968 article “A Case against the GO TO Statement”, also known as “Go To Statement Considered Harmful”.

On Habr, the question of using Goto or banishing it from programs in high-level languages has been raised repeatedly:
habrahabr.ru/post/114211
habrahabr.ru/post/114470
habrahabr.ru/post/114326
Undoubtedly, the existence of Goto is a source of endless holy wars. However, modern “general purpose” languages, roughly starting with Java, do not include Goto in their syntax, at least not in its original form.

Where goto is still alive

I will note one frequently used application of jumping to a label that the discussions above did not mention and that concerns me personally: assembly languages and machine code. Virtually all microprocessor architectures have conditional and unconditional jump instructions. Moreover, I cannot recall an assembly language in which for or while statements are implemented in hardware. As a result, programmers working at this level of abstraction are forced to deal with the whole tangle of non-local jumps. Dijkstra has a remark about this: goto should be abolished from all high-level languages, that is, in his words, from “everything except, perhaps, plain machine code”.

I will omit the well-known arguments against Goto; anyone can find them at the links above. I will state the conclusion right away, as I understand it: using Goto significantly lowers the “high level” of the code, burying the algorithm in the details of a sequential implementation. Let us move on to threads.

What is the problem with threads

To see where to expect problems from threads, turn to the article “The Problem with Threads” by Edward A. Lee. Its author tried to give some formalism (in my opinion, unnecessary) to explain the following fact. Direct use of threads requires analyzing all possible interleavings of the basic operations that make up the individual threads of execution. The number of such combinations grows like an avalanche as the application grows and quickly exceeds the capabilities of both human perception and analysis tools. That is, it is impossible to fully debug such a parallel program, let alone formally prove its correctness.
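To get a feel for the “avalanche”, here is a minimal sketch that counts the interleavings for the simplest case: two threads, each a straight-line sequence of atomic steps with no synchronization. Under that assumption the count is the binomial coefficient C(n + m, n). The function name count_interleavings is made up for this illustration and does not come from Lee's article.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct interleavings of two threads that execute
// n and m atomic steps respectively: the binomial coefficient C(n + m, n).
// Illustrative sketch only; limited to n + m <= 60 so that all
// intermediate values fit into 64 bits.
std::uint64_t count_interleavings(unsigned n, unsigned m) {
    assert(n + m <= 60 && "result would overflow 64 bits in this sketch");
    std::uint64_t result = 1;
    // Multiplicative formula: after iteration i, result == C(m + i, i),
    // so every division below is exact.
    for (unsigned i = 1; i <= n; ++i) {
        result = result * (m + i) / i;
    }
    return result;
}
```

Two threads of only 10 steps each already give count_interleavings(10, 10) == 184756 possible orderings, and real programs have far more than two threads and ten steps.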
Beyond this crucial problem of interleavings, programming directly with threads (for example, with Pthreads) is far from optimal in terms of both programmer productivity and the performance of the resulting application:

Lack of composability. When calling a library function from a thread, one cannot tell without analyzing its code whether it will spawn more parallel threads of execution and thereby exceed the capabilities of the hardware (so-called oversubscription).
Thread parallelism cannot be made optional. It is always present and rigidly wired into the logic of the program, even though in reality two related activities do not always have to run simultaneously; often the decision should be made dynamically, taking into account the current situation and the availability of resources.
Difficulty of load balancing. Even a small difference in the speeds of different threads can significantly degrade the performance of the entire application (“the caravan moves at the speed of the slowest camel”). All the concern about loading the hardware evenly is shifted onto the application programmer, who may not have enough information about the state of the system. And, generally speaking, it is not the programmer's job: the job is to solve an applied problem.
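The first point, oversubscription, can be sketched as follows. Everything here is hypothetical: library_function stands in for some third-party routine whose signature says nothing about threading, and run_application for application code that reasonably starts one thread per core. The counter exists only to make the effect visible.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Counts every thread started anywhere in this sketch.
std::atomic<int> threads_started{0};

// Hypothetical library routine: nothing in its signature tells the caller
// that it starts `inner` worker threads of its own.
void library_function(int inner) {
    std::vector<std::thread> workers;
    for (int i = 0; i < inner; ++i) {
        workers.emplace_back([] { ++threads_started; });
    }
    for (auto& w : workers) w.join();
}

// Hypothetical application code: starts `outer` threads and calls the
// library from each one. Returns the total number of threads started.
int run_application(int outer, int inner) {
    threads_started = 0;
    std::vector<std::thread> app_threads;
    for (int i = 0; i < outer; ++i) {
        app_threads.emplace_back([inner] {
            ++threads_started;
            library_function(inner);
        });
    }
    for (auto& t : app_threads) t.join();
    return threads_started.load();
}
```

On a 4-core machine, run_application(4, 4) starts 4 + 4 * 4 = 20 threads, and all of them may exist at the same time, although neither the application nor the library did anything unreasonable on its own.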

The conclusion almost literally repeats the one drawn a little earlier: using threads significantly lowers the “high level” of the code, burying the algorithm in the details of a parallel implementation. “Manual control” of threads in a program written in a high-level language exposes many details of the underlying hardware that one would rather not see at that level.

What, if not threads?

How can we use the capabilities of multi-core hardware without resorting to threads? Of course, there are various programming languages designed from the outset for writing parallel programs efficiently. There are Erlang and the functional languages. If extreme scalability is needed, the answer should be sought in them and in the mechanisms they offer. But what should programmers do who use more traditional languages, such as C++, and/or work with existing code?

OpenMP: good, but not quite it

For quite a long time, neither C nor C++ (unlike, for example, the “younger” Java) reflected the presence of parallelism in programs in any way; in effect, it was relegated to “third-party” libraries such as Pthreads. OpenMP has long been known for introducing structured fork-join parallelism into these languages, as well as into Fortran. In my opinion, this standard does not solve the thread problems described above; that is, OpenMP is still too low-level. The latest revision of the standard did not raise the level of abstraction, but added opportunities (and difficulties) for those who want to use OpenMP to run code on heterogeneous systems (version 4.0 has been covered in more detail on Habr).
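For readers who have not seen it, a minimal sketch of what structured fork-join looks like in OpenMP follows. The function sum_of_squares is made up for illustration; the pragma and its reduction clause are standard OpenMP. A compiler without OpenMP enabled simply ignores the pragma and runs the loop serially, so the result is the same either way.

```cpp
#include <vector>

// Sums the squares of the input values.
// With OpenMP enabled (e.g. -fopenmp for GCC), a team of threads is forked
// for the loop and joined at its end; each thread accumulates a private
// copy of `total`, and the copies are added together at the join.
long long sum_of_squares(const std::vector<int>& values) {
    long long total = 0;
    const long long count = static_cast<long long>(values.size());
    #pragma omp parallel for reduction(+ : total)
    for (long long i = 0; i < count; ++i) {
        total += static_cast<long long>(values[i]) * values[i];
    }
    return total;
}
```

The threads are no longer created and joined by hand, but the decision of what runs in parallel is still attached to one specific loop in the source, which is the sense in which the approach remains low-level.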

Extensions and Libraries

Between new languages designed from the start to support parallelism and traditional languages that ignore it completely, there are two intermediate options. Extensions try to add the necessary abstractions and fix them in the syntax. Libraries solve the same problems while staying within existing language concepts, such as subroutine calls. In theory, language extensions allow better results than libraries, because with them we break out of the limitations of the original language and effectively create a new one. But such extensions very rarely become popular with a wide audience. Recognition often comes only after the extension is standardized as part of the language.

Many companies, universities, and combinations thereof work on language extensions and libraries, including ones for parallel programming. Intel, of course, has offerings of both kinds, and both have been covered on Habr: Intel Cilk Plus and Intel Threading Building Blocks. In my opinion, Cilk (Plus) is the more interesting of the two as a means of raising the level of abstraction of parallelism. It is good to see it supported in GCC.

C++11

In the latest C++ standards, the parallel nature of modern computing has finally been acknowledged: the possibility that code runs concurrently with something else is taken into account in the description of many language constructs and standard classes. Moreover, the programmer can choose from a range of abstraction levels: direct manipulation of threads through std::thread, an asynchronous call through std::packaged_task, or an asynchronous or lazy call through std::async. Much of the work of making all this machinery operate correctly shifts from third-party libraries to the standard library shipped with the compiler that implements the new standard. One question remains open (at least for me): are there any C++11 implementations that provide all three properties of high-level parallelism (composability, optionality, and load balancing) and thus free the application programmer from these worries?
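The three levels named above can be compared side by side on the same trivial task. This is only a sketch: square is a stand-in for real work, and the via_* function names are invented for the comparison; the standard library calls themselves are plain C++11.

```cpp
#include <future>
#include <thread>
#include <utility>

// Stand-in for some real work (hypothetical).
int square(int x) { return x * x; }

// Level 1: std::thread. Creation, result passing, and joining are manual.
int via_thread(int x) {
    int result = 0;
    std::thread worker([&result, x] { result = square(x); });
    worker.join();  // destroying a still-joinable thread calls std::terminate
    return result;
}

// Level 2: std::packaged_task. The result travels through a future,
// but the caller still decides where and when the task runs.
int via_packaged_task(int x) {
    std::packaged_task<int(int)> task(square);
    std::future<int> result = task.get_future();
    std::thread worker(std::move(task), x);
    worker.join();
    return result.get();
}

// Level 3: std::async. With the default launch policy the implementation
// may run the call on another thread or defer it until get() is called.
int via_async(int x) {
    std::future<int> result = std::async(square, x);
    return result.get();
}
```

All three return the same value, for example 49 for an argument of 7. Only the last one leaves the choice between parallel and deferred execution to the implementation, which is the closest the standard comes to the “optional parallelism” discussed earlier; whether a given implementation uses that freedom well is exactly the open question.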

What else to read

Finally, I want to recommend one book. Its main idea, for me, is that an understanding of the structure of parallel applications must be brought into the design process. Moreover, students should be taught this as early as possible, at about the same time they are told why “goto is bad”.

Michael McCool, Arch Robison, James Reinders. Structured Parallel Programming. 2012. parallelbook.com

Among other things, the book shows solutions to the same problems using several parallel programming libraries and languages: Intel Cilk Plus, OpenMP, Intel TBB, OpenCL, and Intel ArBB. This makes it possible to compare the expressiveness and efficiency of these approaches on a variety of practical problems.

Thanks for your attention!

Source: https://habr.com/ru/post/206030/
