
Let's talk about Intel® Cilk™ Plus

My recent post about OpenMP 4.0 gave me the idea to write about Intel Cilk Plus: the programming model is very interesting and certainly deserves special attention. After all, part of it effectively made its way into the new OpenMP standard, so there were probably good reasons for that.

I'll start with the history of the name itself.
So, how did it all begin? MIT has been developing the Cilk language since 1994, and it made task parallelism easy to implement. It was a true extension of the C language: removing all the Cilk keywords from the source left completely valid code that any C compiler would build. Naturally, a commercial version of Cilk appeared in time, called Cilk++. It, in turn, already supported C++ and was compatible with the gcc and Microsoft compilers, and by then a commercial company, Cilk Arts, Inc., was in charge of the development. This is where Intel stepped in, acquiring Cilk Arts along with the Cilk++ technology and the Cilk trademark. I joined Intel in 2008, so I remember all the stages of Cilk's development in our compiler. Soon after, in 2010, the first commercial version was released under the name Intel Cilk Plus, as part of the Intel C++ compiler. Why Plus, you ask? Because only half of Intel Cilk Plus actually comes from the Cilk++ technology: the half that introduces task parallelism. The other half makes it possible to implement data parallelism and helps to vectorize the code. Schematically, Intel Cilk Plus = task parallelism (the Cilk keywords) + data parallelism (the vectorization part).



Now you know the “secret” of such a long name: the hidden meaning of Plus is the part responsible for vectorization. Clearly, the marketers did their best and combined two different technologies under one roof. By the way, it was the vector part that migrated to the new OpenMP, and I already partially covered it in my previous post.
Here I will say more about Cilk itself. Which part is more important and significant for a developer is a rather rhetorical question. If we want maximum performance, we need to use all types of parallelism, so everything is extremely useful. From personal experience, I use the vectorization part more often and with greater benefit. Of course, this is not because Cilk's tasking is bad; I simply run into OpenMP being used for task parallelism more often. The Cilk implementation is good, though.
The idea is simple: a minimal set of new keywords, just three of them, with maximum payoff: cilk_spawn, cilk_sync and cilk_for. Under the hood is a modern, lightweight and efficient task scheduler that steals work to balance the load. But first things first.

Consider the skeleton of a function f() that calls another function g() and then does some work of its own (abstractly speaking). The sequential version looks like this:

    void g();  // defined below

    void f() {
        g();
        // work
        // work
    }

    void g() {
        // work
        // work
    }

Now we turn the code into a parallel one with a flick of the wrist:

    void f() {
        cilk_spawn g();
        // work
        // work
        cilk_sync;
        // work
    }

What is going on here? We create a task for the possible parallel execution of the g() function, while the work that remains in f() up to the cilk_sync line can proceed at the same time. A bit of terminology: in Cilk's terms, that remaining work is called the continuation.
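To make the skeleton concrete, here is a minimal sketch in which g() is replaced by a hypothetical sum_range() helper, and f() splits a sum into a spawned half and a continuation half. The __cilk check and the empty fallback macros are my assumptions: they rely on Cilk-enabled compilers defining __cilk and, for everyone else, on the property mentioned earlier that removing the keywords leaves valid sequential code.

```cpp
#include <cassert>

// Assumption: a Cilk-enabled compiler defines __cilk and ships <cilk/cilk.h>.
// Otherwise the keywords expand to nothing and the code runs sequentially.
#if defined(__cilk)
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

// Hypothetical stand-in for g(): sums the integers in [lo, hi).
long sum_range(long lo, long hi) {
    assert(lo <= hi);
    long total = 0;
    for (long i = lo; i < hi; ++i) total += i;
    return total;
}

// Sums the integers in [0, n): one half in a spawned task,
// the other half in the continuation.
long f(long n) {
    const long mid = n / 2;
    long left = cilk_spawn sum_range(0, mid);  // may run in parallel
    long right = sum_range(mid, n);            // the continuation
    cilk_sync;                                 // wait for the spawned call
    return left + right;                       // safe to read left now
}
```

Calling f(10) returns 45 with or without a Cilk-enabled compiler; only the way the work is scheduled differs.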

It is important that we do not create a thread and do not say which code should run in which thread. All the work is expressed as tasks, which lets the runtime distribute the load efficiently and exploit the available parallelism. How? Very simple.
We have a pool of threads; for a simple example, say there are only two. Each thread has its own queue of tasks to execute. If there is an imbalance, that is, one thread is busy and the other has nothing to do, the idle thread steals tasks from the busy thread's queue.
In our example, it is the continuation that gets stolen.
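The stealing rule described above can be sketched as a small single-threaded simulation. Everything here (Worker, step, simulate) is hypothetical and only models the policy: the owner takes work from one end of its queue and an idle worker takes it from the other end. A real scheduler does this concurrently and, as far as I know, picks its victim at random, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model of one worker thread.
struct Worker {
    std::deque<int> tasks;      // back = newest (owner's end), front = oldest
    std::vector<int> executed;  // ids of the tasks this worker ran
};

// One scheduling step for worker `self`.
// Returns false if there was nothing to run anywhere in the pool.
bool step(std::vector<Worker>& pool, std::size_t self) {
    Worker& me = pool[self];
    if (!me.tasks.empty()) {    // own work first, newest task first
        me.executed.push_back(me.tasks.back());
        me.tasks.pop_back();
        return true;
    }
    for (std::size_t v = 0; v < pool.size(); ++v) {
        if (v == self || pool[v].tasks.empty()) continue;
        // Idle: steal the oldest task from the victim and run it.
        me.executed.push_back(pool[v].tasks.front());
        pool[v].tasks.pop_front();
        return true;
    }
    return false;
}

// Two workers; all tasks start on worker 0, worker 1 starts idle.
std::vector<Worker> simulate(int task_count) {
    assert(task_count >= 0);
    std::vector<Worker> pool(2);
    for (int id = 0; id < task_count; ++id) pool[0].tasks.push_back(id);
    bool progress = true;
    while (progress) {          // lock-step rounds until no work is left
        progress = false;
        for (std::size_t w = 0; w < pool.size(); ++w)
            progress = step(pool, w) || progress;
    }
    return pool;
}
```

With simulate(6), worker 0 runs tasks 5, 4, 3 from its own end while worker 1 steals and runs tasks 0, 1, 2, so the load ends up split evenly even though all the work started on one worker.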


This way all cores stay loaded for as long as there are tasks to run. By the way, the same kind of work-stealing scheduler is implemented in Intel Threading Building Blocks (TBB). cilk_sync is a synchronization point: f() does not go past it until the spawned g() has finished.

The cilk_for construct is intended, as Captain Obvious would say, for introducing parallelism into for loops. Why do we need a separate construct? I will not answer directly, but I will give a suggestive example. What is the difference between these two loops?

    for (int x = 0; x < n; ++x) {
        cilk_spawn f(x);
    }

    cilk_for (int x = 0; x < n; ++x) {
        f(x);
    }

In the first case we create a task at every iteration, and stealing someone else's task is very costly in terms of performance. If each iteration does only a little work, we lose more than we gain from such a “parallel” program.

Obviously, spawn should be done not at every iteration but, say, only once, with all the other iterations treated as the continuation. I think the question of why cilk_for is needed now answers itself.
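The point about spawning less often can be sketched in code. cilk_for is usually described as a divide-and-conquer over the iteration range: the range is halved recursively, one half is spawned, and chunks below some grain size run as plain loops. The sketch below imitates that by hand. The names kGrain, square_range and square_all, as well as the grain value of 4, are hypothetical, and the __cilk check with its empty fallback macros is the same assumption about Cilk-enabled compilers as in any serial elision.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a Cilk-enabled compiler defines __cilk and ships <cilk/cilk.h>.
// Otherwise the keywords expand to nothing and the code runs sequentially.
#if defined(__cilk)
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

// Hypothetical grain size: ranges this small run as an ordinary loop.
const std::size_t kGrain = 4;

// Squares v[lo..hi) in place by recursive halving.
// Returns how many spawns were issued for this range.
std::size_t square_range(std::vector<long>& v, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= v.size());
    if (hi - lo <= kGrain) {
        for (std::size_t i = lo; i < hi; ++i) v[i] *= v[i];
        return 0;  // leaf chunk: no spawn at all
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    std::size_t left = cilk_spawn square_range(v, lo, mid);  // one spawn per split
    std::size_t right = square_range(v, mid, hi);            // continuation
    cilk_sync;
    return left + right + 1;
}

// Squares the whole vector; returns the total number of spawns.
std::size_t square_all(std::vector<long>& v) {
    return square_range(v, 0, v.size());
}
```

For 16 elements and a grain of 4 this issues 3 spawns instead of the 16 that a spawn in every iteration would cost, and the two halves touch disjoint elements, so no shared data is involved.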

That is almost everything. What remains is the issue of shared memory. We have to take care of it ourselves, with the help of reducers. Shared data is declared using class templates provided by Cilk, which makes working with it safe.

Let's continue our simple example:

    int sum = 3;

    void g();  // defined below

    void f() {
        cilk_spawn g();
        // work
        sum += 2;
        // work
    }

    void g() {
        // work
        sum++;
        // work
    }

Clearly, we get a “bad” situation, a data race, on the shared variable sum. To fix it, we declare the variable like this:

 cilk::reducer_opadd<int> sum(3); 

You can also write your own reducer by deriving a monoid class from cilk::monoid_base and wrapping it in cilk::reducer. Incidentally, a similar capability (user-defined reductions) has appeared in the latest version of OpenMP.
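To show the idea behind a reducer without depending on the Cilk headers, here is a toy model in standard C++. It is not the Cilk API: SumMonoid, ToyReducer and run_example are hypothetical names, and the views are created by hand rather than by a runtime. It only demonstrates the principle that each strand updates its own view, that a new view starts from the identity value, and that the views are merged with an associative operation at the synchronization point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical monoid for integer addition: an identity value plus an
// associative operation that folds the right operand into the left one.
struct SumMonoid {
    typedef int value_type;
    static value_type identity() { return 0; }
    static void reduce(value_type& left, const value_type& right) { left += right; }
};

// Hypothetical toy reducer (NOT the Cilk API): one private view per strand.
template <typename Monoid>
class ToyReducer {
public:
    typedef typename Monoid::value_type value_type;

    explicit ToyReducer(value_type initial) : views_(1, initial) {}

    // Creates a fresh view holding the identity; returns its index.
    std::size_t new_view() {
        views_.push_back(Monoid::identity());
        return views_.size() - 1;
    }

    // Folds delta into a single view; other views are not touched.
    void update(std::size_t view, const value_type& delta) {
        assert(view < views_.size());
        Monoid::reduce(views_[view], delta);
    }

    // Merges all views into the first one, left to right, and returns it.
    value_type get_value() {
        for (std::size_t i = 1; i < views_.size(); ++i)
            Monoid::reduce(views_[0], views_[i]);
        views_.resize(1);
        return views_[0];
    }

private:
    std::vector<value_type> views_;
};

// Replays the sum example: g() does sum++, f() does sum += 2.
int run_example() {
    ToyReducer<SumMonoid> sum(3);               // plays the role of the shared sum
    std::size_t child = 0;                      // g() keeps the original view
    std::size_t continuation = sum.new_view();  // as if the continuation was stolen
    sum.update(child, 1);                       // g(): sum++
    sum.update(continuation, 2);                // f(): sum += 2
    return sum.get_value();                     // views merged at the sync point
}
```

run_example() returns 6, the same result as the sequential program, because neither strand ever writes to the other's view.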
I hope I have told enough to give an idea of what is in Intel Cilk Plus. There is almost everything there: task parallelism through the Cilk keywords, and data parallelism through directives and new syntax (which I have deliberately not talked about here). As you can see, the technology is powerful and gives great potential for using all types of parallelism in your application. Go for it, and “May the Force be with you”!

Source: https://habr.com/ru/post/204838/

