
Folds in Intel Cilk Plus

Suppose that for some reason we need to find the sum of the elements of an array. We can split the array into two parts, sum each part separately, and add the results; moreover, the two parts can be summed in parallel. But summing a part of the array is exactly the original problem, so each part can again be split into two parts, each summed separately, the results added, and so on. This computational strategy is called “divide and conquer”.
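
To make the strategy concrete, here is a minimal sketch of such a recursive summation in plain C++ (the function name sum_range is illustrative and not from any library; the two recursive calls are independent and could be executed in parallel):

// Divide-and-conquer summation: split the range in half, sum each half
// (the two calls are independent of each other), then add the results.
long long sum_range(const int *a, int first, int last) {
    if (last - first <= 1) {
        return last > first ? a[first] : 0;   // single-element or empty range
    }
    int mid = first + (last - first) / 2;
    long long left  = sum_range(a, first, mid);   // could run in parallel...
    long long right = sum_range(a, mid, last);    // ...with this call
    return left + right;
}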



Many other functions of arrays can be computed in the same way. The first part of this article gives a mathematical explanation of the idea, and the second part shows how to use it in your programs with Intel Cilk Plus.



So, if you feel like looking at mathematical formulas and pieces of C++ code in the last days of summer, welcome under the cut.






Monoids and homomorphisms



The definition of a monoid, with plenty of examples and a very detailed explanation, can be found in the Habr article Monoids and their applications: monoidal computations in trees, so here we confine ourselves to the definition and a couple of examples.



A monoid is a triple (M, ⋅, e), where M is a non-empty set, ⋅ is an associative binary operation on M, and e is an element of M that is neutral with respect to ⋅.



Since the operation β‹… is associative, that is, for any x , y and z from M , it is true that x β‹… ( y z ) = ( x β‹… y ) β‹… z , the result of evaluating the expression x 1 x 2 ... β‹… x n does not depend on the arrangement of brackets, therefore, it is possible not to write brackets at all. The proof of this simple, but important fact (as well as all subsequent simple, but important facts) can be found in some textbook on higher algebra or to invent it yourself.



There are plenty of examples of monoids: anyone knows from school, for instance, that the set of integers under addition, (Z, +, 0), is a monoid. Another example is the list monoid. Namely, let X be an arbitrary set, denote by X* the set of all finite sequences of elements of X, including the empty sequence [], and let ++ denote list concatenation. Then, as is easy to verify, the triple (X*, ++, []) is a monoid.
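
In code, a monoid can be thought of simply as a type together with a combining operation and a neutral element. As a small illustrative sketch (the class names here are made up for this article, not taken from any library), the two monoids above might look like this:

#include <vector>

// The monoid (Z, +, 0): integers under addition with neutral element 0.
struct IntAddMonoid {
    int identity() const { return 0; }
    int combine(int x, int y) const { return x + y; }
};

// The list monoid (X*, ++, []): finite sequences under concatenation,
// with the empty sequence as the neutral element.
template <typename X>
struct ListMonoid {
    std::vector<X> identity() const { return {}; }
    std::vector<X> combine(std::vector<X> x, const std::vector<X> &y) const {
        x.insert(x.end(), y.begin(), y.end());   // x ++ y
        return x;
    }
};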



Next we need the concept of a homomorphism. Let (M, ⋅_M, e_M) and (N, ⋅_N, e_N) be two arbitrary monoids. A function h : M → N is called a homomorphism if h(x ⋅_M y) = h(x) ⋅_N h(y) for any x and y from M, and h(e_M) = e_N.



Note that, informally speaking, a function is a homomorphism if it can be pulled through the sign of the binary operation. The identity map is obviously a homomorphism, but school and university mathematics courses provide more interesting examples. For example, the logarithm of a product is equal to the sum of the logarithms, ln(xy) = ln x + ln y, which in our terms means that the map ln is a homomorphism from the monoid of positive real numbers under multiplication (R₊, ⋅, 1) into the monoid of real numbers under addition (R, +, 0). Further, the determinant of a product of matrices is equal to the product of their determinants, det(AB) = det A ⋅ det B, that is, the map det is a homomorphism from the monoid of square matrices under multiplication (Matₙ(R), ⋅, I) into the monoid of real numbers under multiplication (R, ⋅, 1). I hope the idea is clear. We now turn to list homomorphisms.



Any homomorphism h from the list monoid (X*, ++, []) into an arbitrary monoid (M, ⋅, e) is called a list homomorphism. List homomorphisms have the following useful property. Let [x₁, x₂, ..., xₙ] be an arbitrary list; then h([x₁, x₂, ..., xₙ]) = h([x₁] ++ [x₂] ++ ... ++ [xₙ]) = h([x₁]) ⋅ h([x₂]) ⋅ … ⋅ h([xₙ]). Thus, any list homomorphism is uniquely determined by its values on single-element lists. So, if f : X → M is the function such that f(x) = h([x]), then the uniquely determined list homomorphism h will be written as hom(⋅, f, e).
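
As a sketch of what hom(⋅, f, e) looks like in code (the name list_hom and its signature are purely illustrative), the homomorphism maps every element with f and combines the results with ⋅, starting from the neutral element e:

#include <vector>

// hom(op, f, e): apply f to every element of xs and combine the results
// with op, starting from the neutral element e.
template <typename T, typename M, typename Op, typename F>
M list_hom(const std::vector<T> &xs, Op op, F f, M e) {
    M result = e;
    for (const T &x : xs) {
        result = op(result, f(x));
    }
    return result;
}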



Now a few examples.

  1. Summing the elements of a list of numbers, taking the product of the elements of a list of numbers, and finding the minimum and maximum elements of a list of numbers can be viewed as the list homomorphisms hom(+, id, 0), hom(⋅, id, 1), hom(min, id, +∞) and hom(max, id, −∞), respectively. List homomorphisms like these, in which the second parameter is the identity map id, will be briefly written as reduce(⋅, e). Some of these are instantiated in the code sketch following this list.
  2. The function length, which computes the length of a list, is the list homomorphism hom(+, one, 0), where one(x) = 1.
  3. A function that applies some function f to each element of a list is also a list homomorphism, namely hom(++, g, []), where g(x) = [f(x)]. Such functions will be briefly written as map(f).
  4. The sort function can also be represented as a list homomorphism. Namely, let merge be the function that merges two ordered lists, and let list(x) = [x] be the function that turns an element into a single-element list. Then hom(merge, list, []) is the desired list homomorphism.
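
For instance, the sum, product and length homomorphisms above can be instantiated with the illustrative list_hom function from the previous sketch:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> xs = {3, 1, 4, 1, 5};

    // hom(+, id, 0): the sum of the elements.
    int sum = list_hom(xs, [](int a, int b) { return a + b; },
                       [](int x) { return x; }, 0);

    // hom(⋅, id, 1): the product of the elements.
    int prod = list_hom(xs, [](int a, int b) { return a * b; },
                        [](int x) { return x; }, 1);

    // hom(+, one, 0): the length of the list, where one(x) = 1.
    int len = list_hom(xs, [](int a, int b) { return a + b; },
                       [](int) { return 1; }, 0);

    std::cout << sum << " " << prod << " " << len << std::endl;   // 14 60 5
}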


Note that, although these homomorphisms are called list homomorphisms, a “list” here can just as well be an array, a string, or in general any sequence generated on the fly. We now formulate three useful statements known as the homomorphism theorems.



The first homomorphism theorem. Any list homomorphism can be represented as the composition hom(⋅, f, e) = reduce(⋅, e) ∘ map(f).
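
As an illustration of this decomposition using only the standard library, the length homomorphism hom(+, one, 0) from the examples above can be computed as map(one) followed by reduce(+, 0):

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> xs = {3, 1, 4, 1, 5};

    // map(one): replace every element by 1.
    std::vector<int> ones(xs.size());
    std::transform(xs.begin(), xs.end(), ones.begin(), [](int) { return 1; });

    // reduce(+, 0): fold the mapped list with + starting from 0.
    int length = std::accumulate(ones.begin(), ones.end(), 0);

    std::cout << length << std::endl;   // 5
}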



It is on this simple observation that the idea of MapReduce is based. A gentle introduction to MapReduce can be found in the Habr articles MapReduce, or computations beyond the limits of memory and CPU (trying to keep it jargon-free) and MapReduce: more advanced examples, again keeping it jargon-free.



Let [x₁, x₂, ..., xₙ] be an arbitrary list. The left fold is the function foldl(⋅_L, e_L)([x₁, x₂, ..., xₙ]) = (...((e_L ⋅_L x₁) ⋅_L x₂) ⋅_L ... ⋅_L xₙ), and the right fold is the function foldr(⋅_R, e_R)([x₁, x₂, ..., xₙ]) = (x₁ ⋅_R (x₂ ⋅_R ... ⋅_R (xₙ ⋅_R e_R)...)). The second homomorphism theorem states the following.
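
In C++ terms, std::accumulate over a range is exactly a left fold, and a right fold can be obtained by folding over the reversed range with the arguments of the operation swapped. A small sketch with the (non-associative) operation of subtraction, where the two folds give different results:

#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> xs = {1, 2, 3, 4};

    // foldl(-, 0): ((((0 - 1) - 2) - 3) - 4) = -10
    int left = std::accumulate(xs.begin(), xs.end(), 0,
                               [](int acc, int x) { return acc - x; });

    // foldr(-, 0): (1 - (2 - (3 - (4 - 0)))) = -2
    int right = std::accumulate(xs.rbegin(), xs.rend(), 0,
                                [](int acc, int x) { return x - acc; });

    std::cout << left << " " << right << std::endl;   // -10 -2
}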



The second homomorphism theorem. Any list homomorphism can be expressed both as a left and as a right fold, i.e. hom(⋅, f, e) = foldl(⋅_L, e) = foldr(⋅_R, e), where x ⋅_L y = x ⋅ f(y) and x ⋅_R y = f(x) ⋅ y.
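
For example, for the length homomorphism hom(+, one, 0) the theorem gives x ⋅_L y = x + one(y) = x + 1 and x ⋅_R y = one(x) + y = 1 + y, so the length can be computed either way; a quick check, sketched in the same style as above:

#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> xs = {3, 1, 4, 1, 5};

    // foldl with x ._L y = x + one(y) = x + 1.
    int left = std::accumulate(xs.begin(), xs.end(), 0,
                               [](int acc, int) { return acc + 1; });

    // foldr with x ._R y = one(x) + y = 1 + y (folding the reversed list).
    int right = std::accumulate(xs.rbegin(), xs.rend(), 0,
                                [](int acc, int) { return 1 + acc; });

    std::cout << left << " " << right << std::endl;   // 5 5
}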



This theorem gives two sequential ways to compute a list homomorphism: as a left fold and as a right fold. What the functions that compute these folds are called in particular programming languages can be found in the Wikipedia article Fold (higher-order function). More information about folds can be found in Evgeny Kirpichev's article Elements of functional languages.



For us, a more interesting way to compute the list homomorphism h = hom(⋅, f, e) is the parallel one. Recall that, by the definition of a list homomorphism, h(x ++ y) = h(x) ⋅ h(y); in other words, h(x) and h(y) can be computed independently, and therefore in parallel, after which the final value h(x ++ y) is computed from them. But each of the lists x and y can in turn be split into two parts and, again by the definition of a homomorphism, the values of h on them computed independently and then combined, and so on. This is nothing other than the divide-and-conquer strategy.



Suppose that the operation β‹… is calculated in constant time C , and the list x consists of n elements. To calculate the left convolution, it is necessary to calculate the operation β‹… exactly n times, that is, it takes Cn of time. If we have p processors, then we divide the list of x into p equal parts x i and use the left convolution for each x i to calculate h in the time of Cn / p , and then in the time of C log 2 p we calculate the final result. Thus, the total time is C ( n / p + log 2 p ).



Finally, we formulate the third homomorphism theorem, which tells us when a function is a list homomorphism.



The third homomorphism theorem. If a function from the list monoid to some monoid can be expressed both as a left fold foldl(⋅_L, e) and as a right fold foldr(⋅_R, e), then it is a list homomorphism.



Proofs of these three theorems can be found in Jeremy Gibbons' paper The Third Homomorphism Theorem; now let us turn to Intel Cilk Plus.



Folds in Intel Cilk Plus



Intel Cilk Plus is an extension of the C++ language for writing multi-threaded programs, which among other things includes the following:

  1. the keywords cilk_spawn and cilk_sync (the first indicates that a function call may be executed in parallel with the rest of the caller, the second is used for synchronization),
  2. the cilk_for keyword for parallelizing loops,
  3. reducers (hyperobjects) for safe, race-free accumulation of data across threads,
  4. special syntax for arrays and operations on arrays; for example, elementwise addition of arrays is written as c[:] = a[:] + b[:] .


Only the second and third items will be used here. Documentation and other materials about Intel Cilk Plus can be found at http://software.intel.com/en-us/articles/intel-cilk-plus/ .



So, suppose we need to compute the sum of the elements of an array of integers. A possible sequential implementation is shown below.

int sum = 0;
for (int i = 0; i < n; ++i) {
    sum += a[i];
}
std::cout << sum << std::endl;


Note that this is nothing other than an implementation of the left fold. But the sum of the elements of an array is a list homomorphism and therefore also admits a parallel implementation, as explained above. Intel Cilk Plus allows you to obtain such an implementation with almost no effort; it is enough to do the following:

  1. include the cilk/reducer_opadd.h and cilk/cilk.h header files,
  2. change sum type to reducer_opadd<int> ,
  3. change for to cilk_for ,
  4. use get_value() to get the result.


#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>

...

cilk::reducer_opadd<int> sum;
cilk_for (int i = 0; i < n; ++i) {
    sum += a[i];
}
std::cout << sum.get_value() << std::endl;




Roughly the following happens: the compiler converts the body of the cilk_for loop into a function that implements the divide-and-conquer strategy; in pseudocode it looks like this.

void run_loop(first, last) {
    if (last - first < grainsize) {
        // small enough: execute the iterations sequentially
        for (int i = first; i < last; ++i) {
            LOOP_BODY;
        }
    } else {
        // split the range in half and process the halves in parallel
        int mid = first + (last - first) / 2;
        cilk_spawn run_loop(first, mid);
        run_loop(mid, last);
        // an implicit cilk_sync at the end of the function joins the spawned half
    }
}


Since the variable sum is a reducer, when execution splits into two parallel strands, the spawned child keeps working with the old view of the variable, while the continuation receives a new view initialized to zero; at the synchronization point the two views are added together.
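
To make that mechanism a little more tangible, here is a purely conceptual toy in plain C++ (this is neither Cilk runtime code nor its real API, just the view bookkeeping spelled out by hand for a summation):

#include <iostream>

int main() {
    // Before the split: one view holds the partial sum accumulated so far.
    int child_view = 10;         // the spawned child keeps using this old view

    // The stolen continuation gets a fresh view,
    // initialized with the identity of the monoid (0 for addition).
    int continuation_view = 0;

    // Both strands now accumulate into their own views in parallel...
    child_view += 5;             // work done by the child
    continuation_view += 7;      // work done by the continuation

    // ...and at the synchronization point the two views are combined with
    // the monoid operation, giving the same total as a sequential run.
    int merged = child_view + continuation_view;
    std::cout << merged << std::endl;   // 22
}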



In addition to reducer_opadd, Intel Cilk Plus provides a large set of ready-made reducers: reducer_list_append, reducer_list_prepend, reducer_min, reducer_max, reducer_min_index, reducer_max_index, reducer_opor, reducer_opand, reducer_opxor, reducer_basic_string, reducer_string, reducer_wstring, reducer_ostream. If none of them fits, you can create your own reducer.



To create your own reducer, you first need to write a Monoid class that implements the monoid. This class must provide the operations required by Cilk's monoid concept: most importantly, reducing (combining) two views and producing the identity value, plus a few helpers for managing the storage of views.



In most cases, you do not need to implement all of these methods yourself: it suffices to inherit from the class cilk::monoid_base<T> and implement reduce() and, possibly, identity() .



Further, the class implementing the reducer should contain a cilk::reducer<Monoid> object (below called imp_) as a private member, together with methods for accessing and modifying it.



Let us write a reducer that computes the product of the elements of an array of numbers.

#include <cilk/reducer.h>
#include <cilk/cilk.h>
#include <iostream>
#include <new>   // placement new is used in identity()

class reducer_prod {
public:
    struct Monoid: cilk::monoid_base<double> {
        // Combine two views: the left view absorbs the right one.
        void reduce(double *left, double *right) const {
            *left *= *right;
        }
        // Construct the identity of the monoid (1.0 for multiplication) in place.
        void identity(double *p) const {
            new ((void *) p) double(1.0);
        }
    };

    reducer_prod(): imp_(1.0) { }

    reducer_prod &operator*=(const double &v) {
        imp_.view() *= v;
        return *this;
    }

    double get_value() const {
        return imp_.view();
    }

private:
    cilk::reducer<Monoid> imp_;
};




reducer_prod prod;
cilk_for (int i = 0; i < n; ++i) {
    prod *= a[i];
}
std::cout << prod.get_value() << std::endl;

Source: https://habr.com/ru/post/126275/


