Continuing the
week of expressing opinions about higher education at Habré , I’ll tell you about the competition, which is somewhere in the middle, between work (as denied by the respected
AlexanderLarin ) and the ability to learn (as promoted by the user
Alexnn ).

And I will tell you about the
Acceler8 parallel programming
contest . Yes, this is a contest in which there is an opportunity not only to
learn , but also to
work for one of the “small” prizes:
2 powerful HP laptops and 10 Asus netbooks .
Want to try your hand?
At first glance, this time the competition is quite simple: you need to solve only one programming task.
At second glance, the solution should be parallel and, if possible, scalable.
So what are we talking about?
Briefly about the task
Given a two-dimensional matrix of values, you need to find a rectangular submatrix in it with the maximum sum of elements in it among all submatrices.
More information about the task can be found on
the contest page .
Briefly about the decision
It is clear that the successive search into the forehead will solve the problem, but the solution will be too slow:
int maxSum = A[0][0]; int maxI0 = 0, maxJ0 = 0, maxI1 = 1, maxJ1 = 1; for (int i = 0; i < M; i++) for (int j = 0; j < N; j++) for (int width = 1; width <= M - i; width++) for (int height = 1; height <= N - j; height++) { int currentSum = sum_subarray(A, i, j, i + width, j + height); if (currentSum > maxSum) { maxSum = currentSum; maxI0 = i; maxJ0 = j; maxI1 = i + width - 1; maxJ1 = j + height - 1; } } cout << "Max subarray: [" << maxI0 << "][" << maxJ0 << "] .. [" << maxI1 << "][" << maxJ1 << "]"; cout << endl << "Max value = " << maxSum << endl; cout << "Area = " << (maxI1 - maxI0 + 1) * (maxJ1 - maxJ0 + 1) << endl;
Obviously, such an approach is “not at all epic”.
')
How to solve intelligently
There are many approaches to this problem.
We will not reveal all the cards. Now everyone is on an equal footing - a month of time (decisions are made until November 15) and free access to Google and textbooks.
Sounds easy, doesn't it?
And the most delicious - for a snack.
How will we evaluate scalability?
There are many approaches and a
lot of flood on this issue. We decided to take the simplest implementation, but also the solution closest to reality:
The code will be tested on two different platforms. The first is a 40-core
Manycore Testing Lab machine, with well-known hardware features, a well-known brand of processor and a known second-level cache size (see “
MTL Configuration ”). The second is a small system, about which almost nothing is unknown (2-4 cores, a few gigs of memory — that's all the information).
Is it difficult in such a situation to optimize the solution? Yes.
On the other hand, it is possible to work the fastest on a “small machine” and moderately on MTL — this may be enough to win.
Or maybe someone will achieve better results on both machines? Time will tell.
So, for those interested, a small link:
registration for the competition . Questions can be written in the comments or leave on the
forum of the competition .
And, this,
be epic ! ;)