📜 ⬆️ ⬇️

"Accelerate 2012 parallel programming contest" or "6 ultrabooks and 10 SSD will be enough for everyone!"


Hello!
Last week on Habré was marked by a series of hacker posts - they cracked both VoIP and online traffic jams .
I propose to continue the week more creatively - to solve the global problem of genetics on parallel programming.
In a month, you need to do nothing at all: find in two strings consisting of nucleotides of the characters A, T, G and C, the longest common substring .
The prizes have grown and strengthened compared to the previous time - today there are 6 Asus Zenbook UX31E ultrabooks and 10 SSD disks with a total capacity of 800 gigs at stake.
Tempting?

What is it about?


You are given a reference string (for example, the following: GATGAGCATGTGTTGAATCCTCA ) and many long input lines (here’s one of them: GTCCTCCAGTTTGTAGCATGTGTATTTATGTCCTCCAGTTTGTAGCATGTGTATTTAT ). It is necessary for each pair of reference and input strings to find the longest common substrings and return their “coordinates”. In the example, the answer is (5 13 15 23) and (5 13 44 52) , that is, two substrings are found:
 #  ,       ,  '-1' ref = 'GATGAGCATGTGTTGAATCCTCA' input = 'GTCCTCCAGTTTGTAGCATGTGTATTTATGTCCTCCAGTTTGTAGCATGTGTATTTAT' ref[(5 - 1):13] == input[(15 - 1):23] #True ref[(5 - 1):13] == input[(44 - 1):52] #True 

At your disposal there is a reference code that correctly, but very slowly solves this problem by brute force.
Sounds easy?

How to solve?


This is where the fun begins. I will offer several options that may be useful:

')

What to choose? I want a hint!


What you choose, we can not advise. Some people like to write on pthreads, someone on OpenMP, and someone likes to use parallel functions from TBB and maybe even MKL.
One thing is for sure - very clever ideas and thoughts are often discussed on our forum . For example, it is worth looking at the instructions in SSE4.2.

What can I write?


Unfortunately, our automated testing system is too young to support all programming languages.
This time we taught her to understand C ++ (despite my sincere love of python and Java Script), so I’ll have to write in good old C ++.
The development platform can be any, but the code will be compiled and tested on a Linux machine (Debian stable - kernel 2.6.32) with gcc installed (version 4.6.2 for pthreads fans) and Intel Parallel Studio XE 2011 (for those who choose Intel Compiler, optimizing code for our processors).

And what about the prizes? I want an ultrabook!


The first three teams that wrote the fastest code and sent the solution before May 16, we will give Asus Zenbook UX31E to the participant.
The second three teams - on SSD 320 Series 80Gb .
Another 4 participants who write us the most interesting posts in the blogs of the Intel Software Network will also get SSD.

So, once again: one task , one month, 6 ultrabooks and 10 SSD for the best participants from Russia and the CIS.

Anyone who decides to participate , good luck!

The organizers and judges of the competition are ready to answer any of your questions in the comments.

Source: https://habr.com/ru/post/142312/


All Articles