Hi, Habr! I am the captain of the St. Petersburg State University team that took part in the ASC competitions. Last week stealapanda published an article about working with the world's most powerful supercomputer, Sunway TaihuLight. It became clear that many people were hearing about this kind of event for the first time. In this article I want to describe HPC competitions in general: how they are held and which skills will be useful if you want to join this exciting adventure. I will also use ASC as an example to show how it all goes.
What HPC competitions are and how they differ from conventional competitive programming
HPC competitions are contests in which student teams solve problems on computing clusters. Many people know the competitive programming contests held under the ACM ICPC rules, so for convenience I will compare against them. In ACM ICPC, a solution comes down to writing an algorithm that solves the problem within the given time and memory limits. You need to work out which algorithm fits the problem and code it as fast as possible. In HPC competitions you do not have to read entertaining problem statements trying to figure out which algorithm is hidden in them. Instead you will meet the Linux console, years-old legacy code of scientific applications alongside the latest technologies (yes, all together), parallel and multithreaded programming technologies, and job schedulers.
Instead of writing trees and sorts with your eyes closed, you will have to learn to write makefiles quickly, study compiler flags and make friends with vim, ssh and screen.
Unlike ACM ICPC, there is no limit on how many times the same person can participate. There are now 3 such competitions in the world: ASC, ISC and SC. Their main difference is the venue. The rules and the set of tasks are very similar. Only the Chinese competition (ASC) stands apart: it holds a preliminary qualifying round with problem solving, provides equipment to the teams in the final and, for some tasks, also gives access to computing resources during the qualifying round. ISC and SC have no preliminary selection stage with tasks. Teams are selected on the basis of applications that describe the cluster configuration the team is going to use in the competition and the team members' HPC-related experience. In all 3 competitions, prizes are awarded for the best Linpack result (Linpack is one of the tests used to determine the computing power of supercomputers) and for the highest total score across all tasks.
Now I will tell a little about each of the competitions.
ASC (Asian Supercomputer Challenge). Organized in China; the main sponsor is Inspur. In 2012 the competition was held only for Chinese teams, and since 2013 it has been international. Thanks to the sponsor and the preliminary selection stage, it is the largest of the three. From 2013 to 2017, the number of participating teams grew from 43 to 230. The quota for the finals grew from 10 to 20 places over the same period. The first stage (preliminary contest) is held remotely for two months, from January to March. Based on its results, teams are selected and invited to the final at the end of April. The low entry threshold comes not from easy tasks but from the fact that you do not need a crazy amount of money or a state-of-the-art cluster to take part. The team should consist of 5 students who have not yet graduated (bachelor's or master's level) and 1 coach. During the finals, the coach is not allowed in the team's work area.
ISC is held as part of the student section of the ISC (International Supercomputer Conference) in Germany around June 20. The team should have 6 students and 2 coaches. Applications are accepted in October-November. There is no preliminary selection stage with problem solving, only an application describing the cluster configuration and the HPC experience of the team members and their university. The team has to find a vendor to provide the equipment on its own. The organizers can write a letter on behalf of the organizing committee asking a vendor to assist the team, but no one provides ready-made equipment as ASC does. The competition tasks are running benchmarks and several scientific applications.
SC is held in America in November, also in conjunction with a supercomputer conference. The competition is held in a hackathon format and lasts 48 hours. The team also consists of 6 people, but they do not all work at the same time: no student may be in the competition area for more than 12 hours a day. During this time, the team has to run 2 benchmarks (Linpack, HPCG) on clusters assembled and configured in advance, try to reproduce the results of a selected paper from last year's conference and run 3 more applications, one of which is kept secret until the contest starts. This year's tasks can be found here.
And what to do with all this?
Here I will try to describe how the selection works, the technologies you will most likely run into during such competitions and the tasks of past years. Our team has only taken part in ASC, so the specifics are described in detail only for this competition.
Competitions involve working with a computing cluster. This introduces a factor outside the team's control: the team's success depends to some extent (more or less, depending on the task) on the hardware used for the calculations, so one of the first tasks is to find a sponsor willing to provide equipment (the sponsor benefits from this too). I have also heard that some teams (I think one of the Chinese ones) use their university's systems, but this is not the case in Russia. I can hardly imagine anyone being allowed to take university equipment to another country. Spoiler: our team does not even have access to the university computing center. For training we use the department's 10-year-old teaching cluster and are constantly looking for opportunities to get access to third-party systems. In the SC and ISC competitions, teams assemble the cluster entirely on their own, bound only by the 3000W power limit. ASC puts participants on a somewhat more equal footing: every team's cluster is built from the same server, provided by Inspur, the main sponsor of the competition. Teams can add their own accelerators (video cards, Xeon Phi, FPGA), install SSDs and add memory, but at their own expense (or at the expense of sponsors).
In the application, the team describes the configuration of its cluster and the high-performance computing background of the participants and their university. In ASC, this is part of the remote qualifying round along with problem solving. For ISC and SC, it is simply submitted as the application for on-site participation in the competition.
A stack of technologies and skills that will probably come in handy during the competition: programming languages: C/C++, Python, Fortran, bash; parallel programming technologies: OpenMP, OpenCL, OpenACC, CUDA, Intel Cilk Plus, Intel MPB, etc.
The ASC qualifying round starts around the 10th of January and lasts 2 months. During this time, teams are asked to solve 3 problems. As a rule, the goal is to speed up the provided code (last time, one task did not require acceleration). A maximum of 100 points can be scored in the qualifying stage, distributed as follows:
- 10 points (5 + 5) are given for the presentation of the team and the description of the university's or department's activities in the field of HPC. The team presentation includes the name, slogan, team photo, a story about the team members and how teamwork is organized. The activity description covers the available computing resources, experience in solving HPC problems, articles on the subject and how HPC is represented at the university (courses, etc.). Honestly, it is not clear what the points are awarded for; there is no clear scale (as with all the other tasks of the qualifying stage). For how cool the team is? For the completeness of the story? Or does everyone simply get 10 points for the description?
The remaining 90 points are given for completing tasks. Some tasks are meant to be run on the participants' own equipment; others must be run on a remote cluster that is the same for all teams. If a team has access to powerful computing resources, it is worth betting on the former.
- Task 1, up to 15 points - put together a cluster configuration and justify the choice of components. It is assumed that the team will use this configuration in the final, but in fact it can be changed. The solution must be built around the provided server (in 2017, the Inspur NF5280M4); the list of available equipment also includes a switch (Infiniband or Ethernet) with the corresponding cable and card, and Xeon Phi was available up to and including 2016. The constraint is that the theoretical power of the entire cluster must stay within 3000W.
- Task 2, up to 15 points - run a benchmark. Over the years these have been HPL (Linpack), HPCG (2016) and HPCC (2015) (HPL + other tests). In 2017, the benchmark had to be run on a cluster provided by the organizers with KNL processors (2nd generation Intel Xeon Phi); previously, participants ran benchmarks on whatever equipment they had. The task is to squeeze out maximum performance. The report must describe the cluster configuration on which the benchmark was run, the results of the runs and the tuning steps taken.
Tasks 3 and 4 were previously worth 20 and 40 points, respectively. In 2017 they were worth 30 points each.
- Task 3 - optimize code to speed it up on a given data set. Over the years these have been scientific applications in C, C++ and Fortran.
2013 - Gromacs
2014 - Quantum Espresso Test
2015 - NAMD
2016, 2017 - MASNUM_WAVE
- Task 4 - optimization for Xeon Phi and running on it (until 2017). In 2017, the 4th task was devoted to predicting road traffic with a neural network, and it was prediction accuracy that was evaluated rather than speed.
2013 - BSDE option pricing
2014 - 3D-EW
2015 - Gridding (Square Kilometer Array project)
2016 - neural network optimization (Chinese program DNN)
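The 3000W constraint from Task 1 comes down to simple arithmetic that is easy to check before submitting a configuration. Below is a minimal sketch of such a check. The component names and wattages are made-up placeholders for illustration only, not figures from the ASC equipment list; real values would come from vendor datasheets.

```python
POWER_LIMIT_W = 3000  # limit stated in the competition rules

# Hypothetical peak power draw per component, in watts (placeholder values).
COMPONENT_POWER_W = {
    "server_node": 350,
    "gpu_accelerator": 250,
    "switch": 100,
}


def theoretical_power(config):
    """Sum the peak power of all components in a {name: count} configuration."""
    return sum(COMPONENT_POWER_W[name] * count for name, count in config.items())


def fits_budget(config, limit_w=POWER_LIMIT_W):
    """Return True if the configuration stays within the power limit."""
    return theoretical_power(config) <= limit_w


if __name__ == "__main__":
    config = {"server_node": 5, "gpu_accelerator": 4, "switch": 1}
    print(theoretical_power(config), fits_budget(config))
```

With these placeholder numbers, five nodes with four accelerators and a switch come to 2850W and fit, while a sixth node would push the total to 3200W and over the limit.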
When solving the problems, keep in mind that the Chinese servers provided for some tasks are switched off for about a week during Chinese New Year in early February. After that there is a rush, and a job can wait in the queue for quite a long time. Tasks that have to be run on these servers should be tackled first.
The results become known about a week or two later. Tellingly, the organizers do not publish any results table and do not even tell the teams their own scores. The only thing to go on is the order in which the teams are listed. We assume it corresponds to the ranking. In addition to the finalists, a list of teams that showed a good result is also announced.
There are 2 months before the final, so what should you do with them? In the final you will need to build and configure the cluster yourself. Two days are allotted for this. They bring you a rack, the number of servers you ordered, cables and two monitors. Off you go!


So, if you have never assembled and set up a cluster before, it is time to learn. After that, there are 2 days for running applications. These are the same tasks as in the qualifying round, but with different inputs. In addition, 3 new tasks are added: two become known along with the invitation to the final, and one more is handed out on the day of the competition. On each day of the competition you need to run 3-4 tasks; the best result is entered on the final sheet, which is handed in at the end of the day. A special screen (and a web page) shows the power consumption of each team's cluster. If it exceeds 3000W, the university logo is highlighted in red and a siren starts wailing across the whole hall. The result of such a run is not counted.

Unlike the preliminary stage, the scoring system for the final is spelled out in the rules, and a large table on the wall of the hall is filled in with the results for each task.

After the mysterious scoring of the qualifying round, such transparency is extremely pleasing. Benchmarks are worth 7.5 points, the remaining tasks 15 points and the team presentation 10 points, 100 in total. In general, points are awarded as follows: the team with the best result gets the maximum, and the other teams get points based on the ratio of their result to the leader's result.
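The ratio-based scoring can be illustrated with a small sketch. It assumes plain linear scaling against the leader, which is my reading of the rule rather than the organizers' exact formula; the function name, the team names and the numbers are hypothetical. For throughput-style results (such as Linpack performance) higher is better, while for run times lower is better, so the ratio is inverted.

```python
def proportional_scores(results, max_points, higher_is_better=True):
    """Scale each team's result against the leader's result.

    results: {team: positive numeric result}
    Assumption: points = max_points * (ratio to the best result).
    """
    if not results:
        return {}
    if any(value <= 0 for value in results.values()):
        raise ValueError("results must be positive")
    if higher_is_better:
        best = max(results.values())
        return {team: max_points * value / best for team, value in results.items()}
    # Lower is better (e.g. run time): the fastest team gets the maximum.
    best = min(results.values())
    return {team: max_points * best / value for team, value in results.items()}


if __name__ == "__main__":
    # Hypothetical benchmark results for three teams, scored out of 7.5 points.
    benchmark = {"team_a": 30.0, "team_b": 24.0, "team_c": 15.0}
    print(proportional_scores(benchmark, max_points=7.5))
```

In this made-up example the leader gets 7.5 points, a team at 80% of the leader's result gets 6.0, and a team at half the leader's result gets 3.75.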
Prizes and awards. Every team that reaches the final receives a cup and a diploma. There are awards for 1st and 2nd place, for the best Linpack result and for the best result in some other individual applications. If the best result was achieved by a team that has already received one of the above prizes, the prize goes to the next-best team. You can also actively promote your team and the competition on social networks (Twitter or the Chinese WeChat) and win the Best Popularity Prize.

By the way, the money for the Best Popularity Prize still has not reached us. I hope that transfers within China are faster and that at least the Chinese teams have received their awards.
Because all 3 competitions share common roots, the teams that take 1st and 2nd place in ASC are automatically invited to the competition in Germany.
Conclusion
These competitions are interesting because they give you a chance to work with “big computers” and scientific software, and to see a side of IT that is different from commercial development. I would be glad if this has interested someone and new Russian universities appear in the list of participating teams this year. Every year Russian teams beat everyone at ACM ICPC and, I hope, after a couple of years of training they can do the same in HPC :)
Links for those interested:
The blog of an American journalist who covers each of these competitions.
Registration for ASC 2018