Our programming championship begins at the end of May. It will be held online and will let you test yourself in one of four tracks: backend development, frontend development, machine learning, or data analytics. The tasks for the tracks were developed in the Machine Intelligence and Research division, Search, and Geoservices.

All participants must first pass the qualifying round. After submitting an application, you choose when to take it. The qualifying round lasts 4 hours and includes 4 to 6 tasks. We will invite the best participants to the final, which will be held on June 1, also online. Results will be announced on June 5. The winner in each track will receive 300 thousand rubles, second place 150 thousand rubles, and third place 100 thousand rubles. Registration is open and will last until the final day of the qualifying round, May 26, but it is better to apply early.
In this post we share our experience of running such contests: who takes part in them and how we prepare complex algorithmic tasks.
* * *
The championship develops an idea that we implemented in 2017–2018 in the Yandex.Blitz series. The difference is that Blitz was just a series of separate competitions in different tracks. They shared only a format and were held at completely different times. Be sure to read the Habr posts that analyze the tasks of each competition: machine learning, backend, frontend, and mobile development.
While preparing the championship and this post, we talked a lot with people who placed high in Blitz and then joined Yandex. It was important for us to take into account their real experience and the participant's point of view in order to make the competition even more transparent and interesting.
Why it is worth participating
The championship, like the last Blitz, is a shortcut into the company: top participants will be able to join us through a simplified interview process. But we are by no means waiting only for those who are looking for work and considering Yandex. We expect two more categories of developers to join the contests. The first is people who are interested in algorithms, do competitive programming, and take part (or used to take part) in many contests and other competitions. We will offer them worthy tasks and an interesting experience to add to their collection.
The second category is experienced programmers and analysts. They will have the opportunity to demonstrate their experience and background, because we have made the tasks very diverse. This distinguishes the championship from Kaggle contests, neither for better nor for worse: Kaggle simply offers somewhat different opportunities. There, task authors usually provide problem statements and data that let you test yourself in one particular area, and participants have time to study it if they wish. The rounds of our championship last a matter of hours and capture what you know right now. You may not be versed in, say, speech technologies or computer vision, yet still show the kind of thinking that will let you dive quickly into any topic later. Of course, the comparison with Kaggle is relevant only for the ML track of the championship.
Production-like
So the main idea remains the same: to offer participants tasks that are close to production, the kind that Yandex developers and analysts actually face. This lets you understand the level and specifics of these tasks and see what issues you would deal with if you joined the company. In addition, the tasks we have written for the contest will help participants assess how well they have leveled up in specific areas and whether they have ideas that could really be turned into improvements to services and applications.
Those who took part in Blitz in 2017 and 2018 saw that the tasks were partly shaped by their originals in production projects. But production development at a large company often comes down to the need to understand algorithms, even in fields seemingly as distant from algorithms as frontend and mobile development. So the participants themselves often rated the contests on these two topics as close to production. The other two contests, on algorithmic programming and machine learning, would have required an understanding of algorithms even without any production subtext. That subtext was there too, but it could not always be discerned from the problem statements. This, however, did not stop participants from competing, or us from realizing the main idea of Blitz.
Task ideas
When tasks for a competitive programming contest are not invented from scratch but are based on tasks that actually arise in our services, the process of writing them is completely different. The reason is that, inside a service, a manager or colleague brings a task to the developer in a different wording and a different context than when contest organizers present a problem statement to participants. A full-time programmer or even an intern, especially one who has worked at the company for a while, is much more deeply immersed in the processes of the department than an external developer, however talented. The problem cannot be put to an outsider in the same way, especially since a contestant has to come up with a solution in much less time. The development environment differs too: a contestant has only an input file and an output file, while an employee works in a repository, in internal interfaces, with all the tooling.
"Cleaning" conditions
So we took tasks from the production environment, but each time we asked ourselves whether participants would understand them. Sometimes it turned out that to make a problem statement understandable to a wide audience of developers, we would need to write a long preamble, introduce terminology that a specialist inside the company has known for a long time, and so on. That approach would not always work: in a competition the statement has to be concise, so that you can read it quickly and move on to the most important thing, which is developing a solution. Therefore, whenever the preamble would have made the statement too cumbersome, we tried to reformulate the task so that no preamble was needed. A different wording was also often required because the original task contained internal Yandex information that cannot be disclosed outside the company. As a result, a task could become more abstract and less similar to its counterpart in production.
Interestingly, the opposite situation, in which a task could be stated concisely right away without losing its closeness to production, often produced a difficult task. This was evident, for example, in the final of the machine learning Blitz, in the problems on image recognition. This year's championship is no exception. Among other things, participants will face tasks on machine translation: concisely stated, difficult to implement, and genuinely taken from a production project (Yandex.Translate).
What we check
A question arises: by making a task more abstract than its production counterpart, are we not simplifying it? In a sense, yes: solving it no longer requires experience with Yandex's internal infrastructure or prior communication with colleagues. You do not need to be familiar with the code review process, you do not need to make the code beautiful, and so on. But we keep the most informative part of each task, the part that requires algorithmic thinking. And if you solve it, even in a somewhat simplified form, that still means you are an excellent programmer. An excellent programmer will quickly learn the internal infrastructure, get used to the code review process, and switch from the competitive style of writing code to the industrial one. It is like basketball: the main thing for a player is height and a good understanding of the game, while shooting can be taught.
We mentioned algorithmic thinking, by which we mean the ability to implement the required algorithm using only your chosen language, without additional libraries. Most likely, in real work (both before and after the competition) you will use various libraries that simply call the necessary algorithms and greatly reduce the amount of code. The ability to plug them in is exactly the kind of thing that "can be taught." We are more interested in making sure that when you call a library, you understand what it does and how. If you know the algorithms from the inside, you will use them more effectively, even when you do not have to implement them yourself.
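As a small illustration of the difference between calling a library and knowing what it does, here is a minimal sketch of a hand-written binary search checked against Python's standard `bisect.bisect_left`. The function name `lower_bound` and the sample data are our own assumptions for this example; this is not a championship task.

```python
from bisect import bisect_left


def lower_bound(items, target):
    """Return the first index i with items[i] >= target.

    `items` must be sorted in ascending order. The result equals
    len(items) when every element is smaller than `target`.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1  # the answer lies strictly to the right of mid
        else:
            hi = mid      # mid may be the answer, so keep it in range
    return lo


# Hypothetical sample data: compare the hand-written version
# with the library call on a few probes.
data = [1, 3, 3, 7, 10]
for probe in (0, 3, 8, 11):
    assert lower_bound(data, probe) == bisect_left(data, probe)
```

Writing the loop by hand makes the half-open interval invariant explicit, which is the kind of detail that is easy to overlook when only the library call is used.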
Analytics competition
When describing the championship tasks in this post, we often refer back to Yandex.Blitz. But this time participants can also choose data analytics, a track for which we never held a Blitz. It is a new track with its own specifics. If you choose it, knowledge of algorithms will still be a plus, but to a lesser extent than in the machine learning or backend development tracks.
The general idea here is the same as in the other tracks: to test the skills that specialists at Yandex actually use. So the question is which skills are useful.
The key skills of a good analyst at Yandex are the ability to generate hypotheses and to extract a useful signal from vague task statements and from ambiguous or noisy data. Our analysts, as a rule, write in Python and work with large data streams, for example Yandex.Metrica logs, user sessions, and technical server logs.
For solving the analytical problems of the championship, as well as for further work at Yandex, it is very useful to know the basics of mathematical statistics and probability theory. This is the foundational knowledge that helps you draw correct, data-based conclusions about processes.
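To give a feel for the kind of basic statistics meant here, below is a minimal sketch of a two-proportion z-test that uses only the Python standard library. The function name, the scenario (comparing click rates of two page variants), and the numbers are all made up for illustration; they are not taken from the championship or from Yandex data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Uses the pooled proportion for the standard
    error, which is the usual choice under the null hypothesis that
    both groups share the same underlying rate.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: 100 clicks out of 1000 views for variant A,
# 150 clicks out of 1000 views for variant B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The normal approximation behind this test is reasonable only for fairly large samples; with small counts an exact test would be a safer choice.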