
Four paths from the Yandex School of Data Analysis

Yandex has been training data science specialists since 2007. Students value the School of Data Analysis (ShAD) for its up-to-date curricula and courses, but they do not always understand what awaits them after graduation. Working with data at Yandex or at another large company? But doing what, exactly?



Initially, the School had two departments: computer science and data analysis. In 2014, when big data came into vogue, a third specialization appeared - big data. This year, to give students a clearer picture of their prospects from the start, we reformed the departments: training will now be organized into four professional tracks. Our primary task is to tell students about the possible paths of development and help them understand which courses will help them reach their goal.

The professional tracks were not chosen at random - these are the four paths that graduates most often take after completing the ShAD (and some already during their studies). For each of these four paths, we found one graduate who chose it and talked with them to understand which courses turned out to be the most useful for their later work and how they chose their profession.

Data scientist (Nikita Popov, graduate of 2016):

“Analysts of all stripes are called data scientists these days. We at Yandex are used to thinking that a data scientist is a person who is fluent in machine learning and statistics and, most importantly, can extract useful information from a huge amount of data in practice.

I am currently working on the Search metrics team. We evaluate the quality of our search, decide which direction to take, and determine which of the many experiments we run will really increase the user's “happiness.” I joined the team through an internship immediately after finishing the ShAD. The School of Data Analysis gave me an excellent foundation: the courses in machine learning and probabilistic models are exactly what I use every working day.

When I applied to the ShAD, I still did not understand what I wanted to do, and I applied mostly to keep my classmates company, but from the first seminars it became clear that the ShAD was extremely interesting. It was there that I understood what I wanted to do. I think that every data scientist should be well versed in various machine learning methods, be aware of their pros, cons and scope, and be able to find dependencies in the data and draw the right conclusions from them. Although I work as an analyst, I very often have to do development. Recently I finished a service for which I developed the frontend, the backend, and the algorithms themselves - a data scientist should be able to do everything.”

Machine learning developer (Zhenya Zakharov, graduate of 2018):

“Even at university, I most liked the problems where mathematics plays a significant role but the result is something you can “touch with your hands”. My current work meets both conditions quite well: we implement various algorithms, modifying them along the way so that they work faster, higher, stronger on our data. One of the key indicators for us is performance. There is a lot of data, and the algorithm should be able to predict quickly and train in a reasonable time.

I did a lot of programming at university, but the ShAD courses stand out for their algorithmically more complex tasks, with a greater emphasis on performance and code cleanliness.

ShAD gave me a good set of basic skills that I use every day: machine learning in its various guises, applied statistics, algorithms, and an idea of what production code should look like. The project from the big data course turned out to be very relevant: my teammates and I wrote a gradient boosting implementation, trying to catch up with LightGBM in speed. We did not catch up, but we still managed to achieve a comparable time.”

Big Data Infrastructure Specialist (Vlad Bidzilha, graduate of 2017):

“Since high school I have wanted to do programming professionally. I entered the ShAD when I was in my third year of university. It opened up a marvelous new world of machine learning and data mining, and of high-performance systems with a host of algorithms at the intersection of applied mathematics and programming.

For several years, I worked at Yandex on the video search ranking quality team. The ShAD courses on advanced C++ and Python helped me get up to speed quickly - to move from writing academic programs at university to serious production code at the company.

Recently, I have been working in the distributed computing technology service. We are developing the YT MapReduce system: habr.com/company/yandex/blog/311104 . Here, the knowledge and skills acquired at the ShAD were also extremely useful. The course on classical algorithms and data structures instilled an algorithmic culture and developed the ability to quickly write efficient, clean code with a minimum of bugs and an understandable structure, and to understand complex algorithmic solutions. The course on algorithms for working with large volumes of data demonstrated the difficulties that arise when processing a data set that does not fit in computer memory, along with methods for dealing with them; it provided an understanding of the basic patterns for building external-memory and streaming algorithms and developed basic practical skills in writing them. The course on parallel and distributed computing introduced the main constructs of multi-threaded and distributed programming, which are used everywhere in the system we develop.

In addition, it is worth noting that thanks to the ShAD I was able to study in depth applied mathematical courses that often remain outside the classical university program: information theory and computational complexity, advanced discrete mathematics, statistical analysis, combinatorial and convex optimization. This knowledge links theoretical mathematics with the high-tech IT industry.”

Data Analysis Specialist in Applied Sciences (Nikita Kazeev, graduate of 2015):

“I work on applying machine learning methods to problems of fundamental physics at CERN, as a postgraduate student at HSE and the Sapienza University of Rome.

I became fascinated with physics in school, was a prize-winner of the All-Russian Olympiad, and went to the FOPF at MIPT. Largely out of idealistic considerations - if you are not doing science, then what? But I was always drawn to computers. My bachelor's thesis was devoted to computer modeling of non-ideal plasma, and it involved many algorithms and a lot of C++.

In my fourth year I entered the ShAD, and a year later I was invited to join the newly formed group for international educational and research projects at Yandex. It has since become LAMBDA, a joint laboratory of Yandex and HSE. We not only do things with our own hands but also teach machine learning to physicists, so in a sense I have taught at Oxford. At our summer school, but still ;)

What from the ShAD came in handy? Many things.


Anyway, if you are ever in Geneva, come and visit, it is interesting here :)”

Source: https://habr.com/ru/post/422761/

