Why data analysis?
The need for data analysis has long since spread beyond technology and Internet companies. Machine learning methods are increasingly used in very different fields, all the way to the optimization of transport routes. With their help, new medicines and driverless cars are created, music is selected to match a mood, and potential life partners are found.
Data analyst, or data scientist, is one of the most sought-after professions today. There is real competition for practitioners who can get meaningful results quickly, and the salaries of such specialists are soaring.
Interest is also fueled by government and commercial organizations, which not only talk about these specialties but are already preparing the first Olympiads in them.
What is behind these words, and does everyone understand their meaning? Unfortunately, they are often treated as a magic ingredient that solves every problem. Neither the limits of their applicability nor the steps needed to use them "here and now" are well understood.
It is time to clarify this issue.

Where can I learn this now?
Data analysis is a field in which the understanding of “how to teach it” is still taking shape. Leading universities are creating master's programs but, unfortunately, do not always keep up with new approaches and tools.
Therefore, the most suitable place to learn is a high-tech company for which working with data is the very essence of the business. One such company, without a doubt, is Yandex.
Combining the efforts of leading scientists from MIPT and practitioners from Yandex, we have prepared a specialization in machine learning and data analysis that will let you master a new profession and take your first steps in this interesting field.
It will be taught by:
- Konstantin Vorontsov - Doctor of Physical and Mathematical Sciences, Professor of the Russian Academy of Sciences, Head of the Department of Intellectual Systems, FIC IU RAS, Lecturer at the School
- Vadim Strizhov - Doctor of Physical and Mathematical Sciences, Associate Professor at the Moscow Institute of Physics and Technology, Leading Researcher of the Institute of Information Technologies
- Evgeny Ryabenko - Candidate of Physical and Mathematical Sciences, Associate Professor at the Moscow Institute of Physics and Technology, lecturer at VMK MSU and ShAD, data scientist at Yandex Data Factory
- Evgeny Sokolov - lecturer at VMK MSU, HSE, and ShAD, head of the Yandex Data Factory research group
- Victor Kantor - senior lecturer at FIFT MIPT, lecturer at ShAD, head of the Yandex Data Factory research group
- Emily Dral - lecturer at FIFT MIPT and PFUR, data scientist at Yandex Data Factory
How the specialization is structured
Our specialization consists of five courses and a final project.
- In the first course, we cover the basic facts of mathematics without which it is hard to understand anything in data analysis, and teach you to program in Python.
- In the second, we study learning from labeled data, also known as supervised learning: we will see how to build predictive models from a set of examples and how to evaluate their quality.
- In the third course, we talk about finding structure in data: how to do clustering, how to reduce the dimensionality of data, and how to look for anomalies.
- The fourth course is devoted to the art of turning data into conclusions: we will master the methods of statistical analysis and experimental design.
- In the fifth course, we examine in detail several large, typical data analysis tasks, such as time series forecasting and text analysis.
We tried to make each course as dense and concise as possible, so that it can be completed in about a month at an average pace. The entire specialization will thus take about half a year. The actual speed, however, depends only on the student's motivation and perseverance!
For the courses we selected only techniques and tools that work well in practice and that real researchers use in their daily work. Much of the data you will work with comes from real projects; this is the only way to understand and feel “how it happens in reality”.
In the final project, you will apply the knowledge gained to real data from one of several practical areas: e-commerce, social media, information retrieval, business analytics, and others. As a result, your portfolio will contain a project that you can confidently list on your résumé and show to an employer at an interview.
As in all courses on the Coursera platform, the core is video material, interleaved with different kinds of activities: from quizzes that check knowledge and understanding, to programming assignments with automatic grading and peer-reviewed assignments.
Understanding that students may start with very different backgrounds, we made the first course of the specialization an introductory one that serves two purposes. First, it helps refresh the basic mathematical concepts we will need later. Second, it provides basic skills in working with Python and its data analysis libraries.
To keep the specialization practical and not drown in formalism, many concepts, even in the first course, are introduced informally, with an emphasis on intuitive understanding. Adherents of mathematical rigor can still turn to the online courses from the department of discrete mathematics on Coursera or on the national open education platform. Also, very soon, full rigorous courses in mathematical analysis, linear algebra and differential calculus will appear on the LIPT Lectory.
The ultimate goal of the project is for graduates to be able to pass an interview for a data scientist position (at a level appropriate to their professional experience). That said, not everyone in our audience wants to change jobs; for some it will be enough to update their methodological toolkit and find more effective solutions to their usual work tasks. In any case, a graduate of the specialization must meet the entry standards of our profession.
When and how to start
The specialization is already available, and its first course starts on February 9. As with other specializations on Coursera, the platform imposes a prerequisite: a student who wants to complete the whole specialization and be admitted to the final project must complete all the courses in identity verification mode.
Most of the course materials are available for free, but some assignments required for a certificate are marked with a “lock” and are available only after payment. If you want access to all the assignments and the certificate but cannot pay for them, you can use the financial support program (Coursera Financial Aid). To receive aid from Coursera, you fill out a short application describing your financial situation and your reason for enrolling in the course. A very similar practice is used when requesting financial aid for admission to American universities. Last year more than 100 thousand applications were approved. To apply, follow the link under the “Register” button on the specialization page.
Forward to new knowledge: start learning!
P.S. For those who would like tutor support during the course and in-person exams leading to a state certificate of professional retraining, we are working on a special program. If you are interested, please fill out a short form.
UPD: added information about the content of the specialization courses and its objectives.
UPD 2: a full-time version of the specialization is available to MIPT students. To activate this option, write to mooc@phystech.edu.