The second run of the OpenDataScience open course on data analysis and machine learning starts on September 6, 2017. This time there will also be live lectures, hosted at the Moscow office of Mail.Ru Group.
In short, the course consists of a series of articles on Habr (here is the first), reproducible materials (Jupyter notebooks, here is the GitHub repository of the course), homework assignments, Kaggle Inclass competitions, tutorials, and individual data analysis projects. Here you can sign up for the course, and here you can join the OpenDataScience community, where all communication during the course will take place (the #mlcourse_open channel in the ODS Slack). For more details, read on.
The purpose of the course is to help you quickly refresh your knowledge and find topics for further study. It is unlikely to work well as your very first course on the subject. We did not set out to create a comprehensive course in data analysis and machine learning; instead, we wanted a course that balances theory and practice well. The algorithms are therefore explained in reasonable detail, with the mathematics, and practical skills are reinforced by homework, competitions, and individual projects.
A big advantage of this particular course is its lively forum (the Slack of the OpenDataScience community). In a nutshell, OpenDataScience is the largest Russian-speaking community of data scientists, which does a lot of cool things, including organizing Data Fest. The community is very active in Slack, where any member can get answers to their data science questions, find like-minded people and collaborators for projects, find a job, and so on. A separate channel has been created for the open course, where 300-400 people studying the same material as you will help you master new topics.
When choosing the format for the material, we settled on Habr articles and Jupyter notebooks. Live lectures and their video recordings will now be added as well.
Prerequisites: you need to know mathematics (linear algebra, analytic geometry, calculus, probability theory, and mathematical statistics) at the level of the second year of a technical university. You also need to be able to program a little in Python.
If you lack some of this knowledge or these skills, the first article of the series describes how to review the math and refresh (or acquire) Python programming skills.
And yes, a good command of English and a good sense of humor will not hurt either.
We chose Habr and the article format so that at any time you can quickly and easily find the part of the material you need. The articles are ready; in September-November they will be partially updated, and one more article, on gradient boosting, will be added.
Articles in the series:
Lectures will be held at the Moscow office of Mail.Ru Group on Wednesdays from 19:00 to 22:00, from September 6 to November 8. The lectures will cover the theory, broadly following the same plan as the articles. There will also be live walkthroughs of problems by the lecturers, and the last hour of each lecture will be devoted to practice: the students themselves will analyze data (yes, actually write code), with the lecturers helping them. The top 30 course participants in the current rating will be able to attend each lecture in person. The rating is determined by homework, competitions, and data analysis projects. The lectures will also be broadcast.
Lecturers:
If you wish, you can read about all the authors of the course articles here.
Each of the 10 topics is accompanied by a homework assignment, with 1 week to complete it. The assignment is a Jupyter notebook in which you need to add code and, based on the results, select the correct answers in a Google Form. Homework is the first thing that affects the rating of course participants and, accordingly, who can attend the live lectures.
You can already see 10 homework assignments with solutions in the course repository. The new run of the course will have new homework.
One of the creative tasks during the course is to choose a topic in data analysis and machine learning and write a tutorial on it. You can see examples from the previous run here. The experiment was a success: the participants themselves wrote several very good articles on topics that were not covered in the course.
Of course, you cannot get anywhere in data analysis without practice, and competitions are exactly where you can learn something very quickly and learn how to apply it. In addition, the incentives (money and rating on the "big" Kaggle, or simply the rating in our course) encourage very active study of new methods and algorithms during a data analysis competition. The first run of the course offered two competitions with very interesting tasks:
From the VKontakte public page "Memes about machine learning for adult men."
The course runs for 2.5 months, and a lot of activities are planned. Still, do consider carrying out your own data analysis project from start to finish, following the plan proposed by the instructors but using your own data. Projects can be discussed with fellow participants, and at the end of the course the projects will be peer-reviewed.
Details about the projects will come later, but for now you can think about what data you would take to "predict something" from. If you have no ideas, that is fine: we will suggest some interesting tasks and datasets for analysis, at varying levels of difficulty.
To participate in the course, fill out this survey and join the OpenDataScience community (in the field "Where did you learn about OpenDataScience?" answer "mlcourse_open"). Most communication during the course will take place in the #mlcourse_open channel of the OpenDataScience Slack.
The first run took place from February to June 2017. About a thousand people signed up, 520 completed the first homework, and 150 completed the last one. The forum was in full swing, several thousand submissions were made in the Kaggle competitions, and the course participants wrote about a dozen tutorials. Judging by the reviews, participants gained excellent experience that lets them dive further into neural networks, Kaggle competitions, or machine learning theory.
A bonus for the top 100 finalists of the course was a meetup at the Moscow office of Mail.Ru Group, with 3 lectures on current topics in modern data science:
Last but not least, some good news: from mid-November 2017, right after this introductory machine learning course, we will go through one of the best courses on neural networks together in the #mlcourse_open channel of the ODS Slack: the Stanford cs231n course "Convolutional Neural Networks for Visual Recognition."
Good luck in learning this beautiful discipline, machine learning! And here are these two gentlemen for motivation.
Andrew Ng interviews Andrej Karpathy as part of the Deep Learning specialization.
Source: https://habr.com/ru/post/334960/