
How I organized machine learning trainings at NSU

My name is Sasha, and I love machine learning as much as I love teaching people. I currently supervise the educational programs at the Computer Science Center and the undergraduate program in data analysis at St. Petersburg State University. Before that I worked as an analyst at Yandex, and earlier still as a researcher doing mathematical modeling at ICT SB RAS.

In this post I want to tell you what came of the idea to start machine learning trainings for students and graduates of Novosibirsk State University and anyone else interested.

I had long wanted to organize a special course on preparing for data analysis competitions on Kaggle and other platforms. It seemed like a great idea.


Launch


Novosibirsk’s Akademgorodok is very fertile ground for such undertakings: there are students, graduates, and teachers of the Computer Science Center and of strong technical faculties such as FIT, MMF, and FF, strong support from the NSU administration, an active ODS community, and experienced engineers and analysts from various IT companies. Around that time we also learned about the grant program from Botan Investments: the foundation supports teams that show good results in competitive ML.

We found a classroom at NSU for weekly meetings, created a Telegram chat, and launched on October 1 together with students and alumni of the CS Center. Nineteen people came to the first session, and six of them became regular participants. Over the academic year, 31 people came to a meeting at least once.

First results


We got to know each other, exchanged experience, and discussed competitions and a rough plan for the future. We quickly realized that competing in data analysis is regular hard work, much like an unpaid full-time job, but a very interesting and exciting one :) One of the participants, Kaggle Master Maxim, advised us to start each competition individually and to merge into teams only a few weeks later, taking the public score into account. We did just that! At the in-person trainings we discussed models, research papers, and the subtleties of Python libraries, and solved problems together.

The fall semester brought three silver medals in two Kaggle competitions, TGS Salt Identification and PLAsTiCC Astronomical Classification, and a third place in the CFT typo-correction contest, which earned us our first prize money (in the money, as experienced Kagglers say).

Another very important side effect of the special course was the launch and configuration of the VKI NSU cluster. Its computing power noticeably improved our competitive life: 40 CPUs, 755 GB of RAM, and 8 NVIDIA Tesla V100 GPUs.


Before that we got by as best we could: we ran computations on personal laptops and desktops, on Google Colab, and in Kaggle Kernels. One team even had a home-made script that automatically saved the model and restarted the computation whenever it was stopped by the time limit.
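The team's actual script is not shown here, so the following is only a minimal sketch of the general save-and-restart idea, under assumed details. The names `save_checkpoint`, `load_checkpoint`, and `train`, the JSON checkpoint format, and the toy "model" (a single number) are all hypothetical. Progress is saved after every epoch, and a restarted run picks up from the last saved epoch instead of starting over.

```python
import json
import os
import tempfile
import time


def save_checkpoint(path, state):
    """Write the state atomically so a killed run never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0, "weights": 0.0}


def train(path, total_epochs, time_budget_s, train_one_epoch, clock=time.monotonic):
    """Train until finished or until the time budget is spent.

    Returns True when all epochs are done and False when the caller
    should start a new session, which resumes from the checkpoint.
    """
    state = load_checkpoint(path)
    started = clock()
    while state["epoch"] < total_epochs:
        if clock() - started >= time_budget_s:
            return False  # out of time; progress is already on disk
        state["weights"] = train_one_epoch(state["weights"])
        state["epoch"] += 1
        save_checkpoint(path, state)
    return True


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "model.json")
    # Each call stands in for one time-limited session (hypothetical 60 s budget).
    while not train(checkpoint, total_epochs=5, time_budget_s=60,
                    train_one_epoch=lambda w: w + 1.0):
        print("time limit reached, restarting from checkpoint")
    print(load_checkpoint(checkpoint))
```

In a real setup the state would hold model weights and optimizer state saved with the framework's own tools, and the restart would be triggered by whatever launches the session. The atomic rename is there so that a session killed mid-write cannot corrupt the last good checkpoint.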

In the spring semester we kept meeting, sharing successful findings, and presenting our competition solutions. New interested participants began to join us. Over the spring semester we took one gold, three silver, and nine bronze medals in eight Kaggle competitions (PetFinder, Santander, Gender pronoun resolution, Whale Identification, Quora, Google Landmarks, and others), a bronze in the Recco challenge, third place in the Changellenge >> Cup, and first place (again in the money) in the machine learning competition at the Yandex programming championship.

What the training participants say


Mikhail Karchevsky
“I am very glad that activities like this are held in Siberia, because I believe that taking part in competitions is the fastest way to master ML. The hardware for such contests is quite expensive to buy yourself, and here you can try out ideas for free.”

Cyril Broadt
“Before the ML trainings appeared, I hardly took part in competitions, apart from training and Indian ones: I didn't see the point, because I already had a job in ML and was familiar with the field. I spent the first semester as a listener. Starting from the second semester, as soon as computing resources appeared, I thought, why not take part? And I got hooked. The task, the data, and the metric have already been invented and prepared for you: just take them and use the full power of ML, and test state-of-the-art models and techniques. If it were not for the trainings and, equally important, the computing resources, I would not have started competing any time soon.”

Andrey Shevelev
“The in-person ML trainings helped me find like-minded people, with whom I was able to deepen my knowledge of machine learning and data analysis. They are also a great option for those who do not have much free time to study and dig into competitions on their own but still want to stay in the loop.”

Join us


Competitions on Kaggle and other platforms hone practical skills and quickly translate into interesting jobs in data science. People who have gone through a tough competition together often become colleagues and go on to solve work tasks together just as successfully. This happened with us: Mikhail Karchevsky and a teammate went to work at the same company on a recommendation system.

Over time we plan to extend this activity with scientific publications and participation in machine learning conferences. Join us as participants or experts in Novosibirsk: write to me or Cyril. And organize similar trainings in your own cities and universities.

Here is a small cheat sheet that will help you take the first steps:

  1. Pick a convenient place and time for regular sessions, ideally once or twice a week.
  2. Write to potential attendees about the first meeting. These are, first of all, students of technical universities and ODS members.
  3. Start a chat for discussing current affairs: Telegram, VK, WhatsApp, or any other messenger that is convenient for most people.
  4. Maintain a publicly available lesson plan and a list of competitions and participants, and keep track of the results.
  5. Look for free computing power, or grants to pay for it, at nearby universities, research institutes, or companies.
  6. PROFIT!

Source: https://habr.com/ru/post/458042/

