
On May 25, the second Metaconf meeting will take place in Voronezh, this time dedicated to machine learning. The meetup program includes five talks, and registration is free. In particular, Anton Dolgikh, DataArt's expert on AI projects in healthcare, will speak about the “Neural Network Probabilistic Model of a Natural Language”. Today we asked Anton to describe how DataArt systematized its internal knowledge of machine learning.
The scope of ML keeps expanding, from healthcare to the travel industry. At some point, the number of requests for ML development at DataArt exceeded a critical value. Until then, engineers already working at the company had been able to handle such tasks.
When our own resources were no longer enough, we saw two ways forward: hire new employees or train specialists within the company. In the first case, we run the risk that a newly hired ML developer will not immediately land on a project in their professional field. At the same time, people who specialize narrowly in machine learning are usually not ready to take on, for example, full-stack development. We therefore relied on DataArt engineers who are interested in growing towards ML but can return to their previous work if necessary.
The training process has to be systematized. The Internet may seem to be full of online and video courses, but to develop productively a person needs a direction: taking courses at random brings little benefit.
What we did:
- First, we formed a core: an initiative group of colleagues with the greatest experience and expertise in various areas of machine learning. They prepared a number of presentations, reviewed existing courses, and collected recommendations on which courses to take in order to acquire the skills relevant to the tasks DataArt solves.
- We organized math courses. ML is essentially mathematical statistics and optimization methods, so understanding and correctly using machine learning methods requires certain mathematical knowledge. One might assume that specialists with a technical education know mathematics well, but practice shows that these skills fade very quickly. This constrains the course: a company, unlike a university, cannot teach the fundamentals in full, yet the material must match our tasks and go sufficiently deep. We invited an external lecturer to teach the course, since our colleagues were too busy. The program focused on areas directly relevant to machine learning: linear algebra, mathematical analysis, probability theory, and optimization methods. The course is supplemented by regular classes with experts, where we apply the theory to practical tasks from machine learning projects.
- Our ML specialists hold monthly educational seminars on the latest achievements in the field. Recordings of the seminars are available to all employees of the company.
- In addition to the seminars, DataArt's ML specialists regularly publish a digest of interesting materials (methods, articles, books) with brief annotations and comments.
The company supports these initiatives: a budget is allocated for purchasing literature, for colleagues' participation in conferences, for hardware, and for mentoring programs. Individual training under the mentoring program results in a ready-made prototype that can be shown at conferences or at meetings with potential customers. One example is the work of our expert Andrei Sorokin, a model that detects and classifies skin lesions (arxiv.org/pdf/1807.05979.pdf). An employee in the mentoring program helped optimize the resulting model for use on mobile devices. The model took 12th place in the international ISIC 2018 competition, beating not only individual participants but also university teams.
Systematizing the process in this way has allowed us to handle requests in the field of machine learning that DataArt receives from potential customers quickly and efficiently. We have prepared marketing materials, and experts who can answer customer questions are always available to the sales teams. Several projects have already been completed successfully.
Like many large technology companies, DataArt extends its expertise and educational programs to external audiences. On May 25, Voronezh hosts an open Machine Learning meetup, where participants will learn about trends in ML technologies, the problems and tasks they can help solve, and real projects that use machine learning and artificial intelligence methods.