On the results of the MERC-2017 contest: an interview with the winners

The winners and finalists of the MERC-2017 competition, held by Neurodata Lab on the Datacombats platform, are more than impersonal lines on a leaderboard. They are young specialists with different professional and research interests, backgrounds and skills. As a final touch to the story of our very first contest, we decided to interview them. We hope this material gives readers of the blog as much food for thought as it gave us, the organizers.

We should note right away that the list of finalists changed unexpectedly: first and second places stayed the same, but third place went to the participant who finished fourth on the leaderboard. The participant who originally took third place did not provide the data required for the prize payment within the period specified by the rules. Yes, we were surprised too.

So, our final TOP-3:

1st place - (tEarth) Denis Vorotyntsev
2nd place - (10011000) Pavel Ostyakov, Alexey Kharlamov
• Alexey: writing the basic pipeline, proposing hypotheses, testing implementations of recurrent layers and frameworks, tuning hyperparameters and training second-level models.
• Pavel: optimizing the pipeline, training models, testing approaches to data processing.

3rd place - (FedotovD) Dmitry Fedotov

We asked the winners about their education, their plans, and the specifics of their solutions, and also asked them to share ideas for new competitions.

1. Please tell us about your education. What influenced your choice of university and specialty? How did you come to machine learning and data analysis?

Denis Vorotyntsev (MPEI, Lappeenranta University of Technology): my alma mater, the Moscow Power Engineering Institute, has long run a cooperation program with a number of educational organizations, including the Lappeenranta University of Technology. The best students are selected to study related disciplines in Finland and to write their thesis there. In my opinion, this practice is good both for the student, in terms of new impressions, acquaintances and knowledge, and for the university, in terms of international cooperation and exchange of experience in education and science. This is especially important for breakthrough areas like renewable energy, where there is still room for discovery and improvement.
I chose my university because I wanted to solve interesting problems at the intersection of science and business. Six years ago, training in renewable energy at the Moscow Power Engineering Institute was the best in the country, and its quality has only improved since. My department, the department of hydropower and renewable energy, gave me opportunities for research and for taking part in technical competitions and conferences. At one of these conferences I heard a talk on forecasting photovoltaic power generation a week ahead, and on how important this is for the station owner and for the power grid as a whole. The solution presented was quite simple, but even so it gave good accuracy. At the time it seemed to me that I could do better, although my knowledge was limited to a couple of articles on Habr. I began to study articles and books, took courses on Coursera and edX, and joined the Open Data Science (ODS) community. A month later I had my first “machine learning” model: a linear regression written entirely in MATLAB! Haha
After that I took part in a number of machine learning contests on Kaggle and similar platforms. Mediocre results motivated me to work more and more. From each contest I took away something new to use in the next one: to compete with the top of the leaderboard, I had to switch from MATLAB to R and then to Python, try new libraries and technology stacks, watch machine learning training sessions and online courses, and read books and articles. As the saying goes, you have to run as fast as you can just to stay in place.

Pavel / Alexey (Faculty of Computer Science, National Research University Higher School of Economics): HSE is an innovative university that successfully combines new technology with traditional teaching methods. It is also worth noting the increased mobility, that is, the opportunity to study abroad, as well as the option of getting additional education outside your core program (a minor).

Dmitry (SibSU, Ulm University): I am currently a graduate student at SibSU (Reshetnev Siberian State University of Science and Technology, Krasnoyarsk), and I am working on my PhD thesis jointly with Ulm University, where I am now on a research grant. My specialty at the Russian university is system analysis; at the German one it is communication technology: interactive systems. I chose my field of study in Krasnoyarsk based on my own preferences, professional trends and the advice of friends. From my second year on, I did research in evolutionary algorithms and neural network technologies, applying what I learned to practical data analysis tasks. In 2015 I began working on emotion recognition together with a research group at Ulm University, with which close cooperation had been established by that time.

2. How did you find out about our competition? What attracted you to it?

Denis: I read about the contest in ODS. The description sounded like this: emotion recognition from video. At the time it seemed very difficult to me, because you would need a long pipeline: video processing, face detection, feature extraction from the faces, then building a model. You could learn a lot of interesting things by working on such a hard task! Honestly, I was a little upset when I saw anonymized variables in the training data rather than a set of video files. However, there was still room for creative work. I am pleased with the result, because I solved an interesting problem and learned a lot of new things.

Pavel / Alexey: we learned about it from mltrainings.ru. We were attracted by an interesting problem with a non-standard setting, by the opportunity to work with several modalities at once, and by the chance to compete with other participants on a new platform.

Dmitry: I found out about it while looking for information on the AVEC (Audio/Visual Emotion Challenge) contest. What attracted me was a topic very close to my own and a pressing task: developing a system that can work reliably with several intermittently available modalities and with data sampled at different rates.

3. Are there any approaches or ideas that you did not have time to work out during the competition? Do you think the result can be improved?

Denis: I did not have time to try everything I wanted, partly because of circumstances, partly because of laziness, and partly because of my good position on the public leaderboard. If I were solving a similar problem for a business, I would definitely try an LSTM combined with PCA (or another dimensionality reduction method, for example selecting important features with RandomForest), as well as convolutional networks. Obviously, stacking several different models (and different approaches to data processing) would increase accuracy, but it would complicate using the model in production.

Pavel / Alexey: the result can certainly be improved. You could try different recurrent cells and architectures. It would also have been possible to build a second-level model instead of simply averaging the predicted probabilities. The idea is good, but with so little data you would have to spend a fair amount of time fighting overfitting.

Dmitry: there are no fundamentally different approaches that I wanted to work out during the competition but did not have time for. I am sure the result can be improved, and I will certainly keep working in this direction. Experiments on multimodal emotion recognition with missing data will be part of my thesis, and the database provided by the organizers will serve as the basis for them.

4. Why was so little attention paid to the problem of gaps in the data, and what other methods would you try to solve it?

Denis (replacement by average value): at the beginning of the competition I considered two approaches. The first was to merge all the data into a single data frame and train the model either with the gaps left in or with the gaps replaced by some value (the mean gave the best cross-validation result).
The second approach was to build a separate model for each modality and then combine them with a second-level model. It seemed to me that logistic regression would be the best second-level model. However, this would require not one but several models, one for each possible combination of available modalities; in theory there should be 2^4 - 1 = 15 of them. That did not seem like an elegant solution to me, so I put it on the back burner.
The first method gave a good position on the leaderboard, so I never tried the second approach.
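
As a rough illustration of the two ideas above, the sketch below fills gaps in a column with the mean of its observed values and enumerates the non-empty combinations of four modalities. The modality names and the toy column are hypothetical and are not taken from the contest data or from Denis's actual code.

```python
from itertools import combinations
from statistics import mean


def fill_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        return list(column)  # nothing to estimate from, leave the gaps
    col_mean = mean(observed)
    return [col_mean if v is None else v for v in column]


# Hypothetical names: the interview only implies that there were four modalities.
MODALITIES = ["modality_a", "modality_b", "modality_c", "modality_d"]


def non_empty_subsets(items):
    """Every combination of available modalities except the empty one."""
    return [c for r in range(1, len(items) + 1) for c in combinations(items, r)]


print(fill_with_mean([1.0, None, 3.0]))    # [1.0, 2.0, 3.0]
print(len(non_empty_subsets(MODALITIES)))  # 15, i.e. 2**4 - 1
```

Each of the 15 subsets would need its own second-level model, which is why the second approach looked unwieldy.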

Pavel / Alexey (replacement with zeros): while building the solution we tried several basic approaches (filling with the mean, with higher or lower values, and others), but filling with zeros turned out to be optimal for our models, both in speed and in final quality.

Dmitry (replacing with zeros): I devoted a lot of time to replacing and handling the missing data and spent about half of my submissions on it. The choice of such a trivial and uninteresting method is explained by the fact that all the other options, despite their apparent benefit and logic, did not improve accuracy. Most of the ways of accounting for missing data were applied at the post-processing stage, after predictions had been obtained for each modality. For example, when calculating the final prediction, only the modalities for which data were available were taken into account. I also applied weighted prediction, where the weight depended on the share of frames used that were non-zero. For example, suppose the LSTM model uses the 15 previous frames and the first 5 of them are zero. In this case the prediction gets a weight of about 0.66 (10 of the 15 frames are non-zero) when the final solution is calculated. I also tested training the models only on non-zero data, or with weights (similar to the post-processing variant). None of the options described, or ones close to them, showed an improvement on the public part of the test set. Despite this, I am sure that such processing can improve the accuracy of a model with missing data, and I will keep looking for a working method on the provided database.
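
The weighting idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not Dmitry's implementation: the function names, the way a "zero frame" is detected and the toy numbers are all hypothetical.

```python
def frame_weight(window):
    """Share of frames in the window that contain at least one non-zero value."""
    if not window:
        return 0.0
    non_zero = sum(1 for frame in window if any(v != 0 for v in frame))
    return non_zero / len(window)


def fuse(predictions, weights):
    """Weighted average of per-modality class probabilities."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no modality has usable data")
    n_classes = len(predictions[0])
    return [
        sum(w * p[k] for p, w in zip(predictions, weights)) / total
        for k in range(n_classes)
    ]


# 15 frames of history, the first 5 of which are all zeros (missing data).
window = [[0.0, 0.0]] * 5 + [[0.3, 0.1]] * 10
weight = frame_weight(window)  # 10 / 15

# A modality with weight 0 is effectively excluded from the final prediction.
fused = fuse([[0.8, 0.2], [0.1, 0.9]], [weight, 0.0])
```

With a weight of zero for a modality that has no data at all, the same function also covers the simpler rule of using only the available modalities.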

5. We were surprised that every participant in the TOP-3 chose either decision-tree-based methods (XGBoost, etc.) or LSTM. What determined your choice?

Denis (LightXGB): you could say that gradient boosting methods are the modern “golden hammer” of machine learning, as SVM was several years ago. They are easy to use out of the box: compared to neural networks, you do not need to think much about data preprocessing, such as filling in missing values and handling categorical variables. However, as they say, there is no free lunch: to achieve outstanding results you need to try many options. In this competition the best result depended more on the strategy for preprocessing the data and post-processing the results (smoothing the predicted probabilities over time) than on the chosen model.
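
One simple way to smooth predicted probabilities over time is a centered moving average, sketched below. The interview does not say which smoothing method or window size was actually used, so both are assumptions made for illustration.

```python
def smooth_probabilities(probs, window=5):
    """Centered moving average over time, applied to each class column.

    probs is a list of per-frame probability vectors in temporal order.
    Near the edges the window is truncated to the frames that exist.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for t in range(len(probs)):
        lo, hi = max(0, t - half), min(len(probs), t + half + 1)
        chunk = probs[lo:hi]
        smoothed.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return smoothed


# A single-frame flip in the middle is damped by its neighbours.
raw = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
result = smooth_probabilities(raw, window=3)
```

Because each output row is an average of probability vectors, it still sums to one, so the smoothed values can be used directly for the final class decision.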

Pavel / Alexey (ensemble of LSTM models): we chose recurrent neural networks because they are the most powerful tool for working with sequences of data of this kind. Of course, if the datasets had contained the actual video frames, convolutional LSTM networks could have given much better results.

Dmitry (LSTM with a small number of parameters): to be honest, in this competition I focused not on testing different models but on data processing. My first submission used an LSTM model I had used before, with 2 layers of 24 and 12 LSTM blocks respectively. This model scored 0.537852, which immediately put me in 2nd place. In later submissions I made only 2 changes to the models: I halved the size of the recurrent network (to 12-6 blocks) and made the layers bidirectional. I saw huge potential for accuracy gains in the pre- and post-processing of the data, not in the models.
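
To give a feel for how small such networks are, the sketch below counts trainable weights in a stack of LSTM layers. It assumes the standard LSTM cell with four gates and a single bias vector per gate; the input dimension of 10 is purely hypothetical, as is the reading of "12-6" as the number of blocks per direction.

```python
def lstm_params(input_dim, units, bidirectional=False):
    """Weights in one LSTM layer: 4 gates, each with input, recurrent and bias terms."""
    per_direction = 4 * (units * (input_dim + units) + units)
    return per_direction * (2 if bidirectional else 1)


def stack_params(input_dim, layer_units, bidirectional=False):
    """Total weights in stacked LSTM layers (output layer not included)."""
    total = 0
    for units in layer_units:
        total += lstm_params(input_dim, units, bidirectional)
        # A bidirectional layer passes on the concatenation of both directions.
        input_dim = units * (2 if bidirectional else 1)
    return total


INPUT_DIM = 10  # hypothetical feature count per frame

print(stack_params(INPUT_DIM, [24, 12]))                     # 5136
print(stack_params(INPUT_DIM, [12, 6], bidirectional=True))  # 3696
```

Under these assumptions both variants stay within a few thousand parameters, which fits the description of a deliberately small model.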

6. Did you learn anything useful or interesting from this competition? In which direction do you plan to move next?

Denis: definitely yes. I tried various approaches to data visualization and dimensionality reduction. They did not improve the result, but they gave me food for thought and invaluable experience for further work. Right now I want to fight like a gladiator in every competition, with a preference for computer vision tasks.

Pavel / Alexey: yes, the competition turned out to be very useful. We gained a lot of experience in fighting overfitting in recurrent models, and in then using those models to build second-level models when there is not enough training data.

Dmitry: I certainly gained experience and ideas for working with missing data in a multimodal setting. Missing data gets little attention, although the problem always arises when data is collected in real conditions. For example, my colleagues and I recently collected data for an emotion recognition project, and one of the modalities was eye movement and gaze direction. The device specially designed for this purpose stopped collecting data during sudden changes of lighting or when the subject looked toward bright sunlight. Other sensors also produced short-term failures, mainly in the form of outliers. This is a situation similar to the one in the competition, so I consider the knowledge I gained very useful in practical terms.
In the future I will continue to work on emotion recognition systems, paying more attention to missing data.

7. We do not rule out that one of our next competitions will focus on speech processing. Which problem in this area would you be interested in solving? What other topics would you like to see in contests and competitions?

Denis: everything connected with emotions is an interesting challenge, but there are few competitions on it. And speech processing is simply awesome!

Pavel / Alexey: 1) determining a person's temperament from their speech, 2) detecting a state of panic from speech, 3) a lie detector.

Dmitry: speech processing is not very close to my field, so it is difficult for me to answer this question. I would be interested in any emotion recognition contest. I would especially like to see a task close to real conditions, for example one based on social network data (text, photo, video).

If you also have original ideas for contests in speech processing, or related to the recognition of emotions and behavior, please share them in the comments.

Publication author:
Alexandra Smirnova, expert on crowdsourcing projects and external relations of the Neurodata Lab

Source: https://habr.com/ru/post/344268/
