On the last weekend of September, our team took part in M.Video's data analysis hackathon. We were offered a choice of two tasks: the first was to generate a product description from customer reviews, the second to identify the most important product characteristics based on the catalog, co-view data, and add-to-basket data. We solved both. Under the cut: the story of how we flunked this hackathon and what we learned from it.
When our team met before the start of the hackathon, we were puzzled by the lack of a clear problem statement, or even a list of wishes, from the organizers. For lack of anything better, we began generating ideas of our own:
We dreamed of a provable lift in business metrics: views, conversion, average check, loyalty. For several days we tried to extract details from an insider, but in the end we only got more confused. By the time we arrived at M.Video's office on Saturday, we had several ideas in our heads about what could be done, and not one of them turned out to match what we were actually asked to do.
It turned out that we needed either to generate product descriptions or to identify their most significant characteristics. And there would be no metrics, only the eyes of the jury.
All three engineers on our team sat down to solve the tasks the way each of us thought was right. Given the unfamiliar data and the large number of experiments, that took the whole evening. By late night we had intermediate solutions, which we kept polishing all morning. By the deadline we had three algorithms that had little to do with one another. Nobody reviewed the algorithms except their tired creators. No surprise, then, that we did not finish among the top three winners. But the algorithms themselves seem worth describing.
The first of the two hackathon tasks was formulated as an automatic text summarization problem. We had to build an algorithm that, given the set of customer reviews for a product, generates one synthetic review containing the most important information about that product. There were no formal requirements for the synthetic reviews, so we came up with our own wish list:
From these requirements we could already move to the general scheme of the algorithm:
As a result, we got synthetic reviews like this one (for a TV):
This first example shows both the pros and the cons of our approach. On the plus side, the review is readable. On the minus side, statements 2 and 3 could have been merged, but our clustering turned out to be too fine-grained. Unfortunately, we did not come up with any way to check the quality of the generated synthetic reviews other than eyeballing them. A close (but quick) look showed that the reviews lack structure but on the whole look adequate.
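The post does not show the exact pipeline, but the core idea — split the reviews into statements, cluster similar ones, and keep one representative per cluster — can be sketched roughly as below. TF-IDF features, KMeans, the cluster count, and the function name are my assumptions, not the team's actual code.

```python
# A minimal sketch of "cluster statements, pick representatives".
# TF-IDF + KMeans are assumptions; the real pipeline may have differed.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances_argmin_min

def synthesize_review(sentences, n_statements=5):
    """Pick one representative sentence per cluster of similar statements."""
    vectorizer = TfidfVectorizer(max_features=5000)
    X = vectorizer.fit_transform(sentences)

    n_clusters = min(n_statements, len(sentences))
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)

    # For each cluster, take the sentence closest to its centroid.
    closest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    return [sentences[i] for i in sorted(set(closest))]

# Usage: split all reviews of one product into sentences, then
# print("\n".join(synthesize_review(sentences)))
```

If the clusters come out too fine-grained, as they did for us, lowering `n_statements` or merging near-duplicate representatives afterwards is the obvious knob to turn.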
Another branch of the work dealt with the emotional coloring (sentiment) of the reviews. Most reviews came with a product rating from 1 to 5. Back in the qualification round we had already solved the sentiment problem: predicting the rating from the review text. To do this, each normalized word was turned into a binary feature, and a linear regression was trained on them. A large positive coefficient in front of a word in such a model means that the word usually appears in positive reviews; a negative one, in negative reviews. The larger the coefficient in absolute value, the more the word influences the product's rating.
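A minimal sketch of that sentiment model, assuming scikit-learn, a binary bag-of-words, and Ridge regularization (the post only says "linear regression"; the regularization is my addition for stability):

```python
# Each normalized word becomes a binary feature; a linear model predicts the 1-5 rating.
# Ridge is an assumption -- the post just says "linear regression".
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge

def word_sentiment(review_texts, ratings):
    """Return (word, coefficient) pairs sorted from most negative to most positive."""
    vectorizer = CountVectorizer(binary=True, min_df=3)
    X = vectorizer.fit_transform(review_texts)  # assumes texts are already normalized/lemmatized
    model = Ridge(alpha=1.0).fit(X, ratings)
    words = vectorizer.get_feature_names_out()
    return sorted(zip(words, model.coef_), key=lambda p: p[1])

# pairs = word_sentiment(texts, ratings)
# pairs[:10]  -> most "negative" words, pairs[-10:] -> most "positive"
```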
For each product we fit such a regression on all of its reviews (provided there were enough of them). Sorting the coefficients gives the most "positive" and "negative" words for that product. Unfortunately, the top of these lists tended to be dominated by generic value judgments like "disgusting." To exclude them, we had to build a special blacklist. For that, we simply fit a sentiment model on reviews of absolutely all products and pulled out its most significant coefficients; they mostly corresponded to uninformative value judgments of the "good / bad" kind.
The words that survived the blacklist filtering turned out to be genuinely informative. For the iPhone 6, for example, the word "quiet" came out as the strongest negative factor, and indeed that is one of the main complaints its owners make.
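The blacklist step can be sketched on top of the `word_sentiment` helper from the previous snippet; the cutoff of 200 words and the helper names are assumptions for illustration:

```python
# Fit the same sentiment model on reviews of *all* products and treat the words
# with the largest |coefficient| as generic value judgments to exclude.
# The top_k cutoff is an arbitrary assumption.
def build_blacklist(all_texts, all_ratings, top_k=200):
    pairs = word_sentiment(all_texts, all_ratings)
    pairs = sorted(pairs, key=lambda p: abs(p[1]), reverse=True)
    return {word for word, _ in pairs[:top_k]}

def product_keywords(product_texts, product_ratings, blacklist):
    """Per-product sentiment-bearing words with generic judgments filtered out."""
    return [(w, c) for w, c in word_sentiment(product_texts, product_ratings)
            if w not in blacklist]
```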
Probably, if we had embedded this emotional assessment of product properties into the synthetic reviews, it would have made them attractive enough to win. But each of us kept debugging his own model until the last moment, and instead of proper integration we simply pasted the output of one algorithm underneath the other.
In addition to reviews, we had data on product views and additions to the basket. We decided to find out which product properties drive the conversion of views into purchases. For every product we had characteristics from the catalog: dimensions, screen resolution, processor brand, and so on. We converted them into binary features. On these, for each product category, we fit a logistic regression predicting the probability of the product being added to the basket. The resulting coefficients can likewise be interpreted as the importance of those properties in the purchase decision. We then treated them much as we had treated the words in the reviews: weighted importance by frequency and picked the most important feature from each semantic group. As a result, for each product we obtained its most important characteristics from the standpoint of consumer behavior, but we did not have time to compare them with the results of the review analysis, let alone merge the two.
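A rough sketch of the view-to-basket model, assuming a pandas DataFrame with one row per product view, categorical catalog attributes, and a binary `added_to_basket` target (the column names and layout are hypothetical):

```python
# One-hot encode catalog attributes and fit a logistic regression per category;
# coefficients serve as a rough measure of how much each attribute matters
# for the add-to-basket decision. Column names are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def attribute_importance(df_category):
    """df_category: rows for one product category, catalog attributes
    plus a binary 'added_to_basket' target."""
    y = df_category["added_to_basket"]
    X = pd.get_dummies(df_category.drop(columns=["added_to_basket"]),
                       dummy_na=True)  # categorical attributes -> binary indicators
    model = LogisticRegression(max_iter=1000).fit(X, y)
    weights = pd.Series(model.coef_[0], index=X.columns)
    return weights.sort_values(ascending=False)  # most "purchase-driving" attributes first
```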
I would like to thank the organizers of the hackathon for an interesting task, high-quality data, and delicious food. All in all, it was a very good day and a half.
From the organizational point of view, the only thing the hackathon lacked was quality criteria, announced from the very beginning, by which the solutions would be compared. Maximizing an unknown function is not the most productive exercise. A clear statement of the problem before the hackathon started was also missing: the phrases "aggregation of reviews on the site" and "sales assistant for choosing goods in the store" could be interpreted however you liked.
In hindsight, it is clear that even under those conditions our team should not have pulled in different directions, like the swan, the crayfish, and the pike from Krylov's fable. We should have decided right away what the ideal product description we wanted to produce should look like. From the first hours of the hackathon we could have concentrated on a single direction (it did not much matter which one) and had a working prototype by the evening. A second day of joint debugging would then have turned it into a properly working product.
In any case, we came away with reusable code, a pile of merch, and, most importantly, experience. Solving the store's simple tasks took barely half of an introductory machine learning course. But, as usual, it was not knowledge of programming and mathematics that decided everything, but organizational skills.
Source: https://habr.com/ru/post/340694/