Note: Below is a translation of Alex Iskold's article Rethinking Recommendation Engines (Iskold is known for his work on the attention economy and the theoretical foundations of social networking), in which the author reviews current recommender systems and suggests how they might be improved in the future.
More than two years ago, Netflix announced a recommendation engine competition: anyone who came up with an algorithm that improved the quality of its recommender system by at least 10% would win one million dollars. Many research groups set to work enthusiastically, inspired by the amount of data available for analysis. Progress came quickly at first, then slowed, and the contenders are now stuck at an improvement of roughly 8.5%.
In this post we will argue that improving a recommendation engine is not so much an algorithmic problem as a problem of presentation. Rethinking recommendations as filters, and applying them without staking everything on a perfect result, seems more likely to succeed than "crunching" the data faster.
Building a recommendation engine is not an easy task; we discussed this a year ago. Beyond the serious technical difficulties, there are psychological questions: do people even want recommendations, and if they do, how much do they trust them? Perhaps the more important question is this: what happens when users get one or more bad recommendations? How forgiving will they be?
Genetics of recommendation engines
All recommendation engines try to solve the same problem: out of everything available, pick for a given user the items he or she is most likely to enjoy, based on what that user and the rest of the user base have liked before. Many algorithms can be applied to this problem, and all of them draw on three kinds of signals: personal, social, and content-based.
- Personal recommendations are based on the individual's own past behavior.
- Social recommendations are based on the past behavior of users similar to this one.
- Content-based recommendations suggest items to the user based on the properties of the items themselves.
- In practice, systems combine all three approaches; a minimal sketch of such a blend follows below.
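To make the last point concrete, here is a minimal sketch, in Python, of how the three signals might be blended into a single score. The weights and the example values are assumptions for illustration, not something taken from the article.

```python
# A minimal sketch of blending the three signals into one score.
# The weights and the example scores are invented for illustration only.

def hybrid_score(personal, social, content, weights=(0.4, 0.4, 0.2)):
    """Weighted blend of a personal, a social and a content-based score (each 0..1)."""
    w_p, w_s, w_c = weights
    return w_p * personal + w_s * social + w_c * content

# Example: the user's own history likes this item a lot, similar users like it
# moderately, and its attributes are a weak match for the user's profile.
print(hybrid_score(personal=0.9, social=0.6, content=0.3))  # -> 0.66
```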

Social recommendations are also known as collaborative filtering: people who love X also love Y. For example, people who like Lord of the Rings are also likely to appreciate Eragon and The Chronicles of Narnia. The problem with this approach is that people's tastes do not fall into such simple buckets. If two people share a taste in fantasy films, it does not follow that they will also share a taste in drama or detective stories. Human genetics offers a good analogy: we often meet people who look strangely familiar; two people's eyes or lips may look very similar, and yet they are completely different people.
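To make "people who love X also love Y" concrete, here is a minimal sketch of item-to-item collaborative filtering over a tiny invented dataset; the users and their likes are made up for illustration.

```python
from collections import defaultdict

# Tiny invented dataset: which users liked which titles.
likes = {
    "alice": {"Lord of the Rings", "Eragon", "Chronicles of Narnia"},
    "bob":   {"Lord of the Rings", "Chronicles of Narnia"},
    "carol": {"Lord of the Rings", "Eragon"},
    "dave":  {"Se7en", "Memento"},
}

def also_liked(title, likes, top_n=3):
    """Titles most often liked together with `title` (item-to-item co-occurrence)."""
    counts = defaultdict(int)
    for titles in likes.values():
        if title in titles:
            for other in titles:
                if other != title:
                    counts[other] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

print(also_liked("Lord of the Rings", likes))
# -> [('Chronicles of Narnia', 2), ('Eragon', 2)]
```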
Another type is the content-based (item) recommendation. The best-known example is the Pandora music recommendation service. It works by scoring each piece of music along more than 400 characteristics, its "musical genes", and then automatically matching pieces by those characteristics. The approach is hard both to improve and to carry over to other domains. For films, for instance, you would have to rate each title along many dimensions, from the director, the cast and the overall plot to subtler things such as the score, the locations, the lighting, the camera work, and so on. It can be done, of course, but it is very laborious.
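Here is a minimal sketch of the content-based idea: describe each track by a handful of hand-scored attributes (a toy stand-in for Pandora's several hundred "musical genes"; the attributes and values are invented) and match tracks by cosine similarity.

```python
import math

# Each track is described by hand-scored attributes in 0..1 (invented values),
# a tiny stand-in for Pandora's several hundred "musical genes".
tracks = {
    "track_a": {"tempo": 0.8, "distorted_guitar": 0.9, "female_vocal": 0.1},
    "track_b": {"tempo": 0.7, "distorted_guitar": 0.8, "female_vocal": 0.2},
    "track_c": {"tempo": 0.2, "distorted_guitar": 0.0, "female_vocal": 0.9},
}

def cosine(u, v):
    """Cosine similarity between two sparse attribute vectors."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def most_similar(name, tracks):
    """The track whose attribute profile is closest to the given one."""
    return max((cosine(tracks[name], vec), other)
               for other, vec in tracks.items() if other != name)

print(most_similar("track_a", tracks))  # track_b scores highest
```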
Guy in the garage

Part of what makes good recommendations so hard is the sheer breadth of the space of possibilities. It is much like trying to determine which gene is responsible for which human trait: it is just as difficult to establish which bits of a piece of music or a video made us give it 5 stars. Reverse engineering human judgment is very hard. That is why one of the leading contenders, profiled in this article, takes a very interesting approach to improving the accuracy of his recommendations.
Gavin Potter from London, who competes under the nickname "Guy in the garage", relies on human inertia. The rating we give a film clearly depends on the ratings we gave the films we watched just before it. If you watch three movies in a row and rate each of them 4, and then watch a genuinely better movie, you will most likely give it 5. But if you had rated those three films only 1, the next film you would otherwise rate 5 may well get only a 4 from you.
If you doubt that such an approach can work, consider that it currently sits in 5th place and keeps improving while the other algorithms have stalled. Pairing better formulas with a dose of human psychology is a very good idea, and it is something we will return to a little further on.
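The article does not spell out Potter's actual formula, so the following is only an illustrative sketch of the general idea under assumed parameters: treat a new rating as anchored to the average of the user's most recent ratings and correct for that drift.

```python
# A minimal sketch of a rating-inertia correction. This is NOT Gavin Potter's
# actual method (the article does not give it); it only illustrates the idea.
# Assumption: a new rating drifts toward the user's recent ratings, so we pull
# it back toward the neutral midpoint of a 1..5 scale.

def debias(ratings, window=3, strength=0.5):
    """Return ratings with the drift caused by the last `window` ratings removed."""
    adjusted = []
    for i, r in enumerate(ratings):
        recent = ratings[max(0, i - window):i]
        if recent:
            recent_mean = sum(recent) / len(recent)
            r = r - strength * (recent_mean - 3.0)  # 3.0 = midpoint of the scale
        adjusted.append(r)
    return adjusted

print(debias([4, 4, 4, 5]))  # -> [4, 3.5, 3.5, 4.5]: the 5 followed a run of high ratings
print(debias([1, 1, 1, 4]))  # -> [1, 2.0, 2.0, 5.0]: the 4 followed a run of low ratings
```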
Replace recommendations with filters

How many times have you been in this situation: a friend recommends a movie or a restaurant, you go in expecting something good, and you come away disappointed? Quite a few! A recommendation inflates our expectations, which increases the chance of failure. In mathematical terms, this kind of failure is known as a false positive. Now think about what happens if, instead of recommending a movie, a friend tells you where you should not go and what you should not waste your time on.
What is the worst that can happen? Almost nothing: most likely you simply skip the movie. And even if you go anyway and end up liking it, there is no unpleasant aftertaste. This example shows the difference in how we react to a false negative and to a false positive. False positives frustrate us; false negatives largely do not. The idea behind rethinking recommendations is to use filters to exploit this asymmetry.
When Netflix recommends something, it almost certainly dooms itself to failure: sooner or later it will recommend a movie you don't like. What if, instead, it simply showed you the new releases and put a button next to them: "filter out the ones I probably won't like"? The algorithm would be the same, but the way we perceive it would change.
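Here is a minimal sketch of that change in presentation: the same set of (here invented) predicted ratings, used once to push likely hits and once only to hide likely misses.

```python
# Two presentations of the same hypothetical predictions (1..5 scale):
# "recommend" pushes the likely hits, "filter" only hides the likely misses.

predicted = {            # invented predicted ratings for one user
    "Movie A": 4.6,
    "Movie B": 3.1,
    "Movie C": 1.4,
    "Movie D": 2.9,
}

def recommend(predicted, top_n=2):
    # Classic presentation: claim these are the items you will like (risking false positives).
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

def filter_out_likely_misses(predicted, threshold=2.0):
    # Filter presentation: drop only what the model is confident you will dislike
    # (a false negative just means one title you never see).
    return [title for title, score in predicted.items() if score >= threshold]

print(recommend(predicted))                 # ['Movie A', 'Movie B']
print(filter_out_likely_misses(predicted))  # ['Movie A', 'Movie B', 'Movie D']
```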
Real-time filters
This idea becomes especially important and effective in the era of real-time news. We are learning to filter new information all the time; we do it every day in our RSS aggregators. We think about the world in terms of streams of news, where anything from the past quickly stops being relevant. We do not need recommendations, because we are already over-subscribed. What we need are noise filters: a simple algorithm that says, "Hey, this is definitely not for you," and hides it.
If machines can do this kind of active filtering for us, we can handle the remaining information far more efficiently on our own. We already drown in spam in our inboxes, so we would be glad to have a "filter this for me" button, and perhaps, eventually, to have the whole information stream filtered by default, without our involvement, so that we can get more out of life.
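In the same spirit, a minimal sketch of a noise filter for a stream of headlines; the muted terms and the sample feed are invented for illustration.

```python
# A minimal keyword-based noise filter for a feed of headlines.
# The muted terms and the sample feed are invented for illustration.

muted_terms = {"celebrity", "horoscope"}

feed = [
    "New release: distributed key-value store hits 1.0",
    "Celebrity spotted at coffee shop",
    "Your weekly horoscope",
    "Conference talk on recommender systems announced",
]

def is_noise(headline, muted_terms):
    """True if the headline contains any term the user has marked as noise."""
    words = headline.lower().split()
    return any(term in words for term in muted_terms)

for headline in feed:
    if not is_noise(headline, muted_terms):
        print(headline)
# Prints only the first and last headlines.
```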
Conclusion
Building the perfect recommendation engine is a daunting task. Whatever algorithm you choose, whether it relies on social filtering or on the intrinsic properties of things, recommendations are a risky business: one false positive and the user may well abandon the service altogether. Perhaps looking at the problem through the lens of psychology can help people appreciate what the algorithm actually does. If, instead of recommending, machines simply filtered out the things we are unlikely to like, we might be far more tolerant and forgiving.
Now it is your turn: tell us about your experience with recommender systems. Do they really work as well as advertised? And beyond movies and news, where else would filters like these be useful?
Thanks to everyone who read this translation. Please share your thoughts and comments on the issues raised.