
Evolution of the list of recommendations in SmartProgress

Surfacing the most interesting user content is a common task for many projects, and we are no exception. In this article I want to describe how we solved this problem, from the launch of the project until today, using the example of the goal list in SmartProgress.




Stage 1: Moderation.

This is the simplest way to organize the selection of user content, so at the start of the project we resorted to it. Its main advantage is speed and ease of implementation: what could be simpler than going through the list of new content, marking the most interesting items, and then taking that mark into account in the sample? But the disadvantages of this solution are significant:
1) A person is required to keep track of new goals, and this person must constantly be paid a salary; otherwise, you spend your own time on this work.
2) The moderator selects goals subjectively, as he sees fit, and there is no guarantee that other users share his opinion.
3) Such selection produces a single list of goals shown to everyone, while users have different tastes and preferences: what interests one user may be completely indifferent to another.



Stage 2: Tracking user preferences.

To take user preferences into account, we needed a scheme for combining goals that was richer than a simple list of categories. For this, we introduced the concept of a goal group. A goal group is a list of all users' goals united by one specific topic, for example, "Promotion and development on the Internet" or "Moving abroad". We have already created more than a hundred such groups. You can read more about groups and how we built them in our articles:
Similar goals. A new search tool for like-minded people
ElasticSearch and search vice versa. Percolate API

Now most of our goals are in groups. Knowing which goals a user sets and which goals he subscribes to, we know his preferences: we take the list of groups those goals belong to and show him other goals from the same groups.
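As a minimal sketch of this idea (the data structures and names here are our own illustration, not the actual SmartProgress schema), selecting candidate goals from the groups the user's goals belong to could look like:

```python
# Sketch of group-based recommendations: recommend goals from the
# groups the user's own goals fall into. Names are illustrative.

def recommend_by_groups(user_goal_ids, goal_to_group, all_goals):
    """Return goals from the groups the user's goals belong to,
    excluding the goals the user already has."""
    # Groups the user is implicitly interested in.
    preferred_groups = {goal_to_group[g] for g in user_goal_ids if g in goal_to_group}
    # Other goals from those groups become recommendation candidates.
    return [g for g in all_goals
            if goal_to_group.get(g) in preferred_groups and g not in user_goal_ids]

# Example: the user has goal 1 (group "moving-abroad");
# goals 2 and 3 are in the same group, goal 4 is not.
goal_to_group = {1: "moving-abroad", 2: "moving-abroad", 3: "moving-abroad", 4: "fitness"}
print(recommend_by_groups({1}, goal_to_group, [1, 2, 3, 4]))  # [2, 3]
```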

As a result, the list of recommendations became more personalized, taking into account the preferences and interests of each user.

But while observing this recommendation system, we found several problems that remain relevant to it:
1) Uneven distribution of goals across groups. If a user subscribed to one goal in group A, which contains far more goals than other groups, then goals from group A will dominate his recommendations.
2) Stagnation of recommendations when the inflow of new goals for the relevant groups is insufficient. Since this scheme artificially limits the selection of goals to the user's preferences, it reduces the flow of new content, and the user constantly sees the same goals in his recommendations.



Stage 3: Do not show goals the user has already viewed.

To keep the user from seeing the same goals over and over, we decided to hide goals he has already viewed from the output. For this we added a `goals_ignore` table, which records the goals the user has opened as well as those he persistently ignores. That is, if a goal was shown to a user 10 times in the recommendation list and he never opened it, it means the goal does not interest him and no longer needs to be shown to him, so we put it into `goals_ignore`.
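This impression-tracking logic can be sketched as follows (the 10-view threshold comes from the article; the function and counter names are our own illustration):

```python
# Sketch of the goals_ignore logic: a goal opened once, or shown 10
# times without a single open, leaves the recommendation feed.
# Names other than `goals_ignore` are illustrative.

IGNORE_AFTER_VIEWS = 10

def record_impression(stats, goals_ignore, user_id, goal_id, opened):
    """Update per-(user, goal) view/click counters and move opened
    or persistently ignored goals into the goals_ignore set."""
    key = (user_id, goal_id)
    views, clicks = stats.get(key, (0, 0))
    views, clicks = views + 1, clicks + (1 if opened else 0)
    stats[key] = (views, clicks)
    if opened or (clicks == 0 and views >= IGNORE_AFTER_VIEWS):
        # Both viewed and stubbornly ignored goals are hidden from now on.
        goals_ignore.add(key)

stats, goals_ignore = {}, set()
for _ in range(10):                      # shown 10 times, never opened
    record_impression(stats, goals_ignore, user_id=7, goal_id=42, opened=False)
print((7, 42) in goals_ignore)  # True
```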

This made it possible to clear the recommendation list of goals the user has already viewed or does not want to view.

Stage 4: Probabilistic sampling of goals. Show the most interesting first.

For this, we introduced a CTR for each goal, calculated as the ratio of goal openings to its views in the recommendation list, and began sorting the recommendation list by this metric. This way the user sees the most interesting goals first. But with such a scheme, less interesting goals sink lower and lower, and the chance of reaching them constantly decreases. Therefore, we decided to introduce some randomness into the selection of goals, so that less interesting goals still have a chance to land on the first lines of the list and thereby correct their CTR in case it was erroneously underestimated.
To do this, we run a script that calculates a random coefficient for goals based on their CTR:

```sql
UPDATE goals SET order_random = ((list_click/list_view)*RAND())
```

We then sort the output by the `order_random` column.
Thanks to this, the recommendation list began to rotate, and goals that had long sunk to the bottom started appearing in the first positions. Besides improving the list for users, this step gave new impetus to old goals: users opened their forgotten goals, brought activity, and the authors returned to their goals again.
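The same randomized ordering can be sketched in Python (the `list_click` and `list_view` names mirror the SQL above; everything else is illustrative):

```python
# Sketch of CTR-weighted random ordering, mirroring the
# UPDATE ... (list_click/list_view)*RAND() query from the article.
import random

def randomized_order(goals):
    """Assign each goal order_random = CTR * random() and sort
    descending, so high-CTR goals usually, but not always, rank first."""
    for g in goals:
        ctr = g["list_click"] / g["list_view"] if g["list_view"] else 0.0
        g["order_random"] = ctr * random.random()
    return sorted(goals, key=lambda g: g["order_random"], reverse=True)

goals = [
    {"id": 1, "list_click": 50, "list_view": 100},  # CTR 0.5
    {"id": 2, "list_click": 5,  "list_view": 100},  # CTR 0.05
]
# Goal 1 usually ranks first, but goal 2 still gets a chance.
print([g["id"] for g in randomized_order(goals)])
```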

Stage 5. Tracking the degree of user preference.

If a user sets more goals in one group, or subscribes more often to goals from a specific group, then that group is more interesting to him than the others, and accordingly more goals from that group should appear in his recommendations.
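One way to implement this weighting (a sketch under our own assumptions, not the article's actual code) is to split recommendation slots across groups in proportion to how many of the user's goals and subscriptions fall into each group:

```python
# Sketch: allocate recommendation slots per group proportionally to
# the user's engagement with each group. Names are illustrative.
from collections import Counter

def group_quotas(user_groups, total_slots):
    """Split total_slots across groups in proportion to how often
    each group appears among the user's goals and subscriptions."""
    counts = Counter(user_groups)
    total = sum(counts.values())
    return {group: round(total_slots * n / total) for group, n in counts.items()}

# User has 3 goals/subscriptions in "sport" and 1 in "languages":
print(group_quotas(["sport", "sport", "sport", "languages"], 20))
# {'sport': 15, 'languages': 5}
```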

Thus, with the help of stages 3-5 we managed to solve the problems of the recommendation list described in stage 2, namely:
1) Uneven distribution of goals across groups
2) Stagnation of the goal list when the inflow of new ones is insufficient.

But a new problem emerged: we do not know the preferences of new users who have just registered, have not set a single goal, and have not subscribed to anyone else's.

Stage 6. If you do not know the user's preferences, ask him.

After registration we show users a site guide that helps them quickly get acquainted with the interface. We added a small questionnaire to it, asking the user to mark the categories that interest him most. Based on the user's answers we began building his recommendation list until he accumulated a certain number of subscriptions, after which we switch the user to our standard recommendation system.
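A sketch of this switch-over logic (the subscription threshold and all names are assumptions for illustration; the article does not give the exact number):

```python
# Sketch of the cold-start fallback: new users are served from their
# questionnaire answers until they accumulate enough subscriptions.
# The threshold value is an assumption, not from the article.
MIN_SUBSCRIPTIONS = 5

def recommendation_source(subscription_count, questionnaire_categories):
    """Pick which recommendation pipeline to use for this user."""
    if subscription_count < MIN_SUBSCRIPTIONS and questionnaire_categories:
        # Not enough signal yet: recommend from self-reported categories.
        return ("questionnaire", questionnaire_categories)
    # Enough subscriptions: hand over to the standard group-based system.
    return ("standard", None)

print(recommendation_source(0, ["sport", "finance"]))   # ('questionnaire', ['sport', 'finance'])
print(recommendation_source(12, ["sport", "finance"]))  # ('standard', None)
```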



As a result, the average view depth increased by 15%, from 3.9 to 4.5 pages, and the average session length by 36%, from 5.5 to 7.5 minutes.



In one of the following articles we will tell you how all of this manages to work quickly.

SmartProgress - achieving goals

Source: https://habr.com/ru/post/232279/

