New Yandex Search Platform with Personal Results: “Kaliningrad”
Today we are announcing important changes to Yandex search. From now on, search results and search suggestions are personalized and may differ for each user who submits a query to Yandex.
Especially for Habrahabr, we interviewed the people who worked on this project and asked them what it is for, how it works, which factors it takes into account, and how we measure its benefits.
Once upon a time, a search engine needed only the user's query and its own index to show search results. These two entities are easy to picture. Over time, however, it became clear that there is another very important thing: the context of the query, that is, who is asking, from where, and when.
The simplest example of a query where context matters is pizza delivery. People in Moscow and in Volgograd should see links to the delivery companies that operate in their own city. And in cities where such a service does not yet exist, the query “pizza” should probably return a recipe instead. Since Yandex search began taking the user's location into account, queries that explicitly name a region have become 30% less frequent.
Last summer, we launched the Reykjavik search platform. It learned users' language preferences, in particular how often a person opens English-language search results. People who look for English-language resources more often began to see more links to such resources, and vice versa.
Now we are introducing our next search platform, “Kaliningrad”, which gives users personal search suggestions and personal search results.
Let us look at each of these parts in more detail.
Personalized search suggestions
Search, as a tool for navigating the Internet, should help a person stay on course and find the shortest route to the goal. Search suggestions have been solving this task for several years: they help formulate a query correctly and, most importantly, quickly. We have previously described how they learned to take into account a large number of factors, many of which also depend on context. One of the most important launches on the way to personalized suggestions was taking the user's previous query into account. That is, if a person has just searched for [Titanic], then after typing the letter “K” in the search box, the first suggestions they see will be [Kate Winslet] and [how they shot the Titanic] rather than [contact] and [subway map].
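To make the idea of using the previous query concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how Yandex actually ranks suggestions: the `POPULARITY` and `FOLLOW_UPS` tables, the `rank_suggestions` function and the additive `boost` are all hypothetical, and the candidates are assumed to already match what the user has typed.

```python
from typing import Dict, List, Optional, Set

# Hypothetical global popularity of candidate suggestions (higher = more popular).
POPULARITY: Dict[str, float] = {
    "contact": 0.9,
    "subway map": 0.8,
    "kate winslet": 0.3,
    "how they shot the titanic": 0.2,
}

# Hypothetical table of queries that often follow a given previous query.
FOLLOW_UPS: Dict[str, Set[str]] = {
    "titanic": {"kate winslet", "how they shot the titanic"},
}


def rank_suggestions(
    candidates: List[str],
    previous_query: Optional[str] = None,
    boost: float = 1.0,
) -> List[str]:
    """Order candidates by popularity, boosting follow-ups of the previous query."""
    related = FOLLOW_UPS.get(previous_query, set()) if previous_query else set()

    def score(suggestion: str) -> float:
        bonus = boost if suggestion in related else 0.0
        return POPULARITY.get(suggestion, 0.0) + bonus

    return sorted(candidates, key=score, reverse=True)


# Assume every candidate already matches the prefix the user has typed.
candidates = list(POPULARITY)
print(rank_suggestions(candidates))             # no context: popularity order
print(rank_suggestions(candidates, "titanic"))  # context: Titanic follow-ups first
```

Without context the popular suggestions win; with [Titanic] as the previous query, the two related suggestions move to the top, which mirrors the behavior described above.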
Half of all queries that people send to Yandex are related to the query that preceded them. We have learned to extract from user behavior the data that helps predict what users will do next. Search suggestions now take into account the history of your interaction with Yandex: which queries you made, which links you followed, and how these actions were distributed over time.
At the same time, keep in mind that to Yandex search a user looks like this:
That is, we do not store the data listed above in explicit form. After it is processed, a relatively small set of numbers is produced for each person. Each number characterizes a specific topic that the user is interested in, and also reflects how important that topic is to them right now.
Making search suggestions more personal did not work right away. At first it seemed that we could simply isolate clusters of users, for example with the k-means method. But such mechanical methods turned out not to work very well, so we went a different way and identified semantic topics instead. It turned out that there should be at least 400,000 of them. The breadth of human interests surprised us as well.
In the course of this work, we realized how quickly such interests can become outdated. Even if a person is generally interested in programming in functional languages, right now they may be preoccupied with renovating their apartment. It was important for us to understand that a person may think of one thing as their interest while in reality something else matters to them at the moment. For development, this meant that data delivery and processing had to be organized so that the data for each particular user never goes stale.
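One way to picture a per-user set of topic numbers that reflects current importance is a profile whose weights decay over time. The sketch below is only an assumption made for illustration: the article does not say how the numbers are computed, and the `InterestProfile` class, the exponential decay and the one-week `HALF_LIFE_DAYS` are all hypothetical.

```python
from typing import Dict

# Assumed: an interest that is not refreshed loses half its weight in a week.
HALF_LIFE_DAYS = 7.0


class InterestProfile:
    """A small set of numbers per user: one decaying weight per topic."""

    def __init__(self) -> None:
        self.weights: Dict[str, float] = {}
        self.last_day = 0.0

    def observe(self, topic: str, day: float, strength: float = 1.0) -> None:
        """Age all existing weights up to `day`, then reinforce `topic`."""
        elapsed = max(0.0, day - self.last_day)
        factor = 0.5 ** (elapsed / HALF_LIFE_DAYS)
        self.weights = {t: w * factor for t, w in self.weights.items()}
        self.weights[topic] = self.weights.get(topic, 0.0) + strength
        self.last_day = max(self.last_day, day)

    def top_topic(self) -> str:
        """Return the topic with the largest current weight."""
        return max(self.weights, key=lambda t: self.weights[t])


profile = InterestProfile()
for day in range(5):            # a long-standing interest, seen on days 0..4
    profile.observe("functional programming", day)
for day in (30, 31, 32):        # a recent burst of activity a month later
    profile.observe("apartment renovation", day)
print(profile.top_topic())
```

In this toy history the older interest was observed more often, yet the recent one ends up with the larger weight, which is the effect the paragraph above describes.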
To find out whether we had achieved our goals and whether the suggestions could truly be called personal, we used two methods. First, we checked that everything works on historical data. We have a record of actions that users performed in the past, and from them we tried to predict the next action. We compared the character at which the system recognizes the query without personalization and with it. Second, we enabled the new version first for 5% and then for 10% of users, and compared how they interact with the suggestions against a control sample of the same size that still had the old version of the suggestions. As you can imagine, 5-10% of Yandex users is millions of people. The experiment showed that we could turn the new system on for everyone: users liked it.
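The check on historical data can be sketched as follows: replay logged pairs of (previous query, next query) and count how many characters of the next query must be typed before it shows up among the top suggestions, with and without personalization. Everything here is a hypothetical illustration; the tiny `POPULARITY` and `FOLLOW_UPS` tables, the `SESSIONS` log, the `top_k` cutoff and the function names are assumptions, not Yandex data or code.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical popularity of queries and follow-up relations between them.
POPULARITY: Dict[str, float] = {
    "kitchen": 0.9, "kayak": 0.8, "karaoke": 0.7, "kate winslet": 0.3,
}
FOLLOW_UPS: Dict[str, Set[str]] = {"titanic": {"kate winslet"}}

Suggester = Callable[[str, Optional[str]], List[str]]


def make_suggester(personalized: bool) -> Suggester:
    """Build a toy suggester; the personalized one boosts follow-up queries."""
    def suggest(prefix: str, previous_query: Optional[str]) -> List[str]:
        related = FOLLOW_UPS.get(previous_query or "", set()) if personalized else set()
        matches = [q for q in POPULARITY if q.startswith(prefix)]
        return sorted(
            matches,
            key=lambda q: POPULARITY[q] + (1.0 if q in related else 0.0),
            reverse=True,
        )
    return suggest


def chars_until_recognized(
    target: str, previous_query: Optional[str], suggest: Suggester, top_k: int = 2
) -> int:
    """Number of typed characters after which `target` is in the top_k suggestions."""
    for typed in range(1, len(target) + 1):
        if target in suggest(target[:typed], previous_query)[:top_k]:
            return typed
    return len(target)


def mean_chars(sessions: List[Tuple[Optional[str], str]], suggest: Suggester) -> float:
    """Average the metric over logged (previous query, next query) pairs."""
    total = sum(chars_until_recognized(t, p, suggest) for p, t in sessions)
    return total / len(sessions)


# Hypothetical historical log: (previous query, query the user typed next).
SESSIONS = [("titanic", "kate winslet"), (None, "kayak")]
baseline = mean_chars(SESSIONS, make_suggester(personalized=False))
personal = mean_chars(SESSIONS, make_suggester(personalized=True))
print(baseline, personal)
```

On this toy log the average drops from 2.0 to 1.0 typed characters. A lower value means the system recognizes the query earlier, which is the kind of comparison the offline check is after.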
Personalized Search Results
The second part of the changes we are announcing today is personal search results. Previously, as we have already said, search results could differ depending on the city or place a person was in; now everyone has a chance to get results tailored to them personally.
In effect, there are now as many versions of the results as there are Yandex search users. They take into account what we know about a person: their interests, the sites they prefer, and much more.
In practice, this means that, for example, the answer to the query [Northern Lights] will differ from person to person. A traveler will see an answer about the natural phenomenon, a Muscovite interested in shopping will see the shopping center, and a movie lover will see links to information about the film.
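The [Northern Lights] example can be illustrated with a small re-ranking sketch: each result carries a base relevance score and a set of topic weights, and the user's interest profile adds a personal bonus. This is a simplified assumption for illustration only; the `Result` type, the scores, the topic names and the linear combination in `rerank` are hypothetical and say nothing about the actual ranking formulas.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Result:
    title: str
    base_score: float          # hypothetical non-personalized relevance
    topics: Dict[str, float]   # hypothetical topic weights of the page


RESULTS = [
    Result("Northern lights: the natural phenomenon", 0.5, {"travel": 0.6, "nature": 0.8}),
    Result("Northern Lights shopping center", 0.4, {"shopping": 0.9}),
    Result("Northern Lights: the film", 0.3, {"movies": 0.9}),
]


def affinity(profile: Dict[str, float], topics: Dict[str, float]) -> float:
    """Dot product between a user's topic weights and a page's topic weights."""
    return sum(weight * topics.get(topic, 0.0) for topic, weight in profile.items())


def rerank(
    results: List[Result], profile: Dict[str, float], personal_weight: float = 1.0
) -> List[Result]:
    """Sort results by base score plus a weighted personal affinity bonus."""
    return sorted(
        results,
        key=lambda r: r.base_score + personal_weight * affinity(profile, r.topics),
        reverse=True,
    )


for name, profile in [
    ("traveler", {"travel": 1.0}),
    ("shopper", {"shopping": 1.0}),
    ("movie lover", {"movies": 1.0}),
]:
    print(name, "->", rerank(RESULTS, profile)[0].title)
```

With an empty profile the order is decided by the base scores alone, so the same query yields a different first result only when the user's interests differ.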
Personalization improves the answers for 75-80% of each user's queries. We measured in detail how personalization improves search. For example, people click on a personalized first result 37% more often than on a non-personalized one. To achieve this, we ran experiments with more than 10 different ranking formulas and tuning mechanisms, and more than 50 million users saw experimental results during this time.
According to our estimates, personalization as a whole saves each person who uses Yandex 14% of their time by getting them the answer they came for sooner.