
Yandex search will be able to adapt to your interests in a few seconds.

Starting today, Yandex search personalizes results based not only on the history of your interests but also on what you are doing in search right now. This is an important change for users, because what we expect to see in response depends on what we are doing at the moment. Yandex search is becoming adaptive. To make this possible, we had to build a new technology for delivering data in real time.



To understand what the user wants, you need context. Once upon a time, the only context available to Yandex was the text of the search query. Over time, we learned to take into account the region the query comes from.

An important step was personalizing results for different users: we began using what we know about a particular person to give them a more accurate answer. To do this, we used data computed from the history of the user's queries and clicks, both over a long period and over a recent one. We described this stage on Habr.

Each time we add user context, the quality of the system changes, and sometimes user behavior changes as well. For example, when we began taking into account the region a query comes from, people stopped manually typing the name of their city into the search box. There is no need to write [Novosibirsk refrigerator] if the search engine already shows you local offers, and the user saves a few seconds of typing.

Using the user's search history led to another leap in quality. Yandex began showing users their favorite sites more often and choosing topics closer to each person's interests.

But user interests are not static. More than half of all search interests live for less than a day. For example, a person may be a gamer in general, but at some point they will want to find where to watch a movie based on a game. Or they will want to find out what a certain film is and then never think about it again.

Or a person checks the weather forecast and decides to buy an air conditioner. At that moment, the search engine needs to adjust quickly to the new interest and respond to the queries and clicks the user has just made.

Yandex search readjusts in seconds

To take all these fleeting user actions into account, data about them has to be transferred and processed very quickly so that it can be used in ranking.

Realtime MapReduce


Thanks to our new real-time data delivery technology, in 95% of cases a user's actions become available to ranking within 7 seconds.

To achieve this, about 10 TB of logs per day are processed in real time, with a load of up to 200 MB per second. At the same time, the technology is fully compatible with our existing data processing technologies, including reduce operations.
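The two load figures can be checked against each other with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and treats "about 10 TB" as exactly 10, neither of which the post specifies. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def average_throughput_mb_per_s(terabytes_per_day: float) -> float:
    """Average rate in MB/s for a daily volume, assuming decimal units."""
    bytes_per_day = terabytes_per_day * 10**12  # assumption: 1 TB = 10^12 bytes
    return bytes_per_day / SECONDS_PER_DAY / 10**6

avg = average_throughput_mb_per_s(10)  # "about 10 TB" taken as exactly 10
peak = 200.0                           # peak load quoted in the post, MB/s
print(f"average: {avg:.1f} MB/s, peak/average: {peak / avg:.2f}")
```

Under these assumptions the daily volume corresponds to an average of roughly 116 MB per second, so the quoted peak is a bit less than twice the average load.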

This change has another interesting consequence. Search results are no longer static: the search becomes adaptive. The same query, issued after different queries, may return results in a different order and even a different set of top ten results.
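How the same query can produce a different order depending on what came before can be shown with a toy example. This is not Yandex's ranking formula, which the post does not describe: the documents, topics, base scores, and the `boost` parameter are all hypothetical, and the linear boost is just one simple way to mix recent activity into a score.

```python
from collections import Counter

# Hypothetical candidate results for one query: (title, topic, base_score).
DOCS = [
    ("Game walkthrough", "games", 0.9),
    ("Film adaptation: where to watch", "movies", 0.8),
    ("Game soundtrack", "music", 0.7),
]

def rerank(docs, recent_topics, boost=0.3):
    """Sort by base score plus a boost proportional to the topic's share
    of the user's recent actions (an illustrative rule, not a real one)."""
    counts = Counter(recent_topics)
    total = sum(counts.values()) or 1  # avoid division by zero with no history

    def score(doc):
        _, topic, base = doc
        return base + boost * counts[topic] / total

    return [title for title, _, _ in sorted(docs, key=score, reverse=True)]

# The same candidates, ranked after two different recent sessions.
after_games = rerank(DOCS, ["games", "games"])
after_movies = rerank(DOCS, ["movies", "movies"])
print(after_games)
print(after_movies)
```

With a session full of game-related actions the walkthrough stays on top, while a session of movie-related actions moves the film page to first place, even though the candidate set and base scores are identical.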

Previously, search personalization worked only for 30% of the most active users. Now every Yandex user has their own search, which adapts to them literally from the second query.

We will describe how we implemented the new data delivery technology in one of our upcoming posts.

Source: https://habr.com/ru/post/181514/

