Blender technology. How Yandex cleverly mixes different types of answers

Today we will talk about our technology called Blender. It ranks blocks of vertical search results and embeds them into the Yandex search results page.

It is worth starting with why we use vertical searches at all. In some cases a vertical search is much more effective than standard web search, for example when the user needs information of a particular type (pictures, video). Some queries imply different ranking criteria: when searching for products it is important to be able to rank by price, and when searching for people additional filters have to be taken into account. Vertical searches can also offer completely different ways of interacting with the user, such as navigating through results marked on a map when searching for the nearest store, cinema or gas station.

Of course, for such queries it is best to go to Yandex's specialized search services: Pictures, Videos, Maps, Music. However, this requires extra steps from the user: first entering the service's address or selecting a specific search engine. Typing the query into the browser's omnibox and searching with the default engine is the simplest and most common scenario. Our task is to return relevant results and embed verticals where they are needed.
But determining how well a given vertical matches a specific query is not so easy. After all, the query does not always fully reflect the user's need. Simply put, we cannot know exactly what the user had in mind when typing the query. For example, if the query is the name of a bank, it is hard to say right away what the user wanted to see: a link to the official website or the location of the nearest branch. Sometimes the subject of the query is even more ambiguous. Suppose the user typed the query [Harry Potter]. They could mean either the series of books or the series of films. Suppose also that one of these films is currently showing in cinemas and all the others have already been released on disc. It is impossible to tell immediately whether the user would like to read a book, buy a cinema ticket, buy a disc, watch a movie online or download a file. Our task is to determine which verticals correspond to the possible needs and embed them into the results page.

Determine the needs

To begin with, we run the query through all the verticals. Using heuristic algorithms, we discard the least probable needs for each particular vertical.

Then, using machine learning, we classify the query by how well it matches specific verticals. These categories let us predict the probability that a vertical corresponds to the query.
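The selection step can be illustrated with a minimal sketch. It assumes that a classifier has already produced a probability of need for each vertical; the function name, the cutoff value and the probabilities below are all hypothetical and are not taken from Blender itself.

```python
from typing import Dict, List, Tuple


def select_verticals(need_probability: Dict[str, float],
                     min_probability: float = 0.1) -> List[Tuple[str, float]]:
    """Keep verticals whose predicted need probability passes the cutoff,
    ordered from most to least likely. The cutoff is an illustrative value."""
    kept = [(name, p) for name, p in need_probability.items()
            if p >= min_probability]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up probabilities for the query [Harry Potter].
predicted = {"video": 0.45, "images": 0.20, "market": 0.25, "maps": 0.02}
candidates = select_verticals(predicted)
print(candidates)
```

In this made-up example the maps vertical falls below the cutoff and is dropped, while the remaining verticals go on to ranking.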

Now that we can estimate the probability of a need for each vertical, we have to rank the results. We have two kinds of verticals: fairly isolated ones with completely separate ranking algorithms (pictures, video), and small add-ons on top of web search that introduce new ranking criteria. If the need is defined by the type of content, isolated verticals with their own ranking help. But for some queries the second kind of vertical is most relevant: a search over web documents with additional criteria and new data taken into account. For example, we can order documents by freshness, by the price of goods, and so on. We can build such verticals directly on top of the main web index, having first collected the data on which the ranking will be based. For this we need to do the following:

determine the type of content;
identify its attributes;
track their values in the document using microformats;
write instructions for assessors on labeling and on judging the relevance of results for each vertical.
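As a rough illustration of collecting attribute values from page markup, here is a minimal sketch. It assumes the pages use HTML microdata with `itemprop` attributes, which is only one kind of semantic markup; the source does not say which formats Yandex actually reads. The class name, the sample pages and the site names are hypothetical.

```python
from html.parser import HTMLParser


class ItemPropParser(HTMLParser):
    """Collects values of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None  # itemprop name waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # e.g. <meta itemprop="price" content="9.99">
            self.values[prop] = attrs["content"]
        else:
            self._current = prop

    def handle_data(self, data):
        if self._current is not None and data.strip():
            self.values[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def extract(html):
    parser = ItemPropParser()
    parser.feed(html)
    parser.close()
    return parser.values


# Hypothetical product pages from two sites.
pages = {
    "shop-a.example": '<div itemscope><span itemprop="name">Harry Potter DVD</span>'
                      '<span itemprop="price">12.50</span></div>',
    "shop-b.example": '<div itemscope><span itemprop="name">Harry Potter DVD</span>'
                      '<meta itemprop="price" content="9.99"></div>',
}

offers = [(site, extract(html)) for site, html in pages.items()]
# An extra ranking criterion on top of the extracted data: cheapest first.
by_price = sorted(offers, key=lambda offer: float(offer[1]["price"]))
print([site for site, _ in by_price])
```

Once the attributes are extracted, any of them (price here) can serve as an additional ranking criterion for the vertical.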

If the site does not use microformats, we can apply our own extraction (mining) scheme. An operator can easily define the extraction scheme for each individual site by marking where the various characteristics of a product are located: name, price and description. Once they are marked, all these characteristics can be extracted automatically. This lets us significantly improve the quality of search.

Quality

To measure the quality of the results for a single need, we can use the pFound metric. Its value is an estimate of the probability that the user finds a relevant result in the ranked list. The formula for the metric is as follows:

$$\mathrm{pFound} = \sum_{i=1}^{n} \mathrm{pLook}[i] \cdot \mathrm{pRel}[i]$$

Where pLook[i] is the probability that the user views the i-th document in the list, and pRel[i] is the probability that the i-th document is relevant. In our model the pRel[i] values are calculated from the relevance judgments for the query. The probability of viewing a document is calculated using a cascade model. In our case, the user views the results from top to bottom, one by one. The user continues browsing if the previous result was unsuitable, and ends the search session with probability pBreak:

$$\mathrm{pLook}[1] = 1, \qquad \mathrm{pLook}[i] = \mathrm{pLook}[i-1] \cdot (1 - \mathrm{pRel}[i-1]) \cdot (1 - \mathrm{pBreak})$$
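The cascade model is easy to express in code. The sketch below assumes that the first result is always viewed and that each following result is viewed only if the previous one was not relevant and the user did not break off. The function name and the pBreak value used in the example are illustrative assumptions, not values stated in this article.

```python
from typing import Sequence


def p_found(p_rel: Sequence[float], p_break: float) -> float:
    """pFound for one ranked list under the cascade model.

    p_rel[i] is the probability that the i-th result is relevant;
    p_break is the probability of abandoning the page after a miss.
    """
    p_look = 1.0  # the first result is always viewed
    total = 0.0
    for rel in p_rel:
        total += p_look * rel
        # The next result is viewed only if this one was not relevant
        # and the user did not give up.
        p_look *= (1.0 - rel) * (1.0 - p_break)
    return total


# Illustrative values only.
print(p_found([0.0, 1.0], p_break=0.15))
print(p_found([0.5, 0.5], p_break=0.0))
```

Moving a relevant document from the second position to the first raises the score, because the user reaches it without any chance of breaking off first.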

The whole process can be pictured as follows. The user views the results page from top to bottom. At each result, with probability pRel the user finds what they were looking for; otherwise, with probability pBreak the user gets tired and leaves the results page.

Blender, on the other hand, embeds results for several verticals and needs into a single results page. Accordingly, the wide pFound metric works better here. It estimates the probability that the user will be satisfied with the response to an ambiguous query. Wide pFound is the sum, over all needs, of the probability of each need multiplied by the pFound for that need. In other words, it is the sum of the probabilities that the user sees a given result multiplied by the probability that this result is relevant.
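A minimal sketch of wide pFound follows. It assumes each need comes with its own probability and its own list of relevance probabilities for the same blended page; the helper names, the need probabilities and the pBreak value are hypothetical.

```python
from typing import List, Sequence, Tuple


def p_found(p_rel: Sequence[float], p_break: float) -> float:
    """Cascade-model pFound for one need (first result is always viewed)."""
    p_look, total = 1.0, 0.0
    for rel in p_rel:
        total += p_look * rel
        p_look *= (1.0 - rel) * (1.0 - p_break)
    return total


def wide_p_found(needs: List[Tuple[float, Sequence[float]]],
                 p_break: float) -> float:
    """Sum over needs of P(need) * pFound(page judged for that need)."""
    return sum(p_need * p_found(p_rel, p_break) for p_need, p_rel in needs)


# One blended page with two results, judged separately for two needs.
# Need A (probability 0.6) is satisfied only by the second result,
# need B (probability 0.4) only by the first one.
page = [
    (0.6, [0.0, 1.0]),
    (0.4, [1.0, 0.0]),
]
score = wide_p_found(page, p_break=0.15)
print(round(score, 4))  # 0.6 * 0.85 + 0.4 * 1.0 = 0.91
```

Swapping the two results would put the more probable need first and give 0.6 * 1.0 + 0.4 * 0.85 = 0.94, which shows how the metric rewards placing blocks for likelier needs higher.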

Verticals in the Islands

Our new search interface, Islands, offers more opportunities for interacting with verticals. For example, if a query matches certain vertical searches, you can switch between them using the icons in the left pane. This is how we implemented the idea of a relevant interface, which dynamically hides all elements that are not needed at the moment.

In addition to the links in the left pane, vertical searches are displayed as blocks on the main search results page. This is another way to reach the specific vertical search engines.

Source: https://habr.com/ru/post/197838/
