
How data sharing affects the quality of recommendations

Hi, Habr!

When a new client connects to the platform, we pay special attention to verifying the integration, and we keep monitoring its status afterwards. Why is this critical? Because data collection is the foundation of high-quality recommendations.


A recommendation system rests on several important components: data collection, data storage, processing, delivery of recommendations, and growth hacking. Add to that the hardware that provides computing power for the algorithms, and the front-end layout work. That gives at least seven items on which the quality of recommendations depends, not to mention an expensive team of analysts. Whether it is an external service or an online store's in-house recommendation system, it has to cover all of these items and ensure quality at every stage.

Algorithms are often seen as a black box, and there is a perception that open source software can quickly deliver 50-70% of a recommender system's effectiveness. But software is only one of the seven items, so even if it reaches 50-70%, it does so for that one item only. That is, without the other components it amounts to roughly 7-10% of the system's overall effectiveness (0.5/7 ≈ 7%, 0.7/7 = 10%).

One way or another, we talk to almost every technology company on the Russian market and in the other countries where we operate, and we can say with confidence that our approach to ensuring the quality of data exchange is one of the most thorough in the world.

Over years of practice we have built up deep expertise in every one of these items. We often talk about the results of our algorithms and the successes our Growth Hacking team achieves for stores, but today we want to focus on the very first item: data collection.

Any problem at the collection stage directly affects all later stages of building recommendations and can wipe out almost all the value the online store could have received. That is why we place such high demands on the quality and completeness of the data.

In our practice we have seen many cases where incorrect or incomplete data is transmitted because of internal processes, technical failures, or simple inattention, and this hurts the quality of the service. For example, a product ID or category changes several times a week, products are duplicated, or not all of them are transmitted. Such factors strongly affect the accuracy of recommendations.

We offer our customers not only our technology but also our expertise, including quality control of data exchange. So below we describe several important parameters to take into account when transferring data to a recommender system.

Retail Rocket Integration


Integration with the Retail Rocket system is done by installing tracking codes: scripts are placed on different pages of the site according to certain rules. Installing the tracking codes itself takes a couple of hours. But because of the high demands on data quality and completeness, the process can take longer. For example, an online store may prepare a YML file according to the standard requirements of other services, and the file then lacks details that are important for building recommendations. Since data collection matters so much for all later stages, we work through each integration point carefully.

Immediately after the tracking codes are installed, our experts check that the scripts work correctly on all pages and that the transmitted data is accurate and complete. Sometimes codes are not placed on every page, not all events are tracked (additions to the cart, orders), or the feed does not contain all products, their properties, and other important parameters. For our team this is a full process in which specialists, from account managers to technical support, check tracker installation and data transfer at every stage.
Here is part of the kanban board that every client's integration passes through:



Integration Verification Parameters


During integration we check dozens of parameters. Here are the most important ones.

Completeness of the product base


The correctness and completeness of the transmitted products and categories, and their match with the products and categories on the website, is one of the most important parameters for generating recommendations.
The product catalog transmitted in the XML feed must exactly match the site's menu structure. To check this, the manager picks a number of random products from different categories and makes sure each one sits in the same nested categories in the feed as on the online store's website.

Besides the structure, the number of products and categories in the file must match the number on the site. Special reports let us track whether all products listed on the site are present in the YML file. The list of products missing from the feed is generated on a separate page, and the account manager can send it to the client right away.
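A check of this kind can be sketched in a few lines of Python. This is only an illustration, not the report used on the platform: it assumes the feed follows the usual YML layout with `<offer id="...">` elements, and the function names, the sample feed, and the set of site IDs are all made up for the example.

```python
import xml.etree.ElementTree as ET


def feed_offer_ids(yml_text: str) -> set[str]:
    """Collect offer ids from a YML-style feed (<offer id="..."> elements)."""
    root = ET.fromstring(yml_text)
    return {offer.get("id") for offer in root.iter("offer") if offer.get("id")}


def missing_from_feed(site_ids: set[str], yml_text: str) -> list[str]:
    """Return ids that exist on the site but are absent from the feed, sorted."""
    return sorted(site_ids - feed_offer_ids(yml_text))


# Hypothetical sample data: three products on the site, two in the feed.
SAMPLE_FEED = """<yml_catalog><shop><offers>
  <offer id="101" available="true"><categoryId>1</categoryId></offer>
  <offer id="102" available="false"><categoryId>2</categoryId></offer>
</offers></shop></yml_catalog>"""

missing = missing_from_feed({"101", "102", "103"}, SAMPLE_FEED)
print(missing)  # ['103']
```

In a real setup the set of site IDs would come from the tracker data or a crawl of the catalog rather than a hard-coded literal.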



It is also important not to remove a product from the feed when it goes out of stock. Contextual ads, search results, and other resources may still link to out-of-stock products. A user who lands on the page of an unavailable product already has formed demand, i.e. is ready to buy, and can be shown the most similar alternatives. So it is important not only to keep out-of-stock products in the feed, but also to transmit as many of their parameters as possible so that the recommender algorithms can build a list of alternative products.

Accounting for regional parameters


For online stores with a presence in several cities, it is important to transmit data for each region. Prices and availability of the same product may differ between regions, so this data needs to be taken into account and included in the feed.

This can also matter for marketing tasks: for example, in some cities customers care more about discounts, and in others about bonus points. The region parameter lets you optimize marketing campaigns accordingly.

Transfer of product properties


Another important kind of data to transmit for high-quality recommendations is product properties such as color, size, and brand.

Two points are worth noting here. First, in some industries certain product properties are crucial for correct recommendations. For a shoe store, that property is size: a recommended pair, however similar, is unlikely to interest the buyer if it comes in size 40 instead of size 36. To address this, in October 2017 we improved the personalization mechanisms for fashion-segment stores so that recommendation blocks take the user's clothing or shoe size into account.

Second, if a store treats each size and color as a separate product, other sizes and colors of the same product may show up in similar-product recommendations, since to the algorithm they look as similar as possible. For example, the alternatives offered for a ring end up being exactly the same ring in other sizes. In the Retail Rocket platform this is solved with group products.
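The idea behind grouping can be illustrated with a small Python sketch. It is not the platform's implementation: the `group_of` mapping from product ID to group ID, the function name, and the sample IDs are hypothetical, and the sketch only shows how variants of the viewed product can be dropped from a list of candidates while keeping one item per group.

```python
def filter_variants(viewed_id, candidates, group_of):
    """Drop variants of the viewed product and keep one candidate per group.

    group_of maps product id -> group id; products without an entry are
    treated as a group of their own.
    """
    seen_groups = {group_of.get(viewed_id, viewed_id)}
    result = []
    for pid in candidates:
        group = group_of.get(pid, pid)
        if group in seen_groups:
            continue  # same product in another size/color, or group already shown
        seen_groups.add(group)
        result.append(pid)
    return result


# Hypothetical catalog: two rings sold in several sizes, one ungrouped item.
group_of = {
    "ring-16": "ring-A", "ring-17": "ring-A", "ring-18": "ring-A",
    "band-17": "band-B", "band-18": "band-B",
}
recs = filter_variants(
    "ring-17",
    ["ring-16", "ring-18", "band-17", "band-18", "signet-17"],
    group_of,
)
print(recs)  # ['band-17', 'signet-17']
```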

Changes to the product ID or its properties


If the ID or other properties of a product change too often, for example the product keeps moving from category to category, the quality of recommendations can drop significantly. If products were in one category when related-product recommendations were calculated and had already moved to another by the time the recommendations were requested, the output may be irrelevant, and the algorithm will need some time to rebuild the links between products and categories.

Quality can also drop because business rules break, for example rules that say products of one category must not be recommended alongside products of another. When product or category IDs change, such rules stop working and have to be rewritten before recommendations are built correctly again.
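One simple way to notice such changes early is to compare two snapshots of the feed. The Python sketch below assumes each snapshot has already been reduced to a `{product_id: category_id}` dictionary; the function name and the sample data are invented for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two {product_id: category_id} snapshots of a feed.

    Returns (moved, disappeared, appeared):
      moved        - {id: (old_category, new_category)} for ids in both snapshots
      disappeared  - ids present only in the previous snapshot
      appeared     - ids present only in the current snapshot
    """
    moved = {
        pid: (previous[pid], current[pid])
        for pid in previous.keys() & current.keys()
        if previous[pid] != current[pid]
    }
    disappeared = sorted(previous.keys() - current.keys())
    appeared = sorted(current.keys() - previous.keys())
    return moved, disappeared, appeared


# Hypothetical snapshots taken a few days apart.
monday = {"101": "shoes", "102": "shoes", "103": "bags"}
friday = {"101": "shoes", "102": "sale", "204": "bags"}

moved, disappeared, appeared = diff_snapshots(monday, friday)
print(moved)        # {'102': ('shoes', 'sale')}
print(disappeared)  # ['103']
print(appeared)     # ['204']
```

Note that a changed product ID cannot be told apart from a removal plus an addition in this sketch: it shows up as one entry in `disappeared` and one in `appeared`, which is exactly the kind of pair worth checking by hand.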

Reconciling data with analytics systems


Another important indicator of data completeness and correct transfer is how closely the numbers in our platform's sales report match those in the client's web analytics system (for example, Google Analytics). We compare the number of orders and the revenue; the difference should be minimal.

This tells us whether all tracking codes work correctly and whether all data reaches the system accurately. If there are discrepancies, we start digging for the cause. For example, the tracking code may be missing from some page, which makes the numbers diverge. In other words, it is a kind of marker that shows right away when something is wrong.
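As a rough illustration, the comparison can be expressed as a relative gap per metric. Everything in this Python sketch is an assumption made for the example: the 5% tolerance, the function names, and the figures are not taken from the platform.

```python
def relative_gap(platform_value, analytics_value):
    """Relative difference, measured against the analytics system's figure."""
    if analytics_value == 0:
        return 0.0 if platform_value == 0 else float("inf")
    return abs(platform_value - analytics_value) / analytics_value


def reconcile(platform, analytics, tolerance=0.05):
    """Return the metrics whose relative gap exceeds the tolerance."""
    flagged = {}
    for metric in ("orders", "revenue"):
        gap = relative_gap(platform[metric], analytics[metric])
        if gap > tolerance:
            flagged[metric] = round(gap, 4)
    return flagged


# Illustrative monthly totals, not real data.
platform = {"orders": 940, "revenue": 1_880_000}
analytics = {"orders": 1000, "revenue": 1_900_000}

flagged = reconcile(platform, analytics)
print(flagged)  # {'orders': 0.06}
```

Here revenue is within tolerance while the order count is off by 6%, which would be a reason to look for pages where the order-tracking code is missing.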

Synchronizing the website and email


This item is important for several reasons. First, this synchronization shows us how effectively trigger emails will work in a particular online store. For example, whether an abandoned-view email can be sent to a user whose email address is known to the store.

We also track trigger emails every day. If the chart shows that fewer emails were sent today than the average for a certain period, we check what happened.

Second, to learn as much as possible about a user, it is important to link their website profile with their email address. Our trackers capture addresses on all pages where a user can agree to receive mailings (registration, login, checkout, subscription forms, etc.) so that we collect the maximum number of customer emails. This increases coverage, the number of emails sent and, as a result, the number of orders.

In addition, when a user opens an email and follows a link, we can track their behavior on the site and their history of views and purchases, and so make more accurate recommendations both on the site and in email campaigns.
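The daily check on trigger emails mentioned above can be approximated with a very small rule: compare today's count with the average over a recent window. In this Python sketch the 7-day window, the 0.7 threshold, and the numbers are assumptions chosen only to illustrate the idea.

```python
from statistics import mean


def sends_dropped(history, today, drop_ratio=0.7):
    """True if today's count is below drop_ratio times the average of history."""
    if not history:
        return False  # nothing to compare against yet
    return today < drop_ratio * mean(history)


# Illustrative daily counts of sent trigger emails for the previous week.
last_week = [1200, 1150, 1300, 1250, 1180, 1220, 1270]

print(sends_dropped(last_week, 600))   # True
print(sends_dropped(last_week, 1190))  # False
```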

Real-time integration status tracking


The main value of a personalization platform lies in its algorithms, but for them to work at full capacity it is necessary, among other things, to constantly monitor that the integration is correct on every page of the site.

We often write that one of our distinguishing features is that our work does not end at the installation stage: we try to keep improving all the metrics. This covers not only A/B testing of various algorithms, but also constant tracking of whether our recommendations work correctly.

We have developed a special interface that lets our specialists monitor the integration status in real time and respond promptly to any problems.



For each possible situation, a report describing the necessary actions is generated automatically and can be sent to a representative of the online store.

For example, we have developed a separate subsystem that downloads images from the online store's site, resizes them, and stores them on our own CDN. In the integration status we monitor whether any images have problems and see the percentage of such images, and we can immediately send the client a report with tips for fixing the situation, generated automatically by the system.



If there are technical failures on the client's side, storing images on our CDN keeps recommendations displaying on the site and in emails.
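The image metric described in this section boils down to a share of failed items plus a list to hand to the client. The following Python sketch assumes a hypothetical `{product_id: image_ok}` dictionary produced by whatever process fetches the images; the function name and data are illustrative only.

```python
def image_report(statuses):
    """Summarize image processing results.

    statuses maps product id -> True if the image was fetched and resized,
    False otherwise. Returns (percentage of broken images, sorted broken ids).
    """
    if not statuses:
        return 0.0, []
    broken = sorted(pid for pid, ok in statuses.items() if not ok)
    return 100.0 * len(broken) / len(statuses), broken


# Hypothetical results for four products.
statuses = {"101": True, "102": False, "103": True, "104": True}

share, broken_ids = image_report(statuses)
print(f"{share:.1f}% of images have problems: {broken_ids}")
```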

Conclusion


No matter how good the external service is and how powerful and clever its algorithms are, without effective data exchange the results may fall short of what the online store expects. Likewise, the integration itself may take longer if tracking codes are not installed everywhere or the YML file lacks necessary data.

The quality of data exchange directly affects the effectiveness of any external service, but not every service pays enough attention to checking such details. We strive not only to make our algorithms effective and to keep improving them, but also to monitor that the integration with the client's site stays correct.

Source: https://habr.com/ru/post/354098/

