Below the cut is the story of how machine learning appeared at Dodo. Spoiler: I launched it. There will be no hardcore technical details here; I will devote a separate article to them. Today is more about motivation and the support of colleagues.

Preparation
I ran into machine learning three times before anything worthwhile came of it.
Russian school
I first encountered machine learning at HSE, where I was earning a second degree in Big Data Systems around the time I joined Dodo. I only touched on this hugely hyped topic in passing and did not understand what I had spent three years of my life on. I certainly was not thinking about how it could be useful in the company. Back then I was not ready for this challenge.
Czech voyage
The second time I encountered the topic was in Prague, at a closed Microsoft hackathon on machine learning. Together with people from other companies, we worked on forecasting demand at Dodo on holidays and peak days. I came back with a finished model that predicts demand. It was after this hackathon that I began to think I could apply what I had learned in the company. No such luck.
So you have a model in Jupyter, so what? How do you use it? Every attempt to explain it to the business ran into a harsh reality: it is already clear that there will be a lot of orders on holidays and peak days. Mature pizzerias can predict sales from last year's data, and new ones have enough problems without it. We put off the attempts to develop machine learning. But the idea that we could do more with our data was stuck too firmly in my head and refused to leave. Now I was ready for the challenge, but the company was not.
The American dream
The third encounter was the decisive one. Our team got a difficult but interesting task: to develop a custom pizza module for the USA, where a customer can order a pizza with any set of ingredients and create their own recipe. Everything in the project had to be worked out, from changes in the database architecture to the client code on the site. We dug into the task and built a product that I consider a real victory. The most important verdict arrived in Slack from Alena, our CEO in the United States.

We built the module, but I saw a scaling problem. What if the feature appears not in one or two pizzerias in the States but across a large chain? How do you manage such a product and plan stock? I decided that this case could prove the need to develop machine learning at Dodo. I felt that this time both the company and I were ready to launch a new direction.
One on one with the machines
In the background, I started analyzing sales of the American custom pizza. Using clustering algorithms, we were able to show that all user-created recipes are built on six basic sets of ingredients plus a couple of random ones. Even a simple report based on this algorithm would make it possible to forecast sales and plan stock in semi-manual mode. Thanks to the lack of bureaucracy and the ability to change course on the go, we got the green light to start working in this area.
The technical director and I understood, and discussed more than once, that I would need to leave my current team and take on the new direction to show that we needed it. I had to dive into a new field quickly. I understood that if it did not work out, there were two options. The first was to go back to development in another Dodo team. The second was to update my resume on HH and look for a new job. I wanted neither. I stayed in this state for about three months, until I latched onto the additional sales module.
First project
Another spoiler: it turned out that to get started with ML you do not need to take on something complicated. Obvious, isn't it? But it is very hard to see at the beginning.
The module that offers to add an extra product to the order was not directly owned by anyone, which meant I could do whatever I wanted with it. The cherry on top was the opportunity to increase sales through more personalized offers. Previously, the module worked simply: if the order contained a pizza, it showed the drinks category; if it contained a pizza and a drink, it showed desserts; and so on.
The fact that so many people cared showed once again that I work in a company where support can come from absolutely anyone. I spent hours working on data and additional offers with a colleague from marketing. We managed to cluster all users by taste preferences and loyalty, and to build static offers for each group based on the top products in its cluster.
Figures and proofs
I hooked up logging for additional products and launched the new offers on a sample of 2 million users.
But this sample of users accounts for only a small part of sales. We had to move toward customers who are not logged in and new customers. I dug through plenty of articles and literature on Collaborative Filtering and various recommendation algorithms. The idea that won was to recommend based on the products already in the basket. Item-Based Recommendations and the cosine similarity measure formed the basis of a new model: simple, but already working.
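To make the basket idea concrete, here is a minimal sketch of item-based recommendations with cosine similarity. It is not the production model: the order data, the product names, and the helper names `item_similarities` and `recommend` are all hypothetical, and each product is represented by an assumed binary vector over orders (1 if the order contains it).

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical orders: each order is the set of products it contained.
ORDERS = [
    {"pepperoni", "cola"},
    {"pepperoni", "cola", "cheesecake"},
    {"supreme", "juice"},
    {"pepperoni", "cheesecake"},
    {"supreme", "juice", "cheesecake"},
]


def item_similarities(orders):
    """Cosine similarity between products over binary order vectors."""
    item_count = defaultdict(int)  # number of orders containing each product
    pair_count = defaultdict(int)  # number of orders containing both products
    for order in orders:
        for item in order:
            item_count[item] += 1
        for a, b in combinations(sorted(order), 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), together in pair_count.items():
        # For 0/1 vectors the dot product is the co-occurrence count and
        # each norm is the square root of the product's order count.
        score = together / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(basket, sims, top_n=3):
    """Rank products outside the basket by summed similarity to its items."""
    scores = defaultdict(float)
    for item in basket:
        for other, score in sims.get(item, {}).items():
            if other not in basket:
                scores[other] += score
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


SIMS = item_similarities(ORDERS)
print(recommend({"pepperoni"}, SIMS))
```

On this toy data a basket with only pepperoni gets cola first and cheesecake second. Because the sketch needs only the current basket, it works the same way for customers who are not logged in.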
In December, we launched the Item-Based recommendations module. The statistics showed that buyers really can be interested in completely different products, not just drinks. Perhaps it was after this that Dodo came to believe that data and machine learning would help us compete in oversaturated markets in the future.
Some statistics.

10 best-selling products on the site

10 best-selling mobile app products

Weekly sales growth
Technical trailer
Below are some technical details on why the model is based on the cosine similarity measure. This is a preview of an article that will come out in a couple of months. If you don't like math, feel free to jump to the last section.
The source table below shows how many orders of each product every user has made. We can measure how similar one user's purchases are to another's by calculating the distance between the users' vectors.

Product sales table by customer
The distance depends on the chosen metric. The Euclidean distance takes the magnitude of the vectors into account:

$$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$

where a and b are two different customer vectors from the table and n is the number of products. Let's see how this distance behaves on an abstract example.
Suppose we look at the order history of three customers: a, b, and c. We build a matrix of their purchases.

By calculating the Euclidean distances between customers, we obtain the following values:
d(a, b) = 16.22;
d(b, c) = 13.38;
d(a, c) = 13.64.
These values suggest that customers b and c are closest to each other. But if you look at the source data, the picture is the opposite. Customers a and b mostly order Pepperoni and occasionally other products, while customer c prefers Supreme pizza. We can conclude that the magnitude of the vectors distorts the distance between customers. The cosine similarity measure takes into account only the angle between the vectors and discards their magnitude:

$$\cos(a, b) = \frac{a \cdot b}{\|a\| \, \|b\|} = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}$$

Calculating the similarity with this formula (here a larger value means the customers are closer), we get:
cos(a, b) = 0.9183;
cos(b, c) = 0.5848;
cos(a, c) = 0.7947.
We see that customers a and b are the closest pair. They prefer the same set of products, regardless of the difference in the number of orders they have made. This matches our expert judgment that the preferences of customers a and b are the most similar.
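The contrast between the two metrics can be sketched in a few lines of Python. The purchase counts below are hypothetical, chosen only to reproduce the same effect, so they do not yield the values quoted above; the function names are also illustrative.

```python
from itertools import combinations
from math import sqrt

# Hypothetical order counts per product: [pepperoni, supreme, cola].
# a is a frequent Pepperoni buyer, b an occasional one, c prefers Supreme.
CUSTOMERS = {
    "a": [20, 2, 4],
    "b": [4, 1, 1],
    "c": [1, 5, 1],
}


def euclidean_distance(u, v):
    """Straight-line distance; sensitive to how many orders were made."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


PAIRS = list(combinations(CUSTOMERS, 2))

# Smallest distance means closest; largest cosine means closest.
closest_by_distance = min(
    PAIRS, key=lambda p: euclidean_distance(CUSTOMERS[p[0]], CUSTOMERS[p[1]])
)
closest_by_cosine = max(
    PAIRS, key=lambda p: cosine_similarity(CUSTOMERS[p[0]], CUSTOMERS[p[1]])
)

for x, y in PAIRS:
    d = euclidean_distance(CUSTOMERS[x], CUSTOMERS[y])
    s = cosine_similarity(CUSTOMERS[x], CUSTOMERS[y])
    print(f"{x}-{y}: distance={d:.2f}, cosine={s:.4f}")
print("closest by distance:", closest_by_distance)
print("closest by cosine:", closest_by_cosine)
```

With these numbers the Euclidean distance pairs b with c simply because both order rarely, while the cosine measure pairs a with b because their orders have the same proportions.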
This is a trailer, details in two months.
Find yours
We are now forming a team that will include specialists in organizing data storage, developing machine learning models, and putting them into production. But most importantly, we now understand much better why we need all this. We are free to do really cool things, from building an intelligent logistics and inventory planning system to fantastic pizzeria automation ideas using Computer Vision.
Believe in yourself and your strength, even if the result is not yet visible on the horizon. I would like to end the article with someone else's thought, a quote from Max Weber's lecture to students at the University of Munich: “Nothing is gained by yearning and waiting alone; we must act differently: turn to our work and meet the ‘demands of the day,’ both as people and as professionals. And that demand is plain and simple if each of us finds and obeys the demon who holds the threads of his life.” Find yours.