Extended regularization of neural networks in online stores - with the help of ... napalm

Winking at grandfather Einstein, adjusting the napalm backpack and smoothing down a stylish black T-shirt printed with the formula of the normal distribution, the lead analyst threw open the doors of the PR department, flashed a brilliant smile and asked: "Guys, are you still collecting client e-mails in Excel files and doing your creative work by the walk-left-with-your-eyes-closed method?" Having received a cheerful “yep :-)”, the fighter mentally thanked John Napier for his work for the enlightenment of humanity and the reduction of routine labor and ... cheerfully pulled the trigger.

Albert Einstein has always inspired analysts to implement advanced algorithms.

Five minutes later the fuel in the backpack had run out and it was rather warm, if not hot, but the colleagues (?) noticed nothing and went on counting the likes under their social network posts.

“Now that is concentration,” the analyst thought, picked up the minigun and stepped into the customer service department, the one with the CRM. "Guys, do you segment customers by the number of New Year gifts you get from them, and does the word 'clustering' give you a headache right in the forehead?" Without waiting for an answer and whispering “Now I will teach you to compute churn rate and CLV in your head”, the analyst took a step forward and plunged into an atmosphere of magical thinking, superstition and steampunk.

Steampunk seems to be about applied science, but look closely and it is magic, pure magic!

Ahead lay the content management department, and of the tools for popularizing new efficient algorithms only the quantum annihilator and an old chainsaw with a fine BMW-like timbre remained. Choosing the proven combat friend, the analyst knocked quietly on the content department door, entered with a guilty look and asked: “Still shopping on amazon and aliexpress?” “Yep!” “Still stubbornly refusing to use personalization in the online store catalog, to believe in its effectiveness and to run AB tests?” “Well, of course, things sell anyway!” “May the force be with us” was the last thing heard from the forest orderly before the roar of the running motor drowned him out ...

Of the means of implementation there remained a secret drawer under the desk and ... the development department. It was for the developers that the secret weapon was intended, one more effective than the quantum annihilator that had seen a lot on the galactic frontiers ... The drawer held enough good beer for everyone with some to spare, and the cookies went down well while discussing with dear brothers in arms the subtleties of overclocking Keras in production. The day was a success!

“A data scientist is coming to you for an interview,” heard the subduer of multidimensional spaces, who had been snoring on the couch at lunchtime ... and woke up. What a strange dream, what was that about? Oh yes, we are introducing neural networks in the online store. To work!

“What a strange dream I had,” thought Alice and ran home so as not to be late for tea.

What am I getting at? :) The problem behind this post is simple: how to make an online store bring in more profit at low cost using the latest machine learning technology. And how to convince the store owner and employees to use it! (and the hand reaches for the weapon again)

Software is closer and more available than ever

On the one hand, the basic machine learning methods that can make an online store more profitable, bearded and mustached but still alive and kicking, are now more available than ever and ready for rapid prototyping and, in principle, for production.

Love extreme pleasures? Be my guest: run “pip install” and, while a cup of coffee is being poured, the analyst's laptop gets everything needed to spin up a neural network and train it on a small amount of data. And to get results a few percent better.

If you are sure the model will be useful and a lot of data is expected, you can spend a bit more time in a more verbose language and get a similar result with a better guarantee of fast production.

In general, the technology is now very accessible and you don't have to think too hard: take something ready-made with tensor and GPU support, put an analyst with a laptop who is ready to hack something together in Python to work, set up ssh proxying and roll out the service for clients.

Algorithms for an online store that are likely to increase its profits

Here, too, everything is rosy and accessible. Let's go through them one by one.

Personal recommendations

If the data is relatively small, take a ready-made “matrix compressor” (matrix factorization) on Spark and you're done. If there is quite a lot of data, you can shove it into memory and run classic collaborative filtering.
For something fresh and tasty, if there is a budget for a GPU cluster, you can try cooking with Amazon DSSTNE. Here the colleagues train a classic denoising autoencoder that “restores” the client's personal recommendations. The model is wide and shallow, but the matrices are multiplied in chunks across different cores / machines, and in C++, leaving TensorSLOW far behind.

Amazon DSSTNE turns the veeery long vector of a User's Products into a much shorter distributed representation (a set of Product categories / subcategories), which is then expanded back into a “full” vector of Product recommendations for the User
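
For a feel of what the classic in-memory collaborative filtering mentioned above looks like, here is a minimal sketch in plain Python. It is an item-based variant with cosine similarity over "who bought what", not DSSTNE and not what any particular product does; the purchase data and the function names `item_similarity` and `recommend` are made up for illustration.

```python
from math import sqrt

# Hypothetical purchase history: user -> set of product ids (illustrative data only).
purchases = {
    "alice": {"tea", "cup", "cookies"},
    "bob": {"tea", "cup"},
    "carol": {"cup", "cookies", "napkins"},
    "dave": {"beer", "chips"},
}

def item_similarity(purchases):
    """Cosine similarity between items, based on which users bought them."""
    buyers = {}
    for user, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(user)
    sims = {}
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            common = len(buyers[a] & buyers[b])
            if common:
                sims[(a, b)] = common / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def recommend(user, purchases, top_n=3):
    """Score items the user has not bought by similarity to items they have."""
    sims = item_similarity(purchases)
    owned = purchases[user]
    scores = {}
    for (a, b), s in sims.items():
        if a in owned and b not in owned:
            scores[b] = scores.get(b, 0.0) + s
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

print(recommend("bob", purchases))
```

The quadratic loop over item pairs is fine for a laptop experiment and hopeless for a real catalog, which is exactly where the Spark and GPU options come in.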

For content-based recommendations, you can start with a ready-made search engine and then look towards “semantic vectors” of words and phrases, starting with Word2Vec or GloVe.

Advanced search engine

Here the classic search algorithm can often be improved with semantic ranking. Just don't try to repeat Yandex's DSSM experiment head-on: look at the input dimension of their neural network and work out how much hardware, and what kind, you will need for that space flight.
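
To make "semantic ranking" less abstract, here is a toy sketch: candidates returned by an ordinary keyword search are re-sorted by cosine similarity between averaged word vectors of the query and of the product title. The three-dimensional vectors below are invented for illustration (real ones would come from a trained Word2Vec or GloVe model), and `embed`, `cosine` and `rerank` are hypothetical helper names. This is nothing like DSSM in scale or quality.

```python
from math import sqrt

# Toy 3-dimensional "semantic vectors"; the numbers are made up for illustration.
WORD_VECTORS = {
    "phone": [0.9, 0.1, 0.0],
    "smartphone": [0.85, 0.15, 0.05],
    "case": [0.4, 0.6, 0.0],
    "kettle": [0.0, 0.2, 0.9],
    "electric": [0.1, 0.3, 0.8],
}

def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return None
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(query, candidates):
    """Reorder search hits by semantic similarity to the query, best first."""
    q = embed(query)
    if q is None:
        # Nothing is known about the query: keep the search engine's order.
        return list(candidates)

    def score(title):
        t = embed(title)
        return cosine(q, t) if t is not None else 0.0

    return sorted(candidates, key=score, reverse=True)

print(rerank("phone", ["electric kettle", "smartphone case", "smartphone"]))
```

Note that the query "phone" pulls "smartphone" to the top even though the strings do not match exactly, which is the whole point of ranking by meaning rather than by keywords.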

Chatbot

Teaching a chatbot to answer questions takes ... a lot of work with head and hands. The easiest way is to build it on regular expressions and first-order predicates stored in MySQL. You can also make fill-in forms with parameters, in the style of booking plane tickets in the United States. And the popular question “how do I make a chatbot on a neural network” makes me want to laugh hysterically, tear pages out of a well-known little book by a famous professor, roll cigarettes out of them and blow figure-eight smoke rings turned on their side (the symbol of infinity) into the asker's face.
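
The "head and hands" approach can be sketched in a dozen lines. This is a minimal rule-based bot on regular expressions; the rules, the canned answers and the name `reply` are all hypothetical, and in the setup described above the rules would live in a database table rather than in a Python list.

```python
import re

# Hypothetical rules: (pattern, answer template). Captured groups fill the template.
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.I), "Hello! How can I help?"),
    (re.compile(r"\bwhere is my order (\d+)\b", re.I), "Checking the status of order {0}..."),
    (re.compile(r"\b(delivery|shipping)\b", re.I), "Delivery questions go to the delivery page."),
]
FALLBACK = "Sorry, I did not understand. Passing you to a human operator."

def reply(message):
    """Return the answer of the first rule whose pattern matches the message."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Extra captured groups are ignored by templates without placeholders.
            return template.format(*match.groups())
    return FALLBACK

print(reply("Where is my order 123?"))
```

The order of the rules matters, and most of the real work is in writing and maintaining them by hand, which is precisely the "a lot of work" part.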

Technical support automation

Here, too, the story is complete, with well-known implementations. There are two problems. First, you need a large corpus of dialogues in the store's language (and for Russian the situation with corpora is deplorable so far). Second, irrelevant answers will end up in Recall@5 when a new question arrives (yes, yes, that very generalization), because for now neural networks mostly just assist and only occasionally do the job better than a human (and getting even that means nearly killing yourself while training them).

Next Best Offer

Here you can track the internal cycles of the online store's Buyer and offer something to buy at key points. For example, guide each customer through the loyalty cycle. There are plenty of algorithms, from bearded Markov models to modern and effective recurrent cells.
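
The bearded end of that spectrum fits on one screen. Below is a sketch of a first-order Markov model: count which purchase tends to follow which, then offer the most frequent follow-up. The purchase sequences and the names `fit_transitions` and `next_best_offer` are invented for illustration; a real model would also care about timing, seasonality and much more.

```python
from collections import Counter, defaultdict

def fit_transitions(sessions):
    """Count how often each event is directly followed by each other event."""
    counts = defaultdict(Counter)
    for events in sessions:
        for current, nxt in zip(events, events[1:]):
            counts[current][nxt] += 1
    return counts

def next_best_offer(counts, current_event):
    """Most frequent follow-up and its estimated probability, or None if unseen."""
    followers = counts.get(current_event)
    if not followers:
        return None
    # Ties on the count are broken by name so the result is deterministic.
    event, n = max(followers.items(), key=lambda kv: (kv[1], kv[0]))
    return event, n / sum(followers.values())

# Hypothetical purchase sequences, one list per customer.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]
counts = fit_transitions(sessions)
print(next_best_offer(counts, "phone"))
```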

Customer clustering

Here, too, everything has been ready for a long time and can be deployed in minutes. You can identify target groups of customers and bomb them with personal recommendations!
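
As an illustration of how little code basic clustering needs, here is a plain k-means (Lloyd's algorithm) sketch with no dependencies. The customer features, the data and the function name `kmeans` are assumptions made for the example; in practice you would take a library implementation and scale the features first.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical customers described by (orders per year, average order value).
customers = [(1, 20), (2, 25), (1, 22), (12, 210), (10, 190), (11, 205)]
labels, centroids = kmeans(customers, k=2)
print(labels)
```

On this toy data the rare small-basket buyers and the frequent big spenders end up in different groups, which is the kind of split you would then target with different mailings.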

In our product we have so far implemented part of these algorithms: content-based recommendations and collaborative filtering. Well, so that there was somewhere to start ... the battle.

And how to prove that the algorithms work?

This is where it gets more interesting.

So smart, really smart Pythonista data scientist analysts have appeared in your online store ...
So they have raised basic models of the popular algorithms above on their work laptops ...
So you have started offering clients personal recommendations, sending email campaigns by cluster and offering goods to customers at moments when they are “hard to refuse” ...

And ... that's the end of the road!

Proving that things have become better and that the resources spent on the smart guys, on lots of clever words and on even more incomprehensible matplotlib pictures pay off turns out to be difficult, very difficult.

Scary and incomprehensible pictures on analysts' laptops

To my great regret, sensible articles and techniques on “how to prove that personal recommendations work” are orders of magnitude fewer than fascinating articles about GANs or about generating pornographic images with deep networks. And when you read what Netflix writes about how the colleagues use crude empirics, sincere faith and voodoo spells to measure and improve their recommender system, it gets really fun (do read the articles if you haven't; there is real experience in them).

In fact, it all comes down to “faith” and to intensive, continuous AB testing, in which some customers bypass the trained models and others go through the machine learning algorithms.

Constant feedback is required.
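
One way to put a number on that “faith” is a significance test on the conversion rates of the two groups. The sketch below is a standard two-sided two-proportion z-test written with the standard library only; the function name `ab_test` and the visitor counts in the usage line are made up, and it assumes independent visitors and samples large enough for the normal approximation.

```python
from math import erf, sqrt

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test for the conversion rates of groups A and B.

    conv_* is the number of conversions, n_* the number of visitors in a group.
    Returns (z, p_value); a positive z means B converts better than A.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical numbers: A without the model, B with personal recommendations.
z, p_value = ab_test(conv_a=200, n_a=10000, conv_b=260, n_b=10000)
print(round(z, 2), round(p_value, 4))
```

A small p-value only says the difference is unlikely to be noise; whether the uplift pays for the analysts and the GPUs is a separate calculation.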

We see that the intensity of content personalization keeps growing. The industry giants watch us intently through our smartphones and are apparently confident, though it cannot be proven, that content personalization increases their profits and that the costs of smart data analysts and GPUs are justified.

The giants we know live off advertising. The more targeted it is, the greater their profit. That, it seems to me, is why they, and indeed almost all well-known IT companies, invest more and more in data analysis and in using machine learning algorithms to promote their products.

The requirement to continuously monitor how effective content personalization is for Clients with and without machine learning and, trivially, to monitor the quality of trained models on real data is a project of its own, quite possibly bigger in terms of resources and effort than applying ready-made, well-researched algorithms and frameworks published in open access.

And that's not all. Comparing the performance of algorithms which, as you know, can up and stop working (better than what was there before) requires constant conversion measurement and AB testing in all structural units, from the marketing department to customer service. Otherwise you get a beautiful, modern, expensive (especially if you take up neural networks with GPUs) and senseless toy, plus a bunch of gurus on the team who speak an incomprehensible language.

Conclusions. How does a beginner online store jump onto the departing train?

I hope it has become clear that the major market players figured out long ago how much profit machine learning brings to content personalization; they actively develop this area, hire the best minds and exploit the best hardware (GPUs). Public information about effectiveness is nowhere to be found, however hard you look. But judging by conference talks, roughly 1-2 models out of 10 take off and make it possible not only to raise conversion but also to get ahead of competitors by getting closer to Customers.
Until you start measuring conversion transparently and systematically, personalization with machine learning will most likely not help you and will even complicate your business processes.
Once you have clear and consistent conversion metrics, learn to run systematic AB tests of manual changes to the online store (a new main page design, changed order wizard logic).
If your goal is to accelerate the marketing department to the next cosmic velocity, look at how work with clients' email addresses and other data is done: in Excel files or in people's heads. If the process can be automated, automate it. Then try segmentation with machine clustering, send a test mailing and use AB testing to verify the result.
If your goal is to put the customer service department into Mars orbit, likewise make sure the customer list is not kept in Excel or in correspondence. At a minimum it should be in a CRM. Then, using primitive techniques (Bayes classifier, logistic regression) and with AB testing switched on, try working with churn groups and with attractive Clients via CLV analysis. Think carefully, and bring in smart analysts, about how to verify that the implemented models have improved conversion and the relevance of offers to Clients. This may be much harder than everything done before taken together :-) A secret: conversion will be much higher, but proving it to an Australopithecus is worthy of a Nobel Prize.
After obtaining a stable result and setting up basic processes in the store's structural units, you can think about improving algorithm quality with neural networks or about hiring more expensive data scientist analysts.
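
Since churn rate and CLV keep coming up, here is the arithmetic behind them as a minimal sketch. The assumptions are loud ones: churn is measured per year and stays constant, and CLV uses the simplified "yearly margin times expected lifetime" formula, which is only one of many in use. The function names and the numbers are invented for the example.

```python
def churn_rate(customers_at_start, customers_lost):
    """Share of customers active at the start of a period who left during it."""
    return customers_lost / customers_at_start

def simple_clv(avg_order_value, orders_per_year, gross_margin, yearly_churn):
    """Simplified CLV: yearly margin per customer times expected lifetime in years.

    Assumes constant yearly churn, so the expected lifetime is 1 / yearly_churn.
    """
    expected_lifetime_years = 1 / yearly_churn
    return avg_order_value * orders_per_year * gross_margin * expected_lifetime_years

# Hypothetical store: 1000 customers at the start of the year, 250 of them gone by the end.
churn = churn_rate(customers_at_start=1000, customers_lost=250)
clv = simple_clv(avg_order_value=50, orders_per_year=4, gross_margin=0.3, yearly_churn=churn)
print(churn, round(clv, 2))
```

With these made-up inputs a customer is expected to stay four years, so even this crude number already tells you roughly how much it is worth spending to keep one.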

Remember that you can only improve what you can measure!

And what if you do it differently?

People, unfortunately, mostly do not like change. Who would improve something by 5% if it works anyway, has worked for 20 years before that and “everyone does it this way”? Understandably, by proposing a “Fisher discriminant analysis” in the accounting department, the most you can achieve is a hormone surge among the most active women who love smart men. But mostly you will be quietly hated :-) The only employee who most likely will not turn down an extra 5% of conversion and profit is the head of sales, but proving to him that content personalized with machine learning works is impossible without drugs. Only AB testing in the form of a pink cat and a green puppy can persuade people who have been selling without AI for 20 years. They say that fair-haired girls from the physics and mathematics faculty, reasoning about the charms of measuring everything precisely and always, sometimes help, but such goddesses are very hard to find; there are many more goddesses who understand social networks better than their own brain.

There is a small chance that hard drugs will help convince the sales team to use personalized content.

Results

In general, colleagues, there is no 100% working recipe for introducing new personalization and predictive marketing technologies in an online store. Sometimes decimation helps. Sometimes, making the waking dream from the beginning of the post come true. But it is better, of course, to act with patience and love, if a budget is allocated for them :-) From experience: you need help and support from the top and a clearly defined goal, to increase conversion, and to measure and measure it constantly. Loosen your grip and you are left with an expensive toy and a bunch of new unknown words. Good luck to everyone and happy upcoming holidays; we wish you more continuous ABCDE testing in the new year!

Source: https://habr.com/ru/post/318632/
