Ever since machine learning was first discussed at Mindbox, the Big Green Button has been the overall goal: a full-screen button that, once clicked, makes everything work by itself and turn a profit.
The analytics project "RFM" has a less ambitious goal: a Small Green Button. You press it, and the customer base is automatically divided into segments, which can then, for example, start receiving emails.
To achieve the goal, we wrote an automatic RFM segmenter and developed a special report to visualize the results.
Here is how it all happened and why you can now do without analysts, or rather, why analysts can now devote more time to less trivial tasks.
The result of an email campaign depends on audience reach and on the quality of the newsletter itself. Reach cannot grow indefinitely, so quality has to improve. That means personalizing the newsletter, since people differ and everyone needs something different.
There are usually many consumers, so writing an individual email for each one is impractical. To cope with this, marketers divide consumers into groups called segments.
Consumers can be divided in different ways. One option is RFM analysis.
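RFM analysis, then, is a segmentation method, where segments are non-overlapping groups of consumers. For each customer it considers three features: recency (R), how long ago the last purchase was made; frequency (F), how many purchases were made; and monetary value (M), how much money was spent.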
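As an illustration, the three RFM features (recency, frequency, monetary value) can be derived from a plain purchase log. This is a minimal sketch, not the Mindbox implementation: the purchase records, the customer names, the `rfm_features` helper, and the reference date are all hypothetical.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, purchase_date, amount in rubles).
purchases = [
    ("anna", date(2018, 1, 10), 3000),
    ("anna", date(2018, 6, 1), 4500),
    ("boris", date(2017, 11, 20), 12000),
    ("vera", date(2018, 7, 15), 800),
    ("vera", date(2018, 7, 30), 1200),
    ("vera", date(2018, 8, 5), 500),
]


def rfm_features(purchases, today):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    stats = {}
    for customer, day, amount in purchases:
        # Track the latest purchase date, the purchase count, and the total spent.
        last, count, total = stats.get(customer, (day, 0, 0))
        stats[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in stats.items()
    }


features = rfm_features(purchases, today=date(2018, 8, 20))
for customer, (r, f, m) in sorted(features.items()):
    print(customer, r, f, m)
```

Customers with no purchases at all do not appear in such a log, so they would have to be handled separately.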
Many marketing companies build and use RFM analysis, and so do we. In our article about RFM segmentation, we described what kind of report we can produce and how it helps marketers.
The existing approaches to RFM analysis are almost the same for everyone.
Customers are divided into groups on each attribute, usually no more than five groups per attribute. The intersections of these groups are called segments.
For example, splitting each of the three attributes into four groups yields 64 (4x4x4) consumer segments, and five groups per attribute already yield 125 segments.
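The segment counts follow from taking every combination of one group per attribute. A short sketch that enumerates them (the `segment_labels` helper and the label format such as `R1F1M1` are illustrative assumptions, not a naming scheme from the report):

```python
from itertools import product


def segment_labels(groups_per_attribute, attributes=("R", "F", "M")):
    """List every combination of group numbers, one group per attribute."""
    groups = range(1, groups_per_attribute + 1)
    return [
        "".join(f"{name}{g}" for name, g in zip(attributes, combo))
        for combo in product(groups, repeat=len(attributes))
    ]


four = segment_labels(4)
five = segment_labels(5)
print(len(four), four[0], four[-1])
print(len(five))
```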
The main difficulty is to define the boundaries of groups, because there is no definite rule on how to do this.
Let us look at the most popular approaches, using a single customer base as the example:
For readability, we use only two of the three dimensions here (R and M).
In our example:
In this approach, customers are split by fixed ranges of attribute values. In our case, we distinguish three groups by spending: up to 5 thousand rubles, from 5 to 10 thousand, and 10 thousand or more. We also distinguish three groups by time since the last purchase: up to 80 days, from 80 to 160 days, and 160 days or more.
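A split by fixed value ranges can be sketched with a sorted list of boundaries and a binary search. The boundaries below come from the example in the text; the customer data and the function names are hypothetical, and treating a boundary value (exactly 5,000 rubles or exactly 80 days) as part of the upper range is an assumption.

```python
from bisect import bisect_right

# Boundaries taken from the example in the text.
MONEY_BOUNDS = [5_000, 10_000]  # rubles spent
RECENCY_BOUNDS = [80, 160]      # days since the last purchase


def group(value, bounds):
    """Return the 0-based index of the range that value falls into.

    Assumption: a value equal to a boundary belongs to the upper range.
    """
    return bisect_right(bounds, value)


def segment(recency_days, money_spent):
    """Return (recency group, spending group), each in 0..2."""
    return (group(recency_days, RECENCY_BOUNDS), group(money_spent, MONEY_BOUNDS))


# Hypothetical customers: (days since last purchase, rubles spent).
customers = {"a": (30, 2_000), "b": (100, 7_000), "c": (200, 15_000), "d": (80, 5_000)}
segments = {name: segment(r, m) for name, (r, m) in customers.items()}
print(segments)
```

Three groups on each of the two attributes give 3 x 3 = 9 possible segments.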
We get nine segments:
Advantages of the method:
Cons of the method:
With this approach, each attribute is split so that every group contains the same number of consumers.
This is how the buyers from our example are distributed (again with three groups per attribute):
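An equal-size split can be sketched by ranking customers on one attribute and cutting the ranking into equal parts. The spending values and the `equal_size_groups` helper below are hypothetical, and ties are broken by input order, which is an arbitrary choice.

```python
def equal_size_groups(values, n_groups):
    """Assign each value a group index in 0..n_groups-1, groups as equal in size as possible.

    Values are ranked in ascending order; ties keep their input order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(values)
    return groups


# Hypothetical amounts spent by nine customers, in rubles.
money = [500, 12000, 3000, 800, 7000, 1500, 25000, 4000, 900]
groups = equal_size_groups(money, 3)
print(groups)
```

Unlike fixed ranges, the boundaries here are not chosen in advance: they depend entirely on how the values are distributed in the base.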
Advantages of the method:
Cons of the method:
An analyst studies the database and chooses an appropriate split.
Advantages of the method:
Cons of the method:
We decided to get rid of the shortcomings of the old approaches. To do this, we had to resort to machine learning algorithms.
Using clustering methods, we automatically determine how many customer segments there really are in the database and what those segments are. With a decision tree, we then convert these segments into a form that is easy to read. How this works is described in a separate article about how the segmenter is built.
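The text does not say which clustering method the segmenter uses, so the sketch below uses plain k-means purely as an illustration of grouping customers by similarity. It is an assumption, not the segmenter itself: the number of clusters is fixed by hand here, whereas the segmenter determines it automatically, and the decision-tree step is not shown. The customer data is hypothetical.

```python
def rescale(column):
    """Min-max scale a sequence of numbers to the range [0, 1]."""
    low, high = min(column), max(column)
    span = (high - low) or 1  # avoid dividing by zero when all values are equal
    return [(v - low) / span for v in column]


def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; returns a cluster index per point."""
    # Deterministic start: take k points evenly spaced through the input order.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster emptied out
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


# Hypothetical customers: (days since last purchase, rubles spent).
customers = [
    (10, 1000), (15, 1500), (12, 1200),
    (150, 9000), (160, 9500), (155, 8800),
    (300, 2000), (310, 2500), (305, 1800),
]
recency, money = zip(*customers)
# Scale both attributes so that rubles do not dominate days in the distance.
points = list(zip(rescale(recency), rescale(money)))
labels = kmeans(points, k=3)
print(labels)
```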
For the example above, we got this result:
To make all this usable for marketers, we developed a report that presents the segmentation results conveniently and clearly (we think).
To get it, just press one button, and the system does the rest.
The report fits on one page and consists of three tables.
The first table is a summary. It contains information on all segments of the database obtained through RFM analysis. The key indicators are the activity of the consumers in a segment and their value.
Activity is determined by how long ago the last purchase was made, and value by the amount spent.
Each segment belongs to one of the categories. In each category there may be several segments or none at all. The cells indicate the total number of consumers from all segments of the category.
The “Activity” and “Value” indicators form nine categories of segments. One more category is “Never bought”.
P.S. Here, "Outflow" and "Risk of outflow" are shorthand for "customers who have not bought for a long time" and "customers whose last purchase was a moderate time ago"; they do not mean outflow in the literal sense of the word. Similarly, "Active" stands for "customers who made a purchase recently".
In the example above, 80% of customers have no purchases, almost a third of the high-value customers are in outflow, and another third are in the risk group.
Assessing the state of the base helps to choose which category to work with first.
To show how to use the report, we take high-value customers, that is, the customers who have spent the most money.
The second table of the report shows the size of each segment, its turnover (the total amount spent by all consumers in the segment), and the average check.
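The three per-segment indicators can be computed from per-customer totals as sketched below. All rows are invented for illustration, and the average check is assumed to be turnover divided by the number of purchases in the segment, which the text does not define explicitly.

```python
from collections import defaultdict

# Hypothetical rows: (segment number, customer id, number of purchases, rubles spent).
customers = [
    (2, "c1", 12, 36000),
    (2, "c2", 12, 36000),
    (7, "c3", 2, 30000),
    (7, "c4", 2, 26000),
    (9, "c5", 3, 33000),
]


def summarize(customers):
    """Return {segment: (size, turnover, average_check)}."""
    totals = defaultdict(lambda: [0, 0, 0])  # size, purchases, turnover
    for segment, _customer, purchases, spent in customers:
        row = totals[segment]
        row[0] += 1
        row[1] += purchases
        row[2] += spent
    # Assumption: average check = turnover / number of purchases in the segment.
    return {
        segment: (size, turnover, turnover / purchases)
        for segment, (size, purchases, turnover) in totals.items()
    }


summary = summarize(customers)
for segment, (size, turnover, average_check) in sorted(summary.items()):
    print(segment, size, turnover, average_check)
```

Under this definition, a segment can combine a high turnover with a moderate average check when its customers buy often.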
All customer segments are listed. For example, here is the list of customer segments with purchases:
To display only high-value consumers, use the filter.
As a result of applying the filter, we get seven customer segments with high value.
Based on this information, various conclusions can be drawn.
For example, segment 2 has a significantly higher turnover than the others, with a moderate average check. This indicates that consumers in this segment make many purchases and are highly loyal. You can send them emails, for example about updates, without fear of losing them.
Now let's look at the average check: segment No. 7, with the largest average check, is in outflow, and segment No. 9, with the second largest average check, is in the risk group. Consumers from these segments are willing to spend large sums but have not bought for a long time. It may be worth nudging them with a promotional code or an informational email.
Studying the segments helps to understand which of them deserve the most effort.
The last table shows the segment boundaries for each feature (R, F, M) and the average value of each feature.
The table confirms that consumers from segment 2 do make more purchases than the others: 12 on average.
We need to choose which segment to work with first. Suppose we are interested in the segments with the largest average checks: No. 7 and No. 9. Let us consider them in more detail.
In segment No. 7, customers have not made a purchase for almost a year, so winning them back will not be easy. Still, it may be worth a try: on average, consumers from this segment bought 2.1 times, which means their first purchase did not disappoint. A good discount is likely to rekindle their interest in the brand.
Segment No. 9 is easier: on average, its customers' last purchase was only three months ago, and the average number of purchases is 2.8. Most likely these customers are fairly loyal and need no special action. Still, you can send them an email with an ad or a small discount as a reminder of the brand.
When the segments for further actions are selected, you can run the necessary marketing campaigns.
We built an automatic RFM segmenter and are happy with it: a person needs 20 seconds to get the customer base broken down by segment.
Next, we are going to automate setting up marketing campaigns for the segments, so that nobody has to spend time on that either.
Of course, it will be a pity that no one needs our report anymore, but technical progress spares no one.
Source: https://habr.com/ru/post/420915/