
How to implement machine learning technology in your business

According to Gartner, machine learning is at the peak of the hype cycle. Our team at DATA4 develops and implements data analysis and machine learning solutions, and along the way we have learned the key stages and pitfalls, which I will share in this article.



Let's walk through the stages of implementation:


1. Statement of the problem


Any technology must solve specific business problems. Describing every application of machine learning would take a separate article, but there are several main areas: predictive analytics (scoring, churn, next best offer, related products, etc.), text analysis (online reviews, content moderation, topic classification of customer requests, etc.), speech analytics and video analytics.

For a successful implementation, you need to determine which business KPI you are improving, how, and by which metric you will measure the result.

2. Collection, storage and preprocessing of data


Once the task is set, you need to build a training sample (unfortunately, most business problems are solved with supervised learning). In our experience, assembling the sample is the longest stage. To shorten it, a company must have a culture of working with data.

Besides collecting the data, you need to clean it and identify the features that affect the final result.
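
As a rough illustration of the cleaning step, here is a minimal sketch in plain Python. The records, the field names (`customer_id`, `age`, `monthly_spend`, `churned`) and the cleaning rules are all hypothetical; a real project would derive its rules from its own data.

```python
# Hypothetical raw records; field names and values are illustrative only.
RAW = [
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0, "churned": 0},
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0, "churned": 0},   # duplicate
    {"customer_id": 2, "age": None, "monthly_spend": 80.0, "churned": 1},  # missing value
    {"customer_id": 3, "age": 51, "monthly_spend": -5.0, "churned": 0},    # impossible value
    {"customer_id": 4, "age": 27, "monthly_spend": 45.5, "churned": 1},
]


def clean(records):
    """Drop repeated customers, rows with missing fields, and negative spend."""
    seen = set()
    cleaned = []
    for row in records:
        if row["customer_id"] in seen:
            continue  # keep only the first record per customer
        seen.add(row["customer_id"])
        if any(value is None for value in row.values()):
            continue  # assumed rule: discard incomplete rows instead of imputing
        if row["monthly_spend"] < 0:
            continue  # assumed rule: negative spend is a data error
        cleaned.append(row)
    return cleaned


def to_training_sample(records, feature_names, label_name):
    """Split cleaned records into a feature matrix X and a label vector y."""
    X = [[float(row[name]) for name in feature_names] for row in records]
    y = [int(row[label_name]) for row in records]
    return X, y


rows = clean(RAW)
X, y = to_training_sample(rows, ["age", "monthly_spend"], "churned")
print(len(RAW), "raw rows ->", len(rows), "clean rows")
```

Dropping incomplete rows is only one possible choice; filling in missing values may be better when data is scarce.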

3. Training the algorithm


Developing the algorithmic part is the most interesting stage, but also the fastest. It usually takes from several hours to several weeks of work.
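
A quick way to start this stage is a deliberately simple model that shows whether the data carries any signal at all. The sketch below is one possible illustration, not our production approach: it compares a majority-class baseline with a one-feature threshold rule (a decision stump). The feature values and labels are made up.

```python
def majority_baseline_accuracy(y):
    """Accuracy of always predicting the most frequent class."""
    ones = sum(y)
    return max(ones, len(y) - ones) / len(y)


def best_stump(x, y):
    """Search one feature for the threshold and direction with the best accuracy.

    direction = 1 predicts class 1 when value >= threshold,
    direction = -1 predicts class 1 when value <= threshold.
    """
    best = (0.0, None, None)  # (accuracy, threshold, direction)
    for threshold in sorted(set(x)):
        for direction in (1, -1):
            preds = [1 if direction * (v - threshold) >= 0 else 0 for v in x]
            accuracy = sum(p == label for p, label in zip(preds, y)) / len(y)
            if accuracy > best[0]:
                best = (accuracy, threshold, direction)
    return best


# Hypothetical data: monthly spend and whether the customer churned.
monthly_spend = [10, 20, 30, 40, 50, 60, 70, 80]
churned = [1, 1, 1, 0, 1, 0, 0, 0]

baseline = majority_baseline_accuracy(churned)
stump_accuracy, threshold, direction = best_stump(monthly_spend, churned)
print("baseline:", baseline, "stump:", stump_accuracy, "threshold:", threshold)
```

The accuracy here is measured on the same data the rule was fitted to, so it is optimistic; a real estimate needs a held-out set.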

4. Developing the high-level wrapper


The solution should be understandable not only to a data analysis specialist, but also to the programmer or administrator who will deploy it. If it is a high-load solution, or one with heightened security requirements, you may have to rewrite it from Python in another language.

5. Integration


Integration usually takes a long time because of the extra communication and approvals it requires. This stage is best handled by the customer's in-house team.

6. Collecting feedback and adjusting the model


The world is constantly changing, and not every feature can be taken into account at the start of development. Collecting feedback makes it possible to retrain models in time. Ideally, the cycle restarts at this stage, but takes less time.
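
One way to turn feedback into a retraining signal is to track accuracy over a sliding window of recent predictions and compare it with the accuracy measured at launch. The sketch below assumes that the true outcome eventually becomes known for each prediction; the class name `FeedbackMonitor`, the window size and the tolerance are hypothetical choices.

```python
from collections import deque


class FeedbackMonitor:
    """Tracks recent prediction quality and flags when retraining looks due."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline_accuracy = baseline_accuracy  # accuracy measured at launch
        self.tolerance = tolerance                  # allowed drop before flagging
        self.outcomes = deque(maxlen=window)        # True for each correct prediction

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def recent_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Wait until the window is full so a few early errors do not trigger it.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.recent_accuracy() < self.baseline_accuracy - self.tolerance


# Hypothetical feedback stream: 7 correct predictions followed by 3 wrong ones.
monitor = FeedbackMonitor(baseline_accuracy=0.90, window=10, tolerance=0.05)
for predicted, actual in [(1, 1)] * 7 + [(1, 0)] * 3:
    monitor.record(predicted, actual)

print(monitor.recent_accuracy(), monitor.needs_retraining())
```

The flag only says that quality has dropped; whether to retrain on fresh data or revisit the features remains a decision for the team.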

Features of solutions based on machine learning:


  1. Machine learning is based on statistics, so it is normal for the algorithm to make a wrong prediction now and then. It is best to explain to the business customer up front which metrics are used to evaluate quality and what they mean (not everyone knows what the F-measure and ROC AUC are), and that feeding in 3 examples and looking at the output is interesting but not statistically significant.
  2. Poorly predictable results. The data does not always contain a useful signal, so the achievable quality cannot be predicted accurately in advance. We usually take the data, build simple models, and use them to estimate what result is achievable. This problem does not apply to some classic tasks (face recognition, speech recognition, etc.).
  3. Machine learning is a “last mile” technology, not a silver bullet for every problem. If salespeople do not pick up the phone and do not call customers back, there is very little point in introducing speech analytics.
  4. Most of the time goes into integration and into data collection and processing, not into training the algorithm (with rare exceptions).
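
To make the first point concrete, here is a minimal sketch of the two metrics named above, plus a confidence interval that shows why 3 examples prove little. The labels and scores are made up, and the Wilson score interval is our choice of illustration, not something prescribed by either metric.

```python
import math


def f1_score(y_true, y_pred):
    """F-measure (F1): harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC AUC: probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for an observed success rate."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical labels, hard predictions and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
low_3, high_3 = wilson_interval(3, 3)          # 3 correct out of 3
low_300, high_300 = wilson_interval(300, 300)  # 300 correct out of 300
print(f1, auc, (low_3, high_3), (low_300, high_300))
```

Three correct answers out of three leave the plausible accuracy anywhere from roughly 44% to 100%, while 300 out of 300 narrows it to roughly 99% or better. That gap is the argument for evaluating on a proper test sample.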

Options for working with third-party developers:


  1. Hourly payment. Suitable only for rapid prototyping and MVPs, not for solutions that require ongoing support.
  2. Contract development. Intellectual property is transferred to the customer and support is possible, but the technical specification must be written carefully.
  3. Payment based on proven effectiveness. In our experience at DATA4, this arrangement is too difficult to negotiate and is hardly ever used in practice.

Alternatively, you can use ready-made platforms from IBM, Microsoft, etc. In practice, however, they are expensive under continuous use, a specific case cannot always be implemented with ready-made tools, and there are restrictions on what data can be sent to them.

Conclusion


Machine learning technologies increase business efficiency, but building a complete solution takes more than training the algorithm: you also need to prepare the data and integrate the solution with internal systems. Be prepared for the result to depend on the quality of the training sample.

Source: https://habr.com/ru/post/417009/

