Netology launched its Data Science track in 2016. When we started, we had concerns: the field was new, and although companies' demand for data specialists was decent, not many people were looking to enter the field, and there were plenty of free English-language resources for self-study online. We took the risk anyway.
Today the track has 10 courses on different specializations in working with data, and more than 800 graduates. We asked one of those graduates about his work with data, how he came to the field, how he is developing the Machine Learning direction at Loko-Bank, and what kind of people he is looking for on his team.
Vyacheslav Potapov, Head of Data Analysis and Machine Learning at Loko-Bank and a graduate of the Data Scientist course:
I graduated from Bauman Moscow State Technical University in 2011 with a degree in "Spacecraft and Upper Stages". After that, I spent 7 years working in various places as an analyst, database developer and storage architect. During this time I learned a lot about data processing and storage, but at some point I wanted to go deeper into analysis: to understand what all the numbers I stored and processed actually mean.
I began looking for directions for growth: I studied related positions in IT and looked at salary levels in the field and at what was most in demand. There were many articles on Habr and videos on YouTube, and to some extent they helped me understand the essence of working with data and how the skills I had at the time could be useful.
That is how I came across Data Science (DS) and Machine Learning (ML), but I lacked the fundamentals. The field is very broad, and when you watch videos or read articles you get only fragmentary knowledge, with no overall understanding of what the specialty is about or what its directions, methods and tools are. It is like reading a thick university math textbook: without explanation and practice, the knowledge is hard to apply.
A colleague told me about Netology, which had a large full-time Data Science program. I had not come across comparable offers on the Russian-language market. In the end I successfully completed the course and defended my thesis project on the topic "Image recognition using neural networks." As I remember, it was very hard: I had no practice in solving full-fledged tasks, and I very much wanted to produce not just a training exercise but a fully working project.
In parallel with my studies, I tried to solve Kaggle problems and do projects for work.
Immediately after the course I began looking for a place where I could work on data analysis full time, since it is difficult to combine the job of a BI system architect with practice in DS.
After a series of interviews, I chose Loko-Bank and DS.
It seems to me that Data Science, much like a research institute, needs trust, patience and an understanding of its prospects from management.
Loko-Bank saw those prospects, and that is how I began working in the Digital Business block, which develops the analytics direction.
What analysts and Data Scientists do at Loko-Bank
The bank has a classic IT department that is responsible for infrastructure and data storage. Other units use these data sources and set requirements for integrating new ones. In total, about 40 employees in the company work with analytics.
At Loko-Bank, process automation, data analysis and building a data-driven economy are becoming company priorities. I hope that, based on this information, we will be able to run sales, risk assessment and the business as a whole more soundly.
In the business unit, analytics work is divided into two areas. The first is classical analytics (BI), whose specialists analyze the company's planned and actual indicators and prepare reports on sales, balances, income and expenses. The second is the ML direction.
The Machine Learning direction builds algorithms that, based on data from classical analytics, make predictions, generate new data and look for hidden dependencies and anomalies. This is the department I manage.
ML at the bank is only starting to develop. But I have a goal: to build the system so that it helps the business and makes it possible to use all modern approaches to increase revenue and reduce costs. We have to completely change business processes and look for ways to integrate machine learning tools into the existing IT architecture. This can be difficult, since the architecture was designed long ago and some of the requirements simply do not fit into it.
One example is the requirement to collect logs of client logins to the mobile bank. Classical analytics does not need them, so they were never collected or stored. I explained that from these logs we can train a model to predict the load on the platform and see the relationship between mobile bank usage and a client's profitability. Had it not been for the development of ML, such analytics simply would not exist, because no one would have dealt with this issue. We needed a kind of guide who would explain the how and why, and give direction on how to build the architecture, how to collect data, how to build models and where to apply them.
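As an illustration only, here is a minimal Python sketch of how raw login logs could be turned into the two kinds of inputs mentioned above: login counts per hour (a starting point for load forecasting) and login counts per client (a feature that could later be compared with profitability). The log format, the `LOGIN_LOG` sample and both function names are hypothetical; the bank's actual data and pipeline are not described in the interview.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample: (client_id, ISO timestamp of a mobile-bank login).
LOGIN_LOG = [
    ("c1", "2019-03-01T09:05:00"),
    ("c2", "2019-03-01T09:40:00"),
    ("c1", "2019-03-01T18:15:00"),
    ("c3", "2019-03-02T09:10:00"),
    ("c1", "2019-03-02T09:55:00"),
]


def logins_per_hour(log):
    """Count logins for each hour of the day (0-23) across the whole log."""
    counts = Counter()
    for _client_id, timestamp in log:
        counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


def logins_per_client(log):
    """Count logins per client, a candidate feature for later modelling."""
    return Counter(client_id for client_id, _timestamp in log)


hourly = logins_per_hour(LOGIN_LOG)
per_client = logins_per_client(LOGIN_LOG)
print("busiest hour and its login count:", hourly.most_common(1)[0])
print("logins per client:", dict(per_client))
```

Aggregates like these are only the raw material: a forecasting model would still need a longer history and a proper train/test split by time.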
With the introduction of machine learning, I want to build a culture of working with data across the bank as a whole: collecting it, processing it and integrating new sources. In parallel, we are already solving predictive analytics tasks for customers and working on customer segmentation in order to optimize tariffs and increase the company's sales.
We also work on financial monitoring, analyzing suspicious customers and transactions. The company currently spends a huge amount of human and financial resources on this task, and we want to simplify these processes and make them more efficient.
As for what has already been done: we have begun collecting and storing data, in particular the user logs I mentioned above. We now also store the history of changes to the client card in the FTS.
At the moment, we are developing a model to detect negative behavior by clients (legal entities and individual entrepreneurs) and have already obtained the first good results: a score of 0.86 on one of the popular metrics. The algorithm we use is gradient boosting. In the near future we plan to make the model work stably, including by connecting additional sources. This model should help reduce the company's risks and optimize the cost of finding unscrupulous customers.
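The interview does not say which metric produced the 0.86. Assuming, purely for illustration, that it is a ranking metric such as ROC AUC (a common choice for binary "problem client or not" models), the sketch below shows how such a score is computed from model outputs. The labels and scores are made-up toy values, not the bank's data, and `roc_auc` is a hypothetical helper written for this example.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    A pair is ranked correctly when the positive example has the higher
    score; a tie between a positive and a negative counts as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correctly_ranked = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correctly_ranked += 1.0
            elif pos_score == neg_score:
                correctly_ranked += 0.5
    return correctly_ranked / (len(positives) * len(negatives))


# Toy data: 1 = client later showed negative behavior, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
# Hypothetical model scores (higher means "more likely negative behavior").
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

auc = roc_auc(labels, scores)
print(f"ROC AUC on the toy sample: {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect one, which is why a score in the mid-0.8 range is usually read as a solid first result.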
What specialists does the ML direction need?
Our team is still being formed, so for now I try to hire generalists. Of course, a person may lean more toward development or, conversely, toward business analysis, but they still need to understand the whole process of creating a solution and their own role in it. This is a good option for those who want to try themselves in different roles.
It is important that a person can solve real practical problems, or at least explain the approach and the sequence of steps. In interviews I give logic puzzles and ask about a general understanding of algorithms and techniques, without the mathematics.
Since I am an engineer myself, I try to look for people with an engineering education for my team, although this is not a hard rule. I know of cases where people came into the profession without a technical education.
Creating an ML solution is not a trivial task: it is not enough to take all the data, throw it into an algorithm and wait for a miracle. You need to be able to dive into the subject area, communicate, ask and listen. In some cases these skills may turn out to be even more valuable than technical ones.
More specifically, the department is now primarily interested in Big Data engineers. Neural networks and xgboost are good, but first you need specialists who can collect correct, prepared data in large volumes. Without them, no machine learning will work. I need at least two people in this area. The company has many requirements for them, though: they need to know ETL tools and SQL, have experience building data marts and data warehouses, and be able to solve optimization problems.
It would also be good to add two analysts to the staff, preferably with experience in the banking sector. That said, Data Science expertise is the priority, and the candidate's industry background can be anything.
The main problem of the market is the shortage of people who can translate the needs of a business into a meaningful ML task, and sometimes propose a solution proactively.
To do this, you need to understand both the business itself and the existing tools, and also have good soft skills to present the solution properly. Such people are extremely difficult to find.
Where to develop
Since we are only now introducing ML into the business, a number of decisions have to be made on which further trust in the whole direction will depend. These decisions concern justifying the department's existence to the business. Machine learning is widely talked about now, so there is particular interest in it.
After successfully implementing the ML tools within my department, we plan to expand the pool of tasks and staff of specialists to the entire bank.
A bank is, first of all, large data flows, a large customer base and, accordingly, a huge responsibility.
On the one hand, there are customers who want good service and want their data kept safe; on the other hand, there are always people who want to gain access to stores of confidential information.
In my opinion, as the load and the complexity of the processes grow, delegating part of the duties and functions to machines is the only possible condition for the company's stable growth.
A person who wants to move into Machine Learning in the banking sector should, first and foremost, be able to relate ML tasks to the bank's main goals.
Tips for those who want to get into Machine Learning
First of all, answer the question of what exactly you want to do, and then do it. DS is a huge area for development. On the one hand that is good, but on the other you can wander for a very long time without arriving at anything specific.
At the start, I would not recommend diving deep into math. Focus on solving practical problems and on the tools (libraries, methods). My experience in database development, data cleaning and processing, and initial analysis helped me greatly. In real work, collecting and preparing data takes up most of the time, and quality work at this stage makes it possible to significantly improve the quality of ML solutions later on.
It is great that we live in a time when any information is easy to find. There are many courses in different areas, online communities (ODS), and regular conferences and workshops. But you need to understand that ML is a young discipline that is still taking shape, and there is no established approach to teaching it yet. So choose your development path carefully: study different training programs and set the right priorities for yourself. I was lucky: I chose a course that met my requirements and expectations, and it led to the development of a huge and promising direction at Loko-Bank.
From the Editor