
"Data Science, like mathematics and physics, is another way to learn about the world around you"

Hello, Habr! We continue our series of interviews with Newprolab alumni, in which they talk about how they moved into working with big data. Their stories differ, and they will interest anyone who is thinking about changing career or about how new knowledge can help solve current problems. Meet Oleg Homyuk, Head of R&D at Lamoda.

Oleg talked about his career path and values, why he chose Lamoda over a company in the Valley, his current projects and team, his most and least successful projects, his attitude to data science, and much more.


- Oleg, what was your professional path to Head of R&D at Lamoda?

- It seems to me that any career path is the result of several causes and, sometimes, accidents. A few of those causes are basic: the way a person thinks, their values, and, more generally, how they understand success. That understanding of success is the vector we use as a compass when choosing a professional path.

In that sense, things were fairly simple for me: at school I clearly showed ability in the exact sciences, regularly took part in olympiads, and even took 3rd place in the regional mathematics olympiad in 9th grade. I have always enjoyed solving puzzles and looking for patterns, and I still love tasks that make me think.

I liked university too. I graduated with honors (a "red diploma") from Bauman Moscow State Technical University, majoring in Optoelectronic Instrument Engineering. We were taught to design equipment that is fairly complex in terms of physics and microelectronics: thermal imagers, digital cameras, telescopes, even sniper scopes, homing systems, and night vision devices. It is an incredibly interesting profession, and our teaching staff was stellar. It is real engineering at the junction of several fields of knowledge. Sometimes I am a little sorry that I did not end up working in it.

- Why didn't it work out?

- In my final years I became a little disappointed in what I was doing. It turned out that demand for the profession in the country was low and everything was very local: the best engineers mostly worked in institute laboratories, few factories were able to build what the engineers designed, the equipment was outdated, and so on. There were some successes, of course, but the scale was not what I had imagined when I started my studies. On top of that, researchers were paid so little that you could earn more driving a private taxi. There were other ways to earn money too, such as working semi-officially for Japanese companies, naturally without any intellectual property rights.

At some point friends invited me to work at a fairly large internet provider in the Moscow region, and I agreed. I was quite ready to learn new things; a technical education gives you a lot of room in that sense.

There I acquired new technical skills, got to know quality management, and came into contact with world practice in that area. There is a quality management standard, in fact a whole series of ISO 9000 standards, that offers practices for organizing processes in an enterprise. It takes as an axiom that the quality of the final product depends on how well the company manages its internal processes. The basic idea is that if you do everything within the framework of the standard, the quality of your products keeps improving, because you measure, think, plan, do, and measure again every process that can affect that quality. This cycle of continuous improvement even has a name: the Deming cycle. The topic somehow captivated me: it is management, but very mathematical.

In the end I worked there for about 2 years and did various things: I managed a small department, built processes, and communicated a lot with the quality department.

Next came Yandex. At some point I saw that they were hiring project managers for the search quality department. It was not so much the vacancy that hooked me as the test task: describe an existing problem in Yandex search and work out how to solve it. The word "quality" probably tripped a trigger in my head. I spent 10 hours straight on the task and ended up with several pages. In the end they contacted me, invited me to an interview, and made an offer, which I gladly accepted.

At Yandex everything fell into place for me. I saw how big data, mathematics, algorithms, and a focus on users and their needs work together as a single mechanism, making it possible to create breakthrough products on the one hand and to make money on the other. I think what I took away from Yandex was a fully formed desire to build data-driven products and to do machine learning. Since then I have been actively developing in that direction.

- That was 2011; big data was not yet a popular topic, and there were hardly any programs. Where did you study, and what did you read?

- There was certainly not enough content available, but we were all hungry for knowledge. Coursera already existed and, by the way, so did ShAD. I listened to Vorontsov's lectures 15 times and understood nothing. Many people went through that; it was an interesting era.

In general, I gradually began to move away from information retrieval. I liked working with data and was attracted by the new field of machine learning, and in 2012 I left the company.

- And what after Yandex?

- After Yandex came "Consultant Plus". By then I was choosing more deliberately a direction related to data analysis. Data on user actions was just beginning to be collected at scale, so I joined that effort and started doing projects.

It was an interesting time. Today there are plenty of machine learning libraries available, xgboost for example, whereas we wrote our own gradient boosting over trees in C++. Now, of course, not every team can afford that, and there is no need to: everything is already implemented. That's the story.

- Did you write it yourself, or did you have a team?

- There was already a team, yes, and a talented one. In my second year at Consultant Plus we were joined by a talented student from VMK, who in a couple of months wrote his own implementation of boosting and started training models.

By then we were already aiming to form a whole team of data scientists; we felt the data held many new possibilities. Then a very timely opportunity came up to bring on two ShAD graduates, who probably knew a little more than I did, along with developers to build the data warehouses. Everyone worked hard, mainly on a Hadoop cluster, although by modern standards there was not much data.
At our peak there were probably only 9 of us, working on good problems. For example, we looked for bursts of user interest in various topics, which helped the authors choose better which topics were worth writing new material on.

After that I worked at Ezhome, a startup in Palo Alto. I was recommended there, by the way, by Mitya Kataev, with whom I studied on the "Big Data Specialist" program. His acquaintance Kirill Klokov, who worked as development director at Ezhome, was looking for a data scientist for the team. The company's main idea is to create an Uber-like experience for home services; as a starting point they chose yard care, from lawn mowing to cleaning and planting plants and trees. So I started there as a Data Scientist: I really wanted to try myself in a startup, and I wanted hands-on work. I periodically get this analytical itch, a wish to do something meaningful myself, even though for a long time I have focused mainly on organizational processes. I used to hope the itch would subside someday, but no: I still try to "sit on two chairs", that is, to develop both as a manager and as an expert.

- Even now?

- Even now. Although at the moment, of course, there is not enough time for much: a large team, many management tasks. I let off steam at weekends, and there are plenty of opportunities for that now, kaggle for example. I would like to do more with my own hands, but I have people on the team who are clearly the best in their fields. Still, in my opinion, to manage data analysis projects effectively a manager must also have hard skills. I am constantly learning. Right now, for example, I have decided to go through a programming specialization, just to refresh my memory.

- Coming back to Ezhome: why did they need a data scientist? What tasks did you have?

- That is a good question. At the very beginning I asked what result they expected from me. The answer was along the lines of: "we don't quite understand ourselves yet, let's try." But a good task turned up quickly. At the time there was a bottleneck in acquiring new customers, because every new request was processed by a person who measured the plot on a satellite image and tried to work out how much maintaining such a plot should cost. An expert-built linear model produced the estimate. Naturally, they wanted to improve the quality of the forecast, but it was no longer clear how to take more parameters into account by expert judgment alone. That is where machine learning came in handy. We began to predict the time a gardener would spend from the parameters of the plot. Plot parameters came from open sources, and the target values (the "teacher" labels) from historical data, since by then there was already a small base of active customers subscribed to weekly service.

As a result, the project took off: data was available for most incoming requests, so individual prices could be generated on the fly. Classic automation: robots work, people rest. Then I was invited to the head office in the Valley for a while, about a month and a half.

Before that I had worked remotely; almost the whole team was remote: the USA, India, Greece, Poland, Russia. The team was great, and it was a pleasure to work with. I managed to complete a lot of good projects, and in the end I was offered the position of analytics team lead. We made infrastructure improvements that let us multiply the number of projects we handled. Then we were offered a merger with another team, which was developing route-planning software for employees: 5 thousand clients, 150 gardeners, and the question of how to get around them all optimally. It was very exciting, and I now find that tasks that are more about computer science than about data are also very interesting.

- Alongside Lamoda you considered several other offers. Why did you choose Lamoda? What was critical for you?

- Yes, there were several offers. What hooked me about Lamoda? A clear strategy, understandable expectations of me, trust, and a realistic financial resource plan. That is, I was given a clearly outlined task: "we are here now, we need to get there, we want to develop R&D, we are ready to invest X, and we expect such-and-such an economic effect". That was all. No musings about spaceships roaming the universe or robots replacing everyone. Plus an honest account of how the company was doing. Everything was transparent and clear, and that won me over, because I had the full sense of joining a team of people who are genuinely result-oriented and know what they want. In addition, they gave me carte blanche to develop this area. For me it was a personal challenge: I had never had the chance to assemble such a big team. Now there are 17 of us, and we are still growing.

- This is not the first company where you have built an R&D department from scratch and assembled a team. What are the first 5 steps you take when you join a company?

- Lamoda had an R&D department before me; over 7 years several teams and managers had come and gone. In addition, about half of the current team was assembled from inside the company. So it was not quite from scratch.

The first five steps in a new company? I think the algorithm is not specific to R&D; in principle it applies to any management position in a new company.

The first is to understand the company's current strategy: what goals the company has and which KPIs will be used to measure progress.

The second is to describe exactly how you can influence those KPIs given your competence or role in the company; there must be some set of available tools and ideas. Describe the needs of the business and the target state, that is, where we want to end up, and then evaluate the available tools. Machine learning is only one of them, and it is not the best choice for every task.

The third is to audit the current state: people, competencies, processes, data, products, infrastructure. Especially infrastructure.
Only at the fourth step, after the audit, does it become possible to describe the strategy for moving from the current state to the target one. This is essentially a lot of work, including many consultations with stakeholders, and its outcome should be several possible development scenarios. In my practice it has been useful to prepare at least 3: conservative, realistic, and aggressive in terms of resource costs. After that everything is easier: having chosen a strategy, we draw up a roadmap, refine the resource estimate, and get down to work.

- What is data science for you?

- Data science is my favorite tool. It is an extremely exciting field; like mathematics and physics, it is another way to explore the world around you. I first felt this clearly at Yandex, when we analyzed search queries and came to understand what needs users have, how they satisfy them, and what is happening in the world in general. That is, you can look at the world through the small crack of the data you work with. It is interesting and, in my opinion, no different from other ways of knowing, just another "channel"; consider it a 7th sense. The same thing happened at "Consultant Plus": we looked at what problems users were solving when they searched for court decisions, that is, what exactly worries people and what disputes they have that need to be resolved in court. The data we analyze at Lamoda is no less exciting, especially when you find out that blouses and skirts are bought in different colors rather than matching ones. A curious observation to carry with you through life. There is a lot you can learn about the world through data. That is why I call it my favorite tool. It is a tool of knowledge on the one hand and, on the other, an active tool with which you can create something new.

- In business, what role do you assign to data?

- The most important thing here is not to give in to the hype. In business, data should certainly work: the results of data analysis should bring profit or reduce costs. If they do not, something has gone wrong somewhere. At the same time, a data-driven culture need not be taken literally; we can make decisions without relying on data, and that is normal. Moreover, in some cases it is the only way.

- Tell me, what projects are you doing in Lamoda? What is the most successful project implemented by your team?

- Probably the first thing worth mentioning is the A/B testing platform: essentially a service that splits users into groups and controls the switching on and off of experimental features. Why does it matter to us? Because machine learning as a field cannot exist without constant testing of hypotheses and ideas. We cannot know in advance whether users will like something more or less. Every new idea must be tested. Amazon cites an interesting statistic: they say 70% of the ideas they test lose the test. That should be taken calmly, even if the rate is higher. It means that to release 5 successful projects in a quarter you need to run about 17 (5 / 0.3 ≈ 16.7). So a reliable platform for controlled experiments is the foundation without which product development simply cannot move forward. Given our ambitious plans, the system needed an upgrade. The first version was built before I arrived; we updated it significantly, and you can now run more experiments at the same time, whereas before there were some limitations in that respect.
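
To make the idea of such a splitting service concrete, here is a minimal Python sketch. It is not Lamoda's platform: the hashing scheme, the function names, and the experiment name are all assumptions chosen for illustration. It shows one common way to assign a user to a group deterministically, gate a feature on that assignment, and reproduce the "about 17 ideas for 5 wins" arithmetic.

```python
import hashlib
import math


def assign_group(user_id, experiment, groups=("control", "treatment")):
    """Deterministically map a user to a group for one experiment.

    Hashing the experiment name together with the user id means the same
    user always lands in the same group, and different experiments split
    users independently of each other. (Illustrative scheme, an assumption.)
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return groups[int(digest, 16) % len(groups)]


def is_feature_enabled(user_id, experiment, active_experiments):
    """The feature is on only while the experiment runs and the user is treated."""
    if experiment not in active_experiments:
        return False
    return assign_group(user_id, experiment) == "treatment"


def ideas_needed(target_successes, success_rate):
    """How many ideas to test to expect the target number of winners."""
    return math.ceil(target_successes / success_rate)


if __name__ == "__main__":
    # "new_ranking" is a hypothetical experiment name.
    print(assign_group("user-42", "new_ranking"))
    print(ideas_needed(5, 0.30))  # 70% of ideas lose, so 30% win
```

Hashing instead of storing assignments keeps the service stateless, which is one reason the approach is popular; a production system would also need traffic allocation, logging, and mutual exclusion between experiments.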

- And what other directions?

- Search. Here we differ from major players like Yandex and Google, because we can work through our subject area very thoroughly; compared with "universal web search" it is rather narrow. It is impossible to build an ontology of everything and describe all the relationships, but in a small, specific area you can build very good solutions that work. We are building our own linguistics for the search engine so that it can take into account implicit relationships between entities. For example, some brands belong together, so if you search for an item from one brand, we can show an item that is formally from a different brand but in fact from the same one. Tommy Hilfiger and Tommy Jeans, for example, are in fact one brand. Or it should understand that a stiletto is formally also a heel, and that loafers are, broadly, shoes. In general, we want to work through our subject area very thoroughly, so our search will develop, among other things, thanks to the expertise of Lamoda staff.
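
A rough Python sketch of how such domain knowledge could be encoded follows. The dictionaries, group names, and function names are hypothetical, and the interview does not describe the real data structures; the sketch only shows brand grouping and a "narrow term to broader term" lookup of the kind mentioned in the brand, heel, and loafers examples.

```python
# Hypothetical hand-curated knowledge: brand -> brand family.
BRAND_GROUPS = {
    "tommy hilfiger": "tommy",
    "tommy jeans": "tommy",
}

# Hypothetical hand-curated knowledge: narrower term -> broader term.
PARENT_CATEGORY = {
    "stiletto": "heel",
    "loafers": "shoes",
}


def same_group_brands(brand):
    """Return every brand that belongs to the same family as `brand`."""
    key = brand.lower()
    group = BRAND_GROUPS.get(key)
    if group is None:
        return {key}  # unknown brand: it only matches itself
    return {name for name, family in BRAND_GROUPS.items() if family == group}


def category_with_ancestors(term):
    """Return the term followed by its broader categories, narrowest first."""
    chain = [term.lower()]
    while chain[-1] in PARENT_CATEGORY:
        parent = PARENT_CATEGORY[chain[-1]]
        if parent in chain:  # guard against accidental cycles in the data
            break
        chain.append(parent)
    return chain


if __name__ == "__main__":
    print(sorted(same_group_brands("Tommy Jeans")))
    print(category_with_ancestors("stiletto"))
```

A search engine could use these lookups to expand a query before matching, so that a query for one brand also retrieves items from its sibling brand. Keeping the mappings as plain data is what lets non-engineering experts maintain them.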

Of course, one of the most prominent projects we are working on is the ranking of products in the catalog; this is the familiar "by popularity" sort. We try to make sure that a user who comes to the site finds something they like as quickly as possible.
There are also projects on recommender systems, pricing optimization, personalization, and much more.

- Oleg, tell us about your most successful project.

- Right now the most successful project is the rollout of the new catalog ranking. It has become a little smarter and now takes more interesting data into account. For example, we solved the problem of context for unisex products: a pair of shoes may sell well in the context of the men's catalog and not so well in the women's. Judging by user behavior, these turn out to be more of a men's shoe, even though formally they are unisex. There are many nuances like this that we want to take into account. So we are not stopping: we test new hypotheses, try to cooperate actively with the commercial department, and so on.
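
The context effect described above can be illustrated with a toy Python sketch. This is not the production ranking, whose features are not described in the interview; the event format, product names, and numbers are invented for the example. It ranks products by sales counted only within the catalog context the user is browsing.

```python
from collections import defaultdict


def rank_by_context(sales_events, context):
    """Rank product ids by the number of sales observed in one catalog context.

    `sales_events` is an iterable of (product_id, catalog_context) pairs.
    Ties are broken by product id so the order is deterministic.
    """
    counts = defaultdict(int)
    for product_id, event_context in sales_events:
        if event_context == context:
            counts[product_id] += 1
    return sorted(counts, key=lambda pid: (-counts[pid], pid))


if __name__ == "__main__":
    # Made-up events: the unisex sneaker sells mostly from the men's catalog.
    events = (
        [("unisex_sneaker", "men")] * 3
        + [("unisex_sneaker", "women")]
        + [("ankle_boot", "women")] * 2
    )
    print(rank_by_context(events, "men"))
    print(rank_by_context(events, "women"))
```

With a single global popularity count the sneaker would top both catalogs; counting per context lets the same product rank high for men and lower for women, which is the behavior the answer describes.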

- How do you work on projects? How do you select them? How long does it take to get to production?

- We are still collecting statistics on this, but in general our work is organized as follows: although the organization is already fairly large, there are more projects than people, so we assemble a micro-team for each direction. For example, I have a separate micro-team that works on recommender systems. The same people may be involved in other projects; that is normal. Almost everything is decided within the micro-team: there are regular meetings and brainstorms, planning sessions and retrospectives, as well as internal meetups and demos. You cannot get anywhere without demos.

This year, a project takes 4-6 weeks from idea to release. But clearly not all projects are like that. Some require much larger investments of resources, especially if you need to invest in architecture, build something completely new, or carry out a long and costly integration with other systems. The maximum is several months. Improving something that already works can be done quite quickly; building from scratch is a different matter.

- You mentioned Amazon and their 70% of failed experiments. What is the percentage at Lamoda?

- I would rather call them unsuccessful than failed. We have them, of course. But we believe that any experiment has only two outcomes: success or learning. We do not call an unsuccessful experiment a failure. A real failure is when we learn nothing from a project that brought no economic benefit. If a new idea lost to the current one, or at least did not win, you need to figure out thoroughly why, rethink the task, and possibly run another iteration. You simply have to come away with some knowledge.

- Can you tell us about a project in your career that did not take off? About your biggest disappointment and the lesson you took from it.

- Yes, there are even several. For example, at one company I really wanted to bring machine learning into search ranking. We spent a lot of time on the project, but in the end it turned out that there were simply no resources to implement such a solution, and the project had to be closed. For me as a manager it was a very good lesson; it is a pity it was an expensive one. The boundaries of what is feasible (what we can do, what resources we have) must be determined at the start, before a single line of code is written, otherwise you can end up in a similar situation. What is more, the team did serious work, and in simulations on a test bench the quality was good, but deployment required architectural changes in the application, and the company would not make them for the sake of search alone.

- What does the team mean to you? In a year you have more than doubled your team, and you continue to grow. How do you select people, and what matters to you?

- I consider one of my main achievements in a year at this company to be that we have a really great team atmosphere: it is friendly and built on mutual support and respect, and it is important to preserve it as the team grows. So at interviews, besides professional qualities, we try to understand whether we will work well with the person. All successful candidates meet the team; this is important, and I listen to the team's opinion.

- Half of your team, like you, either went through our Newprolab programs before joining Lamoda or were sent by you to study. Is this a coincidence, or did you select people from the alumni community, from those you studied with or met at our events?

- I would like to say that I selected them on purpose, of course, but I think these are coincidences, although such coincidences are not accidental. Here I would quote Grisha Sapunov (a Newprolab teacher - ed.): correlation does not mean causation, that is, it does not guarantee a cause-and-effect relationship. Now from the lyrical to the substantive. It seems to me that all Newprolab graduates share qualities that I, among others, find useful in a team. There is some third factor that affects both, roughly speaking, the attractiveness of the program to a student and the attractiveness of a candidate to me: for example, a hunger for information and a high level of intrinsic motivation. A three-month course with 3 three-hour lectures a week plus 10 hours of independent work demands a certain temperament, and that is what I really like about the atmosphere on your courses, because it is quite similar to a normal working process. People who withstand that load have a head start over those who are, roughly speaking, not prepared for such a regime; there is a real difference.

- Many people would object that earning a certificate in an online program is harder: it takes motivation, perhaps even more of it, because nobody pushes you, there is often no one to ask, and you have to work everything out yourself.

- But we are a team. Our goal is not for a person to withdraw into themselves for 4 months, as in a Coursera specialization, say, and work alone; our task is to work as a team. On the program we helped each other: we had chats, we talked, and everyone shared their solutions. That is very similar to the way we work: each of us does the part they have taken responsibility for, but everyone consults with everyone else and communicates constantly. That is teamwork.


- You and Petya Yermakov teach on the "Big Data Specialist" program; other members of your team also teach and speak at conferences. Why is that needed, and what does it give you personally?

- For me personally, speaking is a way to communicate with the community and convey my thoughts to a wide audience. I think this is useful, because we all understand what we do a little differently, and showing some individuality and finding like-minded people is very valuable. As for teaching, it is a fairly new experience for me. What motivates me? I see a social responsibility in it: you learned to do something yourself, now teach someone else. That seems to me exactly how it should be. It is also prompted, of course, by certain problems in the education system, where the people who teach are often not practicing experts in the field. So it seems necessary. And I need it myself, because by sharing my knowledge I push the community and the industry forward, if only a little. Later I will have to work with these people; there is a clear shortage of expertise on the market now, and we need to help people.

- You worked at an American startup for a year and a half and lived in San Francisco. Why not continue building a career there, in the States? Why did you choose to stay here?

- It may sound rather strange now, but I do not draw a big distinction between "there" and "here". Location is not that important to me, and I do not really understand people who say you must go somewhere. When I went to the Valley, I expected some wow effect from the experience and level of the specialists who work there. I did not see it.


That's probably why I love my job so much, it allows me to use my strengths to achieve goals that are valuable both to companies and to me personally.

Source: https://habr.com/ru/post/431124/

