It is often said today that specialists in “big data” are in demand and even in clear shortage, not only in our country but also in Europe and the United States. Many universities have announced programs that promise to train such specialists. Russia has started this process too, with some delay, but it will be a long time before the country gets professionals in the new specialty. What if “big data” professionals are needed right now? How is this experience built up, and where is it acquired? What tasks have to be solved? We put these questions to Anatoly Korzun, Big Data Solution Architect at Huawei.
Anatoly, you joined Huawei relatively recently as an architect of software solutions for processing “big data”. Where did you gain your experience as a “big data” specialist, and where are such specialists trained?
AK: Yes, I joined Huawei relatively recently, in November of last year. Before that, I worked for many years at Comverse, a telecommunications company, mainly as a solution architect and as head of a development department. That company no longer exists; it was bought by Amdocs.
When did you start working with big data?
AK: I took a break from the telecommunications sector. In particular, in 2012 I worked at a startup that built solutions for online advertising. This is a business that turned to “big data” technology earlier than others. Well-known companies such as Google, Yandex, and Mail.ru work in this area, but there are also many startups in this market. Many online advertising tasks simply cannot be solved without “big data” technology. At that time, I led a development team that processed large data sets using Hadoop. The technology was relatively new, and many problems had to be solved, as they say, by trial and error.
Before becoming a development manager at that startup, did you have any experience with “big data”?
AK: Working on “big data” projects, I found many analogies with what I had worked with before. That is not surprising. Many of the ideas embedded in “big data” processing solutions existed before the term Big Data came into wide use.
What tasks were you invited to Huawei to work on?
AK: Huawei has its own “big data” platform, aimed at large corporate clients, primarily in the telecommunications and financial sectors. This is the FusionInsight “big data” processing platform, on which solutions for specific businesses can be built.
Huawei provides a wide range of solutions, from basic hardware up to high-level business applications, and the “big data” platform is one of the layers in this stack.
As for applications based on FusionInsight, Huawei offers a range of products and solutions for the telecom sector. In my opinion, Huawei's problem is not a lack of solutions but, on the contrary, that there are a lot of them, and they are developed by different departments of our huge company. Their functionality overlaps, which makes it hard to choose a particular product, not only for the customer but also for the seller.
And which products do you personally deal with?
AK: At the Moscow office, we are focused on promoting the Universe product, which runs on FusionInsight and lets a typical telecom operator solve problems related to processing “big data”. At the end of April this year, a FusionInsight cluster with the Universe product installed on top of it was deployed in the Moscow office. We can demonstrate the platform and the Universe product to interested customers.
So the cluster was deployed quite recently. Is there any interest from customers? Have you managed to start any projects?
AK: There are a number of pilot projects for large customers. I cannot name names, but they are two operators from the “Big Three” and the largest Russian bank. One of the telecom projects is to build a solution that identifies M2M devices by analyzing the CDRs (call detail records) generated by network elements and other parts of the operator's telecommunications infrastructure.
And why do they need to be identified?
AK: Various terminal devices can be used in a telecom operator's network. Some of them, such as telephones, smartphones, and tablets, are used by a person, while others, such as a video surveillance camera, can function without human intervention and belong to the class of M2M devices. In other words, there is a class of terminal devices that, like a regular telephone, have a SIM card that lets them attach to the radio network and communicate with a base station, but are not telephones. In principle, operators could offer separate tariffs for such devices, since their traffic structure may differ significantly from that of regular phones. A video camera, for example, does not use voice traffic and only transmits data. The data transmission itself can also have its own specifics. A video camera is just one example; there can be quite a lot of such devices. So the operator has a task: based on the behavior patterns of the devices it serves, determine whether each belongs to the category “phones, smartphones, tablets” or “M2M devices”.
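To make the idea of classifying devices by behavior patterns concrete, here is a minimal sketch in Python. It is not the solution built in the pilot project. The record fields (`sim_id`, `kind`, `cell_id`), the thresholds, and the "few cells" rule are all illustrative assumptions; only the observation that a camera sends data and never makes voice calls comes from the interview.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Cdr:
    """A simplified CDR; all field names are hypothetical."""
    sim_id: str    # identifier of the SIM that produced the event
    kind: str      # assumed event types: "voice", "sms" or "data"
    cell_id: str   # cell that served the event


def classify_devices(records, min_events=20, max_cells=2):
    """Label each SIM as 'M2M', 'handset' or 'unknown'.

    Illustrative rules (assumptions, not the operator's real criteria):
    a SIM with enough history, no voice events at all, and activity
    confined to very few cells is treated as an M2M device.
    """
    stats = defaultdict(lambda: {"events": 0, "voice": 0, "cells": set()})
    for r in records:
        s = stats[r.sim_id]
        s["events"] += 1
        s["voice"] += r.kind == "voice"   # True counts as 1
        s["cells"].add(r.cell_id)

    labels = {}
    for sim_id, s in stats.items():
        if s["events"] < min_events:
            labels[sim_id] = "unknown"    # too little history to decide
        elif s["voice"] == 0 and len(s["cells"]) <= max_cells:
            labels[sim_id] = "M2M"
        else:
            labels[sim_id] = "handset"
    return labels


if __name__ == "__main__":
    demo = [Cdr("cam-1", "data", "A")] * 30
    demo += [Cdr("phone-1", "voice", "A"), Cdr("phone-1", "data", "B")] * 15
    print(classify_devices(demo))
```

In a real deployment the aggregation step would run on the cluster over the full CDR stream, and hand-written rules like these would more likely serve as a baseline or as input features for a trained model.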
Why are the clients you mention turning to Huawei?
AK: I would not say that the Big Three operators chase after vendors. Rather, vendors that have something to offer in this market besiege the telecom operators from all sides. Moreover, the leading telecom operators also have substantial resources of their own.
I am less familiar with the capabilities of Rostelecom and Tele2, but as for the Big Three operators, I can say that they have been involved in “big data” projects for at least two or three years and have both the technical means and the personnel to build solutions in this area.
At the same time, “big data” is not the core business for telecom operators, only a tool, and they are not very willing to talk about their solutions. Even when speaking at numerous conferences, they avoid specifics and, of course, do not disclose their know-how.
Can we say that Huawei's “big data” solutions are addressed primarily to the telecommunications sector?
AK: Huawei offers both a complete telecom solution and a “big data” platform. The “big data” processing platform, like a relational database, can process different kinds of data and be used for different tasks. A specialized solution for telecommunications companies can then run on top of the “big data” platform.
What are the strengths of the Huawei “big data” platform?
AK: It is a platform proven in practice at large Chinese companies. A company that is large by Chinese standards is a truly large company, one that has the means to ensure the reliability, safety, and security of its data. Huawei's customers include such giants as China Unicom, China Merchants Bank, and Industrial Bank of China.
And what does Huawei do for banks in Russia?
AK: I mentioned a pilot project for a leading Russian bank. They are considering moving from standard data storage to data storage based on “big data” technologies. The bank's ODS (Operational Data Store) currently keeps one month of data history. In the foreseeable future, as data volume grows, the ODS will be able to hold only two weeks. Expanding the ODS involves substantial financial costs and runs into fundamental technical constraints. Meanwhile, the target set by the bank's management is to keep data for seven years. So here it is clearly necessary to switch to “big data” technology, which offers almost unlimited scalability (from a technical point of view) at a relatively affordable cost. When designing the architecture of this pilot project, we had to solve a number of technical problems related to integrating traditional relational databases with “big data” processing platforms. The difficulties were, first, that interfacing with the “big data” platform must not affect the existing system and, second, that the customer wants changes in the main system to be reflected in the historical archive in close to real time.
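The gap between what the ODS can hold and the seven-year target can be estimated with simple arithmetic. The sketch below assumes that required storage is just daily data volume times retention period, and uses a 30-day month and a 365-day year; the function name and these simplifications are illustrative, not figures from the bank.

```python
DAYS_PER_MONTH = 30   # assumption: one month of history = 30 days
DAYS_PER_YEAR = 365   # assumption: leap years ignored


def retention_gap(current_days, future_days, target_days):
    """Return (volume_growth, capacity_multiplier).

    With fixed capacity C and daily volume v, the retention window is C / v.
    If the window shrinks from current_days to future_days, the daily volume
    has grown by current_days / future_days. Keeping target_days of history
    at that volume needs target_days / future_days times today's capacity.
    """
    volume_growth = current_days / future_days
    capacity_multiplier = target_days / future_days
    return volume_growth, capacity_multiplier


growth, multiplier = retention_gap(
    current_days=DAYS_PER_MONTH,       # one month today
    future_days=14,                    # two weeks in the foreseeable future
    target_days=7 * DAYS_PER_YEAR,     # seven-year target
)
print(f"daily volume grows about {growth:.2f}x")
print(f"capacity needed: about {multiplier:.1f}x the current ODS")
```

Under these assumptions the target requires on the order of 180 times the current capacity, which illustrates why simply expanding the existing ODS is not seen as a realistic option.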
What do you personally do in “big data” projects?
AK: I would divide this work into three parts. The first is high-level presentations for customers. The second is working with customers to identify their “big data” processing needs, formalizing those needs as technical requirements, and mapping the requirements onto specific architectural solutions based on the FusionInsight platform. The third is prototyping and testing specific solutions in our test lab.
When people talk about the specialists needed to work with “big data”, they often mention the data scientist, which is also described as one of the most attractive and sought-after professions in the world. What is a data scientist? Do you identify with this profession?
AK: A data scientist is a person who works with data and whose task is to extract patterns from a data set. There are also specialists who work at the lower level of the solution stack, that is, who provide the technologies for storing and processing this data. In practice, though, it is quite difficult to draw the line between one layer and the other, and one specialist often has to combine the knowledge needed for both roles.
As for my own work, in some projects I act more as an engineer building a data processing solution, while in others, such as the M2M identification project I mentioned, I also work on finding the patterns in “big data” that make it possible to identify M2M-class devices.
What advice would you give young people today on how to become a “big data” expert and acquire a scarce specialty that lets you do interesting work and that, as analysts predict, will only grow in demand?
AK: From my point of view, anyone with a basic IT education can become an expert in “big data”. Huawei offers such an opportunity. You can join in an entry-level IT position, and the company offers plenty of room for growth: you can take relevant courses and gain experience on “big data” projects. Over time, you can move from one department to another where you can focus more on “big data” processing projects.