How does the MTS Mobile Network Operations Center work?

Hello, dear Habr readers!



This is the MTS corporate blog. Here our communications and technology specialists will write about the most interesting things they work on. We start with Andrey Seryogin, Director of the Department for Operation of Convergent Networks and MTS Services. Over to you, Andrey Vyacheslavovich.



Good day! I have finally gotten around to writing for Habr.



I want to share my experience with a professional and interested audience. Four years ago we were the first in the country to open a Unified Operational Management Center for a mobile network; it is located in Krasnodar. Last November we officially opened the Unified Fixed Network Management Center in Nizhny Novgorod. With that, we completed the centralization of monitoring and management of all our networks, mobile and fixed. Among Russian telecom operators, this is the first centralization on such a scale.



A few words about myself. I graduated from the Faculty of Micro-Instruments and Technical Cybernetics of the Moscow Institute of Electronic Technology. My specialty is radio engineering devices, and I have worked in this field for 27 years, 21 of them at MTS, where I served on an emergency repair crew, worked as a control center operator, and eventually became head of operations. Along the way I have had to get inside transceivers and computers with a soldering iron, program in various languages, climb 70-meter communication towers, and manage a large team, completing a couple of MBA programs in the process. But I have always been proud, and still am, to be a communications engineer, and at any moment I am happy to dive into any technical detail of how our network works.



In this first, introductory post I will explain how we use our unique centers to monitor the networks and resolve outages promptly.



How we monitor network performance



As you have probably gathered, the job of a unified network control center is to monitor the network, coordinate the response to outages, and manage planned work. With two centers, our department is now fully responsible for both the mobile and the fixed network. That said, the distinction between "fixed" and "mobile" has become largely nominal: we call our single network convergent and provide convergent services to subscribers. For example, suppose we need to connect a large corporate client to broadband, and a 4G base station covers their office. In that case we often connect the client through the base station: the office gets an ordinary modem that receives the signal from the 4G station, and the client's internal broadband network sits behind it. So the service is fixed access on the one hand, yet it is delivered over the mobile network on the other. The reverse also happens: when we install a new base station at a site to which we once laid an optical line for fixed access services, we connect the station through that same line.



That is why we now treat our two centers as one, using their geographic separation to improve resilience. It matters to us that the mobile and fixed networks are managed by the same hands rather than by different structures at different levels, as was the case before.



How we solve network problems



The Operational Control Center runs an umbrella monitoring system. Reports of network equipment faults, or of changes in its operating conditions, arrive on the duty officer's screen from all over the country. After reading a message, the duty officer has 15 minutes either to fix the fault personally via remote access to the equipment or to assign the incident to engineers on site. Progress on every assigned incident is strictly tracked.



According to our statistics, the most common cause of problems at base stations is the loss of external power. When external power is cut, the station automatically switches to batteries, which are designed to last several hours. As a rule, external power is restored fairly quickly, so the emergency resolves itself and needs no response. Rushing to react to such messages is therefore not always justified. It is better to wait and assess how long the base station can hold out without external power than to drag the on-call engineer out of a warm bed and send him to the site with a generator. Moreover, our experience has shown that this job can be entrusted to a robot, which decides without error whether a person needs to be involved.
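
A minimal sketch of how such a wait-or-dispatch rule could look. This is an illustration only, not the actual MTS robot: the `PowerAlarm` fields, the `should_dispatch` function, and the 30-minute safety margin are all assumptions introduced here. The idea is to send a crew only when the remaining battery time no longer comfortably covers the trip to the site.

```python
from dataclasses import dataclass


@dataclass
class PowerAlarm:
    """Hypothetical record for a 'lost external power' alarm."""
    station_id: str
    minutes_on_battery: float       # time since external power was lost
    battery_runtime_minutes: float  # estimated total battery runtime (assumed known)


def should_dispatch(alarm: PowerAlarm,
                    travel_minutes: float,
                    safety_margin_minutes: float = 30.0) -> bool:
    """Return True when a crew with a generator should leave now.

    Assumed rule: dispatch once the remaining battery time is no longer
    larger than the travel time plus a safety margin. Until then, keep
    waiting for external power to come back on its own.
    """
    remaining = alarm.battery_runtime_minutes - alarm.minutes_on_battery
    return remaining <= travel_minutes + safety_margin_minutes


# A station 10 minutes into a 4-hour battery, one hour away: keep waiting.
fresh_outage = PowerAlarm("BS-1", minutes_on_battery=10, battery_runtime_minutes=240)
# A station 200 minutes into the same battery: time to send the crew.
long_outage = PowerAlarm("BS-2", minutes_on_battery=200, battery_runtime_minutes=240)

print(should_dispatch(fresh_outage, travel_minutes=60))
print(should_dispatch(long_outage, travel_minutes=60))
```

A real system would presumably also weigh battery age, site priority, and the time of day, none of which are modeled here.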



Often the duty officer has to analyze several alarm messages at once and group them into one. For example, suppose we have built three base stations at a particular location. They are linked to one another by radio-relay lines (two dishes, much like satellite dishes, that face each other and connect the base stations). If one relay link fails and there is no backup path, the whole chain of base stations behind it goes down, and the duty operator receives failure messages from several base stations. It makes no sense to open incidents for every station when only a single span needs repair. Such alarm messages can be collapsed into one automatically, but where that cannot be configured in software, the duty officer has to do it by hand. All of this is called alarm correlation.
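
The collapsing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of the actual monitoring system: the `correlate` function, the `upstream` map, and the station names are hypothetical, and the topology is assumed to be a simple chain with no backup paths and no loops. Each alarmed station is attributed to the most upstream alarmed station in its chain, which is the one place a crew would need to look.

```python
from typing import Dict, List, Set


def correlate(alarmed: Set[str], upstream: Dict[str, str]) -> Dict[str, List[str]]:
    """Group alarmed stations under the most upstream alarmed station.

    `upstream` maps each station to the next station toward the core
    network. A station that is absent from the map is assumed to be
    connected to the core directly.
    """
    groups: Dict[str, List[str]] = {}
    for station in sorted(alarmed):
        root = station
        # Walk toward the core for as long as the next hop is also down.
        while upstream.get(root) in alarmed:
            root = upstream[root]
        groups.setdefault(root, []).append(station)
    return groups


# Three stations in a chain: core <- BS-1 <- BS-2 <- BS-3.
chain = {"BS-2": "BS-1", "BS-3": "BS-2"}

# The relay span between BS-1 and BS-2 fails, so BS-2 and BS-3 both report
# a failure, but only one incident (rooted at BS-2) is needed.
print(correlate({"BS-2", "BS-3"}, chain))
```

In this example both alarms end up in a single group keyed by `BS-2`, so one incident is opened for the failed span instead of two.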



90% of faults are resolved remotely from the control center, since almost all of our infrastructure is computer-based. But if something has to be fixed by hand, such as swapping a board or repairing an antenna, a field team is dispatched. I am also responsible for organizing their work. Each region must have two or three teams of its own so that we can reach a site quickly. A team's operating radius can reach 400 km.



Are there many alarm messages per day?



Thousands of alarm messages reach the duty officer's screen every day, and on the order of 800-900 incidents are logged per 12-hour shift. Four years ago the operator received a message every five seconds, and 1,200 incidents were logged per shift. Since then we have automated many things, improved message analysis, and partially automated the creation of incidents. This lets us supervise a much larger amount of equipment.
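
To put those figures in perspective, here is a back-of-the-envelope calculation. It uses only the numbers quoted above and assumes, purely for illustration, that the five-second message rate held uniformly across the whole shift; the function name is hypothetical.

```python
def messages_per_shift(shift_hours: float, seconds_between_messages: float) -> float:
    """Messages seen in one shift at a constant arrival rate (an assumption)."""
    return shift_hours * 3600 / seconds_between_messages


# Four years ago: one message every five seconds over a 12-hour shift.
old_messages = messages_per_shift(12, 5)
old_incidents = 1200

print(old_messages)                  # messages per shift
print(old_messages / old_incidents)  # messages per logged incident
print(old_incidents / 12)            # incidents per hour back then
print(800 / 12, 900 / 12)            # incidents per hour now
```

Under that assumption the operator saw 8,640 messages per shift, or roughly seven messages for every incident logged, which is the kind of load that automated analysis and correlation are meant to absorb.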



Do emergency situations affect clients?



The impact of an incident on customer service is hard to assess directly, and here is why. In Moscow, base stations stand almost every 200 meters. If you put a phone into monitoring mode, you can see that at any given spot it picks up the signal of several base stations at once. If one base station goes down, you can still make calls, send SMS, and use the Internet without trouble. For us it is still an incident, and we resolve it promptly. After all, in theory a customer could be in an office with shielded windows that only the one nearest station manages to reach (the signal can be attenuated a thousandfold on its way into such an "iron box").



But if you look at the total traffic of a network segment when one base station goes down, it does not change: it is redistributed among the other stations. Calling this a disruption of customer service is incorrect, to say the least. I do not know of a single company in the world that could rigorously calculate how much the service suffered in such a situation.



That, in general terms, is what I do professionally. What would you like me to talk about in my next posts?

Source: https://habr.com/ru/post/308044/


