Dear readers, thank you for making our first post, prepared by Andrey Seryogin, Director of the Department of Operation of Convergent Networks and Services at MTS, the most read post in 24 hours. We are very pleased that you found the topic interesting. Thank you also for your many questions and for the topics you suggested for future posts.
When Andrey Vyacheslavovich and I began preparing answers to your questions, we realized that they were extensive enough to fill a whole new post. So we decided to publish them as one.
As a reminder, last time we talked about our center for the operational management of the mobile network in Krasnodar, which we opened in 2012.
So, here are the answers to your questions:
What does a person with 12 monitors do?
A person sitting at 12 monitors is, of course, not doing online monitoring, since it is impossible to watch 12 monitors at once. He works on expert tasks and looks for a solution in several systems simultaneously. That is, once a problem has been registered and taken into work, the operator starts testing hypotheses, logging in to different systems in turn. You could, of course, give him a single screen and have him switch between minimized windows, but a lot of time is lost finding the right window. Having 12 screens is much more convenient.
First-line operators, who work in the umbrella monitoring system, also have several monitors. One monitor displays alarm messages from the main vendors of radio subsystems, switching, VAS platforms and so on. This is the umbrella monitoring system's screen. A second screen is for working with incidents. Mail can be kept open on a third monitor.
How many engineers do you have in one shift?
Shift engineers perform different functions. For example, on the first line, which monitors the radio subsystem, we have 3 people per shift. We also have an expert tier, which also works around the clock. There are shifts for the switching subsystem, and several shifts, split by equipment type, for the transport network. There are shifts for the chief operations duty officer, and so on. So there are enough engineers in each shift.
What questions does Roskomnadzor address to you?
Roskomnadzor mainly contacts us about network operation during various emergencies and exercises, and during the preparation and holding of large events of national importance (forums, summits, etc.). Another topic is the opening of inter-network roaming between operators. How it works: when one operator's network fails, that operator has the right to apply to Roskomnadzor and ask the regulator to request that other operators open roaming. We have worked out a procedure for this, and exercises are held once a quarter. So if one operator's network goes down, its subscribers will not notice anything. They will continue to make calls as if in their own network, but in fact they will be using, for example, our network.

You know about the heavy rains in the Far East right now. So that residents of the affected regions have as few communication problems as possible, roaming has been opened between all mobile operators by order of Roskomnadzor. I am proud to say that at the emergency headquarters meetings our network is noted as the most stable and as having taken on the largest number of other operators' subscribers.
It would be very interesting to hear about monitoring the network for fake base stations (BSs) and suspicious activity in general. What new developments do you have in this area?
I have never come across a fake BS in my practice. In principle, a fake BS cannot appear in the MTS network. From the point of view of providing communication services to subscribers, the practical purpose of such a BS appearing in the network is not clear to me.
Potential fraudsters could perhaps find a use for such a BS, but separate units deal with fighting fraud.
By the way, there are now mini cellular networks for emergencies: a switch, a controller, a base station, plus a telescopic mast with antennas, all in three suitcases. You arrive in an emergency zone with no cellular coverage, deploy your network, create your own SIM cards and register a group of subscribers. These subscribers can then call each other. With a little extra work, you can also provide a way out to the outside world by connecting a link to public networks. But here the purpose of such a network is clear, and the legality of its deployment is ensured by the state.
Around 800-900 incidents are recorded per 12-hour shift. Please tell me, how many of them are real emergencies, and do some of them turn out to be erroneous messages?
All 800-900 are real incidents. The largest share of them is caused by loss of external power supply at a base station. First-category incidents are resolved within 4 hours at most. Overall, even the most minor incident takes no more than 40 working hours to resolve.
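To get a feel for what 800-900 incidents per 12-hour shift means for the people on duty, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that incidents arrive evenly across the shift, which the answer above does not claim; the function names are made up for this sketch.

```python
# Figures quoted in the answer above.
SHIFT_HOURS = 12
INCIDENTS_PER_SHIFT = (800, 900)


def incidents_per_hour(per_shift, shift_hours=SHIFT_HOURS):
    """Average hourly incident rate, assuming an even spread over the shift."""
    return per_shift / shift_hours


def seconds_between_incidents(per_shift, shift_hours=SHIFT_HOURS):
    """Average gap between two incidents, under the same assumption."""
    return shift_hours * 3600 / per_shift


for n in INCIDENTS_PER_SHIFT:
    print(
        f"{n} incidents per shift: "
        f"{incidents_per_hour(n):.1f} per hour, "
        f"one every {seconds_between_incidents(n):.0f} s"
    )
```

In other words, on average a new incident arrives roughly once a minute, which is why prioritization by category matters.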
What tools are used for monitoring? I see Zabbix, but it has probably been customized a bit to fit your needs. Why was it chosen over Nagios or Cacti?
The main equipment is monitored with the umbrella system. We do have some equipment (mainly in transport) that, for economic or technical reasons, is not worth connecting to the "umbrella". If it fails, the faulty link is easily identified by monitoring the rest of the equipment. Nevertheless, alternative monitoring systems are also used, and they are useful.
Why Zabbix? The freely available monitoring tools are similar in functionality, and everyone has their own preference. A particular engineer liked Zabbix; he had worked with it before and knows it well. So it is mostly a matter of taste.
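The remark above, that a failed element outside the "umbrella" can be identified by monitoring the rest of the equipment, can be illustrated with a small sketch. This is not how the MTS systems actually work; it is a minimal illustration, assuming we know which monitored elements are reachable only through each unmonitored one. All names (`suspect_unmonitored`, `microwave-hop-A`, `bs-101` and so on) are hypothetical.

```python
def suspect_unmonitored(topology, unreachable):
    """Return unmonitored elements whose dependants have all gone silent.

    topology:    dict mapping an unmonitored element to the set of monitored
                 elements that are reachable only through it (assumed known).
    unreachable: set of monitored elements currently reporting loss.
    """
    return sorted(
        element
        for element, dependants in topology.items()
        # Suspect the element only if every dependant is down at once.
        if dependants and dependants <= unreachable
    )


# Hypothetical topology: two transport hops not connected to the umbrella.
topology = {
    "microwave-hop-A": {"bs-101", "bs-102"},
    "microwave-hop-B": {"bs-201"},
}

# Both base stations behind hop A are down, the one behind hop B is fine.
print(suspect_unmonitored(topology, {"bs-101", "bs-102"}))
```

If only one of the two base stations behind a hop is down, the hop is not suspected, since the fault is more likely at the base station itself.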
Does the data flow into one data center or several? If one, aren't you afraid of going "blind" if the monitoring center goes offline?
As you know, our mobile network control center is in Krasnodar. This is where the monitoring specialists sit. Their workplaces are virtualized: the servers with the monitoring and incident management software are not only located outside Krasnodar, but are also spread across two geographically separate sites. If for some reason the operators cannot get to their center in Krasnodar, they can sit in any other premises and work over the Internet.
We have a dedicated disaster recovery plan for this case. Operators can move to any hotel or, as a last resort, stay at home, load up their virtual workplace and work. We have even held exercises: during an emergency, people travel to a neighboring region, sit down in one of the classrooms at our branch and work until they can return to their workplaces in Krasnodar.

We also have a control center in Nizhny Novgorod. If the center in Krasnodar stops working, Nizhny Novgorod will partially take over monitoring. Although the specialists in Nizhny work on the fixed network ("fix"), we have trained them to monitor the main elements and major failures on the mobile network. In addition, the regions can look after the main network elements themselves; they have the competence to do so. So in any case the switches, controllers and base stations will remain under control. We will not go "blind".