On recording the conversations of cellular subscribers

We have all heard that cell phone conversations can be recorded... Here are some calculations and thoughts on the subject.

Is it realistic to build a system that records and stores all the conversations of all subscribers of a particular operator?

Standard GSM network equipment allows an operator to record and listen to the telephone conversations of a specific subscriber, or of several dozen subscribers. From this fact people very often draw a far-reaching conclusion: that operators use similar functionality to record and store all the conversations of all their subscribers for some period of time. If a subscriber then attracts the special attention of the security services, the operators can supposedly hand over, on request, recordings of conversations that are almost six months old.

Is that realistic?

For the calculation, we can take publicly available data on call volumes in the network of one of the largest Russian operators. This operator serves 51.5 million subscribers, who on average use 134 minutes of voice traffic per month (most likely this figure covers outgoing calls only).

Thus, the total duration of calls of all subscribers for one month will be:

51.50 million x 134 min = 6.9 billion min

After processing and digitization in the mobile phone, the voice signal is transmitted over the GSM network as a digital stream at a rate of 9.6 Kbps. Thus, without additional processing, all subscriber calls for one month add up to a fairly significant amount of information:

6.9 billion min x 60 s/min x 9.6 Kbit/s / (8 bit/byte) ≈ 5 x 10^14 bytes ≈ 500 TB
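
The estimate above can be reproduced with a few lines of Python. This is only a sketch: the function name `monthly_voice_bytes` is made up for illustration, and decimal units are assumed (1 Kbit = 1000 bits, 1 TB = 10^12 bytes).

```python
def monthly_voice_bytes(subscribers, minutes_per_subscriber, bitrate_bps):
    """Raw size in bytes of one month of voice traffic, with no extra compression."""
    total_seconds = subscribers * minutes_per_subscriber * 60
    return total_seconds * bitrate_bps / 8  # 8 bits per byte


# Figures from the text: 51.5 million subscribers, 134 min/month, 9.6 Kbit/s.
volume = monthly_voice_bytes(51_500_000, 134, 9_600)
print(f"{volume / 1e12:.0f} TB per month")
```

With these assumptions the result comes out slightly under 500 TB, which is the rounded figure used in the rest of the article.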

Now you can think about how difficult (and costly) it will be to build and maintain a system that:

1. Records outgoing and incoming calls of all subscribers within a particular network;
2. Does so quickly enough;
3. Allows you to easily find any conversation (by subscriber ID and date, for example);
4. Allows you to store information for at least three months;
5. Works with switching equipment of any supplier;
6. Does not create a significant additional load on signal and voice channels, switch processors and other network equipment, does not affect the ability of the network to handle calls;
7. Is reliable enough: for example, it records at least 99% of conversations;
8. Is secure enough: it does not allow confidential information to leak to people who have no special permission (a prosecutor's warrant, for example).

What follows from this:

1. All conversations are recorded => The system must cope with the fact that subscribers are mobile. That is, it must take into account that during a call subscribers move between base stations, may even move into the service area of another switch, may forward calls, merge them into conferences, answer several calls at once, and so on;
2. The system works quickly enough => Any post-processing of the voice performed before it is written to storage must keep up "on the fly";
3. Recordings can be found => The system must have an index / search mechanism and be integrated with the system that stores basic subscriber data, so that changes of phone numbers and SIM cards are taken into account. Information about completed calls must appear in the system without significant delay;
4. Recordings are stored for three months => Based on the calculation above, at least 1500 TB of disk space is needed (if no post-processing of the voice signal is used);
5. Compatibility with equipment from different manufacturers => A well-specified interface is required, one that is guaranteed to be available from all leading manufacturers of switching equipment;
6. No significant extra load => Call processors in switches and channel equipment cannot be loaded with tasks foreign to them, such as compressing voice to MP3;
7. Reliability => Redundancy is required for communication channels, power supplies and hard drives;
8. Security => A centralized mechanism for managing and controlling access to the recorded information is required.
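
To make point 3 more concrete, here is a toy sketch of such an index. Everything in it is hypothetical: the class `CallIndex`, its method names, the phone numbers and the storage locations are invented for illustration. A real system would also have to track when each number belonged to which subscriber, which this sketch ignores.

```python
from collections import defaultdict
from datetime import date


class CallIndex:
    """Toy in-memory index: (subscriber ID, call date) -> recording locations."""

    def __init__(self):
        self._by_subscriber_day = defaultdict(list)
        self._number_to_subscriber = {}

    def register_number(self, phone_number, subscriber_id):
        # A subscriber may change numbers or SIM cards; every number maps
        # back to the same stable subscriber ID.
        self._number_to_subscriber[phone_number] = subscriber_id

    def add_call(self, phone_number, call_date, location):
        subscriber_id = self._number_to_subscriber[phone_number]
        self._by_subscriber_day[(subscriber_id, call_date)].append(location)

    def find(self, subscriber_id, call_date):
        # .get() avoids creating empty entries for unknown keys.
        return list(self._by_subscriber_day.get((subscriber_id, call_date), []))


index = CallIndex()
index.register_number("+70000000001", "sub-1")
index.register_number("+70000000002", "sub-1")  # same person, new SIM card
index.add_call("+70000000001", date(2011, 11, 1), "store-a/rec-001")
index.add_call("+70000000002", date(2011, 11, 1), "store-b/rec-002")
print(index.find("sub-1", date(2011, 11, 1)))
```

Looking up by a stable subscriber ID rather than by phone number is what lets a search survive a change of number or SIM card.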

To be aware of subscriber movements and respond to them properly, the system must be connected to the switches (and not, say, to base station controllers), since it is the switches that are responsible for all functionality related to subscriber mobility. A network capable of serving 50 million subscribers will have about 50 mid-sized switches.

Now let's see whether the system can be made distributed by placing next to each switch (MSC) not just a store for the initial accumulation of data, but something more intelligent: for example, a node that satisfies all the stated requirements at once.

First, let us consider several possible scenarios for handling a call from subscriber A to subscriber B.

Scenario 1: Throughout the call, subscriber A is in the service area of switch X, and subscriber B is in the service area of switch Y. In this case the conversation can be recorded on either switch, at our choice. If the recording is made on both switches, we get two identical recordings, and before they go to central storage one of them can (and should) be discarded.

Scenario 2: Throughout the call, subscriber A is in the service area of switch X, and subscriber B is in the service area of switch Y, but the call passes through an intermediate switch Z. This scenario is very similar to the previous one, except that there will be three identical copies of the recording.

Scenario 3: At the beginning of the conversation, subscriber A is in the service area of switch X and subscriber B is in the service area of switch Y, and during the conversation both move: subscriber A moves into the service area of switch T, and subscriber B moves into the service area of switch S. In this case a complete recording of the conversation has to be assembled from parts. There will be four parts in total, and a complete recording can be assembled from them in four possible ways (enough material for two complete copies).
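
The four ways of assembling a complete recording in Scenario 3 can be enumerated directly. The sketch below assumes, purely for simplicity, that both subscribers change switches at the same moment, so the call splits into exactly two phases; the variable names are illustrative.

```python
from itertools import product

# Each phase of the call was recorded by the switch serving A and by the
# switch serving B, so every phase has two interchangeable copies.
phases = [
    ("X", "Y"),  # before the move: A on switch X, B on switch Y
    ("T", "S"),  # after the move: A on switch T, B on switch S
]

# One part per phase, taken from either switch, gives a complete recording.
assemblies = list(product(*phases))
for parts in assemblies:
    print(" + ".join(parts))
```

If the subscribers moved at different moments, or if intermediate switches were involved, the number of phases and of copies per phase would grow, and with them the number of combinations.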

If we also take into account that during a call both subscribers can use conference calling and call hold, and that there can be more than one intermediate switch (and they can change as the subscribers move), it becomes clear that, in the general case, assembling complete recordings of conversations means solving the problem of correlating data from different sources. Similar work is done by interconnect billing systems (used for settlements between operators), whose architecture can be taken as a starting point for designing our hypothetical global listening system. A choice has to be made between two options:

* Either collect all the accumulated data in a common centralized storage and process it there. This simplifies processing and maintenance, but requires significant computing power in the processing center;
* Or collect in the central repository only the call metadata (who called whom, and when) and correlate it to work out which parts of a call, from which switches, together make up the complete recording. The recordings themselves can be stored in a distributed fashion. The correlation results are then used to extract the parts of a particular conversation from the distributed storage. This approach reduces the computing requirements at each individual node of the system, but significantly increases its complexity and makes it hard to maintain and “keep afloat”.
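
The metadata correlation described in the second option might look roughly like the sketch below. It is a simplified illustration, not a description of any real billing or interception product: the `Segment` record, its fields and the `correlate` function are assumptions, timestamps are taken to match exactly at handover, and conferences, forwarding and call hold are ignored.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    caller: str
    callee: str
    start: int   # seconds since some epoch
    end: int
    switch: str  # switch whose local store holds the audio


def correlate(segments):
    """Return a list of (pair, [switches]) chains, one per reconstructed call."""
    by_pair = defaultdict(list)
    for seg in segments:
        by_pair[(seg.caller, seg.callee)].append(seg)

    calls = []
    for pair, segs in by_pair.items():
        segs.sort(key=lambda s: (s.start, -s.end))
        chain, covered_until = [], None
        for seg in segs:
            if covered_until is None or seg.start > covered_until:
                # Gap in coverage: treat this as the start of a new call.
                if chain:
                    calls.append((pair, chain))
                chain, covered_until = [seg.switch], seg.end
            elif seg.end > covered_until:
                # Segment extends what we already have: add it to the chain.
                chain.append(seg.switch)
                covered_until = seg.end
            # Otherwise it duplicates audio that is already covered.
        if chain:
            calls.append((pair, chain))
    return calls


# Scenario 3: two copies of each half of the call, from four switches.
segments = [
    Segment("A", "B", 0, 100, "X"), Segment("A", "B", 0, 100, "Y"),
    Segment("A", "B", 100, 250, "T"), Segment("A", "B", 100, 250, "S"),
]
print(correlate(segments))
```

Only the small metadata records travel to the center here; the chain of switch names tells the system where to fetch the audio parts when a particular conversation is requested.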

For further calculations, we assume that from the point of view of security and ease of maintenance, it is better to store and process information in one central place. But first, the data needs to be delivered there.

For simplicity, assume that the load is distributed evenly across the switches and over time. Then the 500 TB of conversations are divided among 50 switches, so each accounts for 10 TB of voice traffic per month. To carry that much data, you need a channel with the following bandwidth:

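10 * 1024^4 bytes / (3600 s * 24 hours * 30 days)
= 4,241,943 bytes per second
4,241,943 bytes per second * 8 bits per byte
≈ 33.9 million bits per second
≈ 32 megabits per second (counting 1 megabit as 1024^2 bits)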
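
The same bandwidth estimate as a short Python sketch. The function name `required_bitrate` is illustrative, and binary units are assumed (1 TB = 1024^4 bytes, 1 megabit = 1024^2 bits), which is what the figures above imply.

```python
SECONDS_PER_MONTH = 3600 * 24 * 30


def required_bitrate(bytes_per_month):
    """Average bit rate (bit/s) needed to move a month of data continuously."""
    return bytes_per_month * 8 / SECONDS_PER_MONTH


per_switch = required_bitrate(10 * 1024**4)  # 10 TB per switch per month
print(f"{per_switch / 1024**2:.1f} Mbit/s per switch")
```

This is an average rate under the even-load assumption; real traffic peaks during the day, so an actual channel would need headroom above this figure.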

So we add 50 such channels to the estimate.

Further, in order to ensure proper quality of data storage, you need to have spare media in the amount of at least 5% of the used media to replace the failed ones.

How many hard drives will be needed? Between 3000 and 6000 drives of 500 GB each (the upper figure applies if we write everything without preprocessing and store each conversation twice, once for the caller and once for the receiver). Accordingly, the spare stock is another 150-300 of the same drives per year.
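
The drive count can be checked with a small helper. This is a sketch under stated assumptions: the function `drives_needed` and its parameters are hypothetical, 1 TB is taken as 1000 GB, and no allowance is made for RAID or filesystem overhead.

```python
import math


def drives_needed(total_tb, drive_gb, copies=1, spare_percent=5):
    """Return (drives in service, spare drives per year) for the given volume."""
    in_service = math.ceil(total_tb * 1000 * copies / drive_gb)
    spares = math.ceil(in_service * spare_percent / 100)
    return in_service, spares


print(drives_needed(1500, 500))            # each conversation stored once
print(drives_needed(1500, 500, copies=2))  # one copy per party to the call
```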

In addition, this many hard drives must be combined into an indexed storage system with some kind of user access interface. The storage must sustain uninterrupted recording of conversations from all switches (50 channels x 32 megabits per second = 1600 megabits per second in total), update the search indexes on the fly, and serve requests to search for and retrieve recorded conversations.

We will not delve into the details of a possible architecture for such a repository. Let us briefly list everything we have counted so far:

* Media: 3000-6000 hard drives of 500 GB each, or roughly three times fewer with 8 kbps MP3 voice compression. Since the GSM codec already compresses the voice, greater savings cannot be achieved. Naturally, in place of the “saved” hard drives you need to add processors to perform the compression;
* Infrastructure for 50 communication channels of 32 Mbit/s each;
* Servers that provide the interface to the repository and its operation (indexing, searching for the necessary records, searching for and deleting old records, integration with the subscriber management system);
* Power and climate control for all equipment;
* Space in server rooms for the equipment;
* Infrastructure for operating the entire system (on-duty staff, warehouses, logistics ...)

It is clear that all this is technically feasible. The only question is economic feasibility.

What self-respecting operator will pay for all this "luxury" simply because it was strongly urged to, given that such an investment brings no return? As far as I know, no law anywhere obliges operators to provide such a “service”, so it can only be a “persistent request” from the state or law enforcement agencies, and nothing more.

Which operator has enough in-house technical expertise to build and support a solution of this scale using only its own staff? If that seems simple and cost-effective to you, consider why large telecom operators do not build all of their own billing, financial and ERP systems.

Where are the suppliers and manufacturers of ready-made solutions of this kind for operators that cannot develop such a system themselves? After all, information about the listening systems that actually exist is no closely guarded secret: to see this, it is enough to search the Internet for the keywords “SORM” or “lawful interception”.

Having answered these questions for yourself, you can decide on your own whether it is possible to build a system that records all mobile phone conversations.

Source: https://habr.com/ru/post/132272/
