Mythical PCM Capture Extraction Tool: extract sound without contacting TAC

The resource pcet.cisco.com, unavailable to mere mortals

Anyone who needs to capture PCM data on Cisco ISR routers will easily find an exhaustive step-by-step guide on the topic, but will inevitably stumble over the item “the resulting file should be sent to TAC for sound extraction”. Is it possible to do without TAC?

Background

The customer’s PBX had to be connected both to a well-known SIP operator (public IP address) and to the provider’s SIP server that handled local calls (private IP address). Unfortunately, PBXs of this model cannot connect simultaneously to SIP servers behind NAT and to servers on the private network, because NAT traversal by address substitution is applied to all SIP trunks without exception. The customer used a Cisco 2811 as a router. Since the PBX had a free ISDN PRI card, it was decided to buy an E1 card and a voice processor for the router. The appropriate settings were made (the G.711 A-law codec was used everywhere), and it all worked.

Problem

The situation is a familiar one: everything worked fine, and then one day it stopped. The customer’s employees began to complain about a certain signal during conversations, "as if someone had pressed a button on the phone." The signal was quickly identified as a DTMF tone, which could sound at the beginning, in the middle, or at the end of a conversation, or even several times during one. It remained to find out who (or what) was to blame and how to fix it.
It should be noted that the equipment in use provides an impressive set of testing and debugging tools. In addition to the full range of Cisco debugging capabilities, there is a very useful real-time ISDN tracing tool that is part of the PBX management console. Unfortunately, none of this helped here: it proved impossible to "catch" the signal, that is, to match it against debugging information, because it appeared so irregularly. Studying everything available on the subject led to a number of documents on PCM capture on Cisco equipment.

Reportedly, 2900, 3900, and 3900e series routers and VG350 gateways make this quick and easy, and the result is a file containing voice data suitable for direct import into audio editors.

Unfortunately, the customer’s 2800 series router did not have the honor of belonging to the noble families listed and had no such capability. In IOS versions earlier than 15.2(2)T1 (and no such version will ever be released for the customer’s router, because it is End of Support), capturing PCM requires a series of commands. If the event can be anticipated, you can wait for it and make a recording:

 voice hpi capture buffer 51200000
 voice hpi capture destination flash:pcm.dat
 test voice port x/x/x pcm-dump caplog fff duration xxx

Or you can record everything until the event of interest occurs (be careful, the flash memory may overflow!):

 dial-peer voice x voip/pots
  pcm-dump caplog fff duration xxx

This creates a file with voice and debug information that can be extracted from flash memory and copied to a TFTP server.

Another problem, in some ways worse than the first

This is where the fun began. Numerous users puzzled by PCM capture vented their indignation on the technical support forums: I made the dump, imported it into an audio editor this way and that, and all I get is garbage! What am I doing wrong? Others replied that this is completely normal: only the mighty TAC specialists can extract the sound from the file, using the PCM Capture Extraction Tool available at pcet.cisco.com. I doubt I am the only one who, after watching a wonderful training video on YouTube, tried to follow this link and was cruelly humiliated, because it is inaccessible to mere mortals. Allegedly, the need to involve TAC stems not from technical but from legal considerations (I wonder how ISR G2 differs from ISR in legal terms?): such decoding could, from a certain point of view, be considered illegal eavesdropping; besides, the decoder changes too often, the version of the DSP in use is hard to determine, and so on. I managed to find an opinion expressed, presumably, by a non-Russian-speaking TAC engineer: we do not think about the mechanism, the file is simply uploaded to the system through a form on the page. There are some restrictions on file name and size, and the result is .WAV files plus some debugging information. The tool does not exist as a distributable package, and in any case the main difficulty is obtaining the .DAT files themselves, so you, dear customers, get these files and send them to us, and we will extract everything from them, needed or not.

However, involving TAC engineers was impossible: remember that the equipment is End of Support? Accordingly, there are no service contracts and no support. True, the TAC engineers did cheer me up by saying that my wish to solve the problem on my own, that is, to work out the structure of the file and extract the information stored in it, violates nothing and, if successful, will not cost me support for life. Thanks for that, as they say.

I started by opening one of the captured files in a hex editor. What I saw was not very encouraging, although there were familiar strings: “C2800NM-ADVENTERPRISEK9-M, 15.1 (4) M10, IP | SLA | IPv6 | IS-IS | FIREWALL | VOICE | PLUS | QoS | HA | NAT | MPLS | VPN | LE” and “28.3.14”, the latter being the DSPWARE VERSION value from the output of the show voice dsp detail command. In addition, a certain structure and periodicity were noticeable. At this point I decided to go back to the test voice port command:

 C2811#test voice port 0/3/0:15.1 pcm-dump ?
   caplog   Print to caplog, please enable banjo logger
   console  Print to console, possible flood console
   disable  Disable the message dump

In all the descriptions I found, attention is paid exclusively to the caplog parameter. Why is the honorable console so unjustly ignored? Let us correct that:

 C2811#test voice port 0/3/0:15.1 pcm-dump console ?
   <1-7>  PCM stream index. Bit0:R_in=0x01 Bit1:S_in=0x02 Bit2:S_out=0x04

Everything is more or less clear. The next parameter encodes which PCM streams are to be captured. The R_in stream contains audio sent from VoIP to the PSTN. The S_in stream contains audio sent from the PSTN to VoIP BEFORE DSP processing. The S_out stream contains audio sent from the PSTN to VoIP AFTER DSP processing. The stream values can be added together to get any combination: you can capture any one stream, any two, or all three at once. I see no reason to capture anything other than all three streams at once, except when the goal is to understand the structure of the data arriving at the console. You can also start an open-ended capture (do not forget that there must be enough free space on the flash card) and stop it manually:
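The stream index is a plain bit mask, which a few lines of Python can illustrate. This is only a sketch: the bit values are taken from the CLI help text above, while the function name decode_stream_index is an illustrative assumption and not part of any Cisco tool.

```python
# Bit values as printed by the CLI help: Bit0:R_in=0x01 Bit1:S_in=0x02 Bit2:S_out=0x04.
STREAM_BITS = {0x01: "R_in", 0x02: "S_in", 0x04: "S_out"}


def decode_stream_index(index: int) -> list:
    """Return the names of the PCM streams selected by a stream index (1-7)."""
    if not 1 <= index <= 7:
        raise ValueError("stream index must be in the range 1-7")
    # Keep every stream whose bit is set in the index.
    return [name for bit, name in STREAM_BITS.items() if index & bit]


for index in (1, 2, 4, 7):
    print(index, decode_stream_index(index))
```

For example, index 5 (0x01 + 0x04) selects R_in and S_out, and index 7 selects all three streams.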

 test voice port 0/3/0:15.1 pcm-dump console 1

[conversation]

 test voice port 0/3/0:15.1 pcm-dump disable

You can immediately specify the required duration in seconds:

 C2811#test voice port 0/3/0:15.1 pcm-dump console 1 duration ?
   <0-255>  capturing time in sec

A capture of all three streams on the console looks like this:

 047581: Jan 19 11:52:15.491: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 047582: Jan 19 11:52:15.495: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 00 00 07 00 09 00 0E 00 0E 00 0D 00 0D 00 03 FF FC 00 00 00 04 00 02 00 00 FF F9 00 03 00 0B 00 04 00 0B 00 0C 00 01 00 00 00 01 FF FE 00 06 00 02 00 03 00 07 00 06 00 00 00 05 FF FD 00 02 00 07 00 03 00 04 00 08 FF FF 00 02 00 04 FF FF FF F8 FF F5 FF F3 FF FA FF F8 FF F1 FF F1 FF F2 FF F4 FF EF FF EF FF EE FF F0 FF F8 FF FB FF F4 FF F3 FF FA FF F4 FF F2 FF F8 FF FF 00 04 FF FB FF F5 FF F1 FF FC FF FD FF FE FF F6 FF F3 FF F2 FF ED FF EA FF EB FF F7 FF F5 FF F2 FF F2 FF F6 FF FC 047583: Jan 19 11:52:15.499: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08

[and many more similar messages]

Capturing a single stream:

 027761: Jan 15 00:59:53.549: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00

[and many more similar messages]

Run the following commands:

 test voice port 0/3/0:15.1 pcm-dump con 1 duration 10
 test voice port 0/3/0:15.1 pcm-dump dis
 test voice port 0/3/0:15.1 pcm-dump con 2 duration 10
 test voice port 0/3/0:15.1 pcm-dump dis
 test voice port 0/3/0:15.1 pcm-dump con 4 duration 10
 test voice port 0/3/0:15.1 pcm-dump dis
 test voice port 0/3/0:15.1 pcm-dump con 7 duration 10
 test voice port 0/3/0:15.1 pcm-dump dis

Let us return to the structure of the received text messages. File for R_in:

 047648: Jan 19 12:10:48.525: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 00 00 18 00 28 00 28 00 18 00 28 00 28 00 18 00 28 00 18 00 18 00 18 00 18 00 08 00 18 00 18 00 08 00 18 00 08 FF F8 00 08 FF F8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF D8 FF D8 FF E8 FF E8 FF E8 FF E8 FF D8 FF E8 FF E8 FF E8 FF E8 FF E8 FF F8 00 08 00 08 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 00 08 00 08 FF F8 00 08 00 08 00 08 00 08 00 18 00 08 00 18 00 18 00 18 00 18 00 18 00 18 00 08 00 18 00 18 00 18 00 18 00 08 00 08 00 08 00 08 00 18 00 18 FF F8 00 08

[...]

 048647: Jan 19 12:10:58.517: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 00 00 08 00 08 FF E8 FF F8 FF E8 FF F8 FF F8 FF E8 FF E8 FF E8 FF E8 FF F8 FF E8 00 08 FF F8 FF F8 FF F8 FF F8 00 18 FF F8 00 08 00 08 FF F8 FF F8 FF E8 FF F8 FF E8 FF F8 FF E8 FF F8 FF F8 FF F8 FF F8 FF E8 FF D8 FF F8 FF E8 FF E8 FF E8 FF E8 FF F8 FF F8 FF D8 FF E8 FF E8 FF F8 FF E8 FF F8 FF F8 FF E8 FF F8 FF F8 FF E8 FF E8 FF F8 FF E8 FF E8 00 08 FF F8 FF F8 FF F8 FF F8 FF F8 FF E8 FF E8 FF F8 FF E8 00 08 FF F8 00 18 00 08 00 08 FF E8 00 08 00 08 00 08 00 08 00 18 00 08 00 08

The six-digit number before the date is a sequential message number; in total there are exactly one thousand messages, numbered 047648 through 048647 inclusive.

File for S_in:

 050234: Jan 19 12:12:58.636: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01 FF E8 FD 90 FE F8 FF B8 02 B0 01 C8 FF 48 03 D0 FE 98 08 40 FD F0 FD 70 FF D8 FD D0 00 C8 FC D0 04 E0 FA E0 02 10 04 A0 F9 E0 02 90 FD F0 FE A8 00 B8 02 50 02 70 FE E8 02 90 FD 90 00 F8 04 60 FA 60 00 A8 FC D0 02 30 01 E8 FD D0 03 50 FA 20 FF D8 01 78 FF 28 03 90 01 08 FB E0 01 78 FF 98 FE C8 FF A8 FF 38 06 A0 FA A0 00 A8 04 A0 FD 10 00 E8 FE 08 00 B8 FE D8 FD B0 01 98 01 78 01 D8 00 38 FF C8 FF 08 00 28 01 58 FE B8 01 D8 FE E8 FE 08 01 C8 FF 68 FF A8 00 08 00 38 FF 28 FE 48

[...]

 051233: Jan 19 12:13:08.624: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 18 00 08 00 08 00 08 00 18 00 18 00 18 00 18 00 18 00 18 00 08 00 08 FF F8 FF F8 FF F8 FF E8 FF E8 FF E8 FF D8 FF E8 FF D8 FF D8 FF D8 FF E8 FF E8 FF E8 FF F8 FF F8 00 08 00 18 00 08 00 08 00 18 00 18 00 18 00 18 00 18 00 08 00 08 00 08 00 08 00 08 FF F8 00 08 FF F8 00 08 00 08 00 08 FF F8 FF F8 FF F8 FF F8 00 08 00 08 00 08 00 08 FF F8 00 08 00 08 00 08 00 18 00 08 00 18 00 18 FF F8 FF F8 FF F8 FF F8 FF F8 FF E8 FF E8

Again, exactly one thousand messages.

File for S_out:

 051234: Jan 19 12:14:00.396: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 02 00 06 FF F6 00 76 00 F6 00 36 FF C6 00 68 FF F8 FF 88 00 78 FF A8 FE B8 00 18 00 98 FF 98 FF C8 01 38 FF 98 FF 18 01 58 00 58 FE F6 FF A8 01 A8 00 E8 00 38 00 18 FF 08 FE 56 FF 06 00 A6 00 86 00 E6 FF 86 FF 58 FF D8 00 18 00 86 00 26 00 66 FF F6 FF E6 FF 56 00 56 00 76 FE F6 FF E6 00 16 FF C6 FF B6 FF C6 00 86 FF B6 FF 76 FF F6 00 A6 00 36 FF E6 00 88 00 08 00 38 FF E8 FF F6 00 18 00 3A 00 8A 00 2A FF BA FF 5A FF D8 00 18 00 58 00 5A FF 7A FF AA FF E8 00 18 00 18 FF C8 FF F8

[...]

 052233: Jan 19 12:14:10.388: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 02 00 C0 01 8C FD 52 01 00 01 C2 FE E8 FE BE 01 3E 01 60 FF 4E FE 56 FE 44 02 0E 02 14 FF E8 FF 46 FF B4 FE F6 01 64 00 C2 FC D2 03 84 00 84 FC BC FD E8 01 5E 04 66 FD B6 FE 62 00 60 FF C6 01 6E 00 50 FE AA FE AC 01 3C 02 2C FC E8 00 16 01 3E 00 38 00 EC FF 26 FF 40 00 26 01 30 00 22 00 20 FF 74 FE 9C 01 A8 01 16 FF A8 FD EE FF 84 01 E0 FE 9C 00 64 FF A4 00 7A 01 96 FE BC FF 44 00 86 FF E0 FE F8 01 8A FF 22 FF 00 02 12 00 5A FE 70 00 18 00 B0 FE D6 01 B6 FE 1A FF E6 02 6E FE 22

Exactly one thousand messages.

File for all streams:

 052234: Jan 19 12:15:02.576: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 00 00 08 00 38 00 78 00 B8 00 18 00 18 FF F8 FF C8 FF F8 FF E8 FF B8 FF 58 00 08 FF 98 FF A8 00 58 00 48 00 08 FF 88 FF C8 00 28 00 48 00 68 00 98 00 68 00 A8 00 B8 00 18 FF 98 FF 68 FF C8 FF A8 FF 88 FF 78 FF 58 00 38 FF C8 FF 78 00 48 00 48 00 38 00 88 01 78 00 78 FF D8 00 38 FF A8 FF 78 FF E8 00 28 FF A8 FF 98 00 88 00 A8 00 18 FF 88 FF E8 00 78 FF E8 FF B8 00 58 00 58 FF 48 FF C8 00 48 00 08 FF C8 00 38 00 78 FF D8 00 28 00 68 00 18 FF B8 FF 78 FF C8 00 68 00 38 FF D8 FF A8

[...]

 055211: Jan 19 12:15:12.516: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01 FF A8 01 28 FE 28 00 18 FF 78 FE E8 01 A8 00 08 02 10 00 48 FE C8 00 28 00 98 00 38 FF 68 FF 18 FE C8 00 A8 FF 68 01 98 00 08 FF C8 01 18 00 B8 FF E8 FF 08 FF F8 FE 18 00 28 FF 68 FE F8 00 E8 00 78 00 F8 00 78 FE E8 00 78 FE D8 00 48 01 A8 00 E8 00 B8 FF F8 FF C8 FF 58 FF 08 00 C8 FF B8 FF 58 01 D8 FE 78 00 08 FF 38 FE 28 FF E8 00 C8 FF 88 01 48 00 D8 00 68 00 08 FF 68 FF D8 00 68 FF B8 FF 58 00 78 00 08 00 68 FF A8 01 18 FE 58 00 18 00 28 FF E8 00 F8 00 28 01 08 00 B8 FE E8
 055212: Jan 19 12:15:12.520: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 02 FF AC 01
 055234: Jan 19 12:15:32.851: %ISDN-6-DISCONNECT: Interface Serial0/3/0:0 disconnected from 2606699 , call lasted 372 seconds

Here the picture is somewhat different. Obviously, three streams should produce three times as many messages as one. However, the full log taken as an example contains only 2978 complete messages, plus one cut off at the very end. Some sources indicate that when diagnostic messages reach the buffer faster than the router can write them to the log, some messages may be lost in whole or in part.

It can be seen that within each stream every message begins with the same byte sequence (R_in: 00 07 00 00, S_in: 00 07 00 01, S_out: 00 07 00 02; the file for all streams contains messages with all of these sequences).

Evidently, these four bytes are a kind of service header that can safely be ignored when analyzing the audio, since it merely indicates the stream the message belongs to. This assumption agrees well with the fact that a ten-second interval produces a thousand log messages carrying 160 bytes of payload each, that is, 160,000 bytes. A sampling rate of 8000 Hz, a duration of 10 seconds, and a sample depth of 16 bits for a single (mono) channel give exactly this amount of data.
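The arithmetic behind this check is easy to reproduce. The sketch below simply restates the numbers from the text; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000    # sampling frequency
BYTES_PER_SAMPLE = 2     # 16-bit samples, one (mono) channel
DURATION_S = 10          # capture duration requested on the command line
MESSAGES = 1000          # messages counted in the single-stream log
PAYLOAD_BYTES = 160      # data bytes per message after the 4-byte header

# Bytes that 10 seconds of 16-bit mono audio at 8000 Hz should occupy.
expected = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * DURATION_S
# Bytes actually carried by the log messages.
observed = MESSAGES * PAYLOAD_BYTES
# Audio time covered by one message, in milliseconds.
ms_per_message = 1000 * PAYLOAD_BYTES / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

print(expected, observed, ms_per_message)
```

Both totals come out to 160,000 bytes, and each message covers 10 ms of audio, which fits the roughly ten-second span between the first and last timestamps in the single-stream logs.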

First rough processing

Using a text editor that supports regular expressions, the expression (......:).*(<== Payload: 00 07 00 01) removes the prefix of every line (for example, 050234: Jan 19 12:12:58.636: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01) from the corresponding file. There are exactly 1000 replacements. After converting the ASCII data to binary in a hex editor (pasting the clipboard contents into the file as ASCII hex), we get a file of exactly 160000 bytes.
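The same cleanup can be scripted instead of being done by hand in an editor. The following Python sketch is not the author's code. It assumes the console output was saved to a text file in the format shown in the samples above; the names extract_stream and convert_log, and any file paths passed to them, are hypothetical.

```python
import re

# One console message: the "00 07" marker, the stream id (00, 01 or 02),
# then the data bytes as space-separated hex pairs. The \b keeps the match
# from swallowing the first digits of a following sequence number.
MESSAGE_RE = re.compile(r"<== Payload: 00 07 00 (0[0-2])((?: [0-9A-F]{2}\b)+)")


def extract_stream(log_text: str, stream_id: int, payload_len: int = 160) -> bytes:
    """Concatenate the data bytes of every complete message of one stream."""
    out = bytearray()
    for match in MESSAGE_RE.finditer(log_text):
        if int(match.group(1), 16) != stream_id:
            continue
        data = bytes.fromhex(match.group(2))
        if len(data) == payload_len:   # drop messages truncated by the logger
            out += data
    return bytes(out)


def convert_log(log_path: str, raw_path: str, stream_id: int) -> int:
    """Read a saved console log and write one stream as raw big-endian PCM."""
    with open(log_path, "r", encoding="ascii", errors="replace") as log_file:
        raw = extract_stream(log_file.read(), stream_id)
    with open(raw_path, "wb") as raw_file:
        raw_file.write(raw)
    return len(raw)
```

Calling convert_log on the S_in log with stream_id=1 should yield a 160000-byte file if all 1000 messages are intact. Because the stream id is checked per message, the same function can also pull a single stream out of the combined three-stream log.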

Note what happens here. The data reaches the router's log in the same order in which it is stored in the router's memory (big-endian). For example, the value 0x1234 is stored in two consecutive cells as 12 34. The text is copied to the clipboard in this form, and the bytes land in the binary file in the same order.

Let me remind the reader that in the memory of personal computers with x86 processors, values are stored from the low byte to the high byte (little-endian), which is why this order is sometimes called Intel byte order (after the company that created the x86 architecture).

The standard for TCP/IP protocols, by contrast, is high byte first (big-endian). It is used in packet headers and in many higher-level protocols designed to run over TCP/IP, so this order is often called network byte order. It is used not only by IBM 360/370/390, Motorola 68000, and SPARC processors (hence a third name, Motorola byte order), but also by the processors of some Cisco routers.
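The difference between the two byte orders can be seen directly in Python. This is a small illustration only; the sample bytes FF E8 are taken from the log excerpts above.

```python
import struct

raw = bytes([0x12, 0x34])                  # two consecutive bytes as they appear in the log

big = int.from_bytes(raw, "big")           # high byte first: 0x1234
little = int.from_bytes(raw, "little")     # low byte first:  0x3412

# A sample from the captured data, read as a signed 16-bit big-endian value.
sample = struct.unpack(">h", bytes([0xFF, 0xE8]))[0]

print(hex(big), hex(little), sample)
```

Read as big-endian, FF E8 is the small negative sample -24, which is plausible for a quiet signal. Read as little-endian, the same two bytes would give -5889, which is why a dump imported with the wrong byte order sounds like noise.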

Next, the resulting RAW file is imported into an audio editor. The sound plays back correctly with the following parameters: signed 16-bit, big-endian, 1 channel (mono), 8000 Hz.
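If a WAV file is preferred over a RAW import, the conversion can be sketched with Python's standard wave module. WAV stores 16-bit samples in little-endian order, so the bytes of each sample have to be swapped first. The function name raw_be_to_wav is an illustrative assumption.

```python
import wave


def raw_be_to_wav(raw: bytes, wav_path: str, sample_rate: int = 8000) -> None:
    """Write signed 16-bit big-endian mono PCM as a WAV file."""
    if len(raw) % 2:
        raw = raw[:-1]                 # ignore a dangling half sample
    # Swap the two bytes of every sample: big-endian -> little-endian.
    swapped = bytearray(len(raw))
    swapped[0::2] = raw[1::2]
    swapped[1::2] = raw[0::2]
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono
        wav_file.setsampwidth(2)       # 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(swapped))
```

The parameters mirror the import settings listed above; only the byte order changes, because the WAV container expects little-endian samples.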

Everything is audible!

This is definitely a first small success. It should be built upon: the next step is to understand the structure of the .DAT file, after which anyone can write a program in their favorite language to extract the sound from it.

A closer look at the .DAT file shows that the sequences 00 07 00 0X appear there too, in alternation, 0xF4 bytes apart. A two-byte word with the value 00 F4 (big-endian, do not forget) is always present at offset -0x0E relative to the sequence 00 07 00 0X. It can be assumed that this word belongs to the packet header and gives the size of the current packet (the offset of the next packet's header relative to the header of the current one).

Looking at the last voice packet in the file, one can see that the sequence 00 00 00 00 sits 0xF4 bytes from the end of the file. Suppose it marks the beginning of a packet. Then the word with the value 00 F4 is located at offset 0x42 from the beginning of the voice packet. Moreover, in every voice packet in the file, exactly 0xA0, that is, 160 bytes lie between the sequence 00 07 00 0X and the next 00 00 00 00, which agrees well with the relationships established while analyzing the text log. It can be assumed that these are the 160 bytes of payload, the voice data. In the illustrations, I highlight these bytes in purple to indicate their importance.

Analyze the headers:

Blue marks the presumed positions of the len, ch_id, pak_id, and proc_id fields (recall the text log: 792308: Jan 22 16:30:42.458: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload:); the value shown in orange matches ch_id in the packets examined.

The value 172 = 0xAC, located at offset 0x48, corresponds to the size of the data portion from this offset to the end of the packet.

However, there remains an unidentified array of bytes with service information at the very beginning of the file. Its patterns need to be identified, otherwise the solution cannot be considered acceptable.

Two null-terminated strings with text information about the router are found (shown in yellow-green in the figure):

1) the text 28.3.14 (matching DSPWARE VERSION in the output of the show voice dsp detail command) is at offset 0x5 from the beginning of the file;

2) the text C2800NM-ADVENTERPRISEK9-M, 15.1 (4) M10, IP | SLA | IPv6 | IS-IS | FIREWALL | VOICE | PLUS | QoS | HA | NAT | MPLS | VPN | LE is at offset 0xC0 from the beginning of the file.

One could also look in this file for values specific to a particular router (such as MAC addresses and serial numbers) or to a packet (CRC). For now there is no need to dig deeper, since this data is of no particular interest to us.

At offset 0x42 the file contains the word 01 24, and the first voice packet begins at offset 0x0124. This could be a coincidence, but checking a number of files confirms the pattern for all files and all packets in them. It also holds for packets of unknown purpose and content that clearly do not contain voice.

So, the voice packet structure:

 field        offset  size  value
 packet_sign  [0x00]  0x04  00 00 00 00
 UNKNOWN      [0x04]  0x3E
 packet_size  [0x42]  0x02
 UNKNOWN      [0x44]  0x04  00 00 00 00
 len          [0x48]  0x02
 ch_id        [0x4A]  0x02
 pak_id       [0x4C]  0x02
 proc_id      [0x4E]  0x02
 UNKNOWN      [0x50]  0x02  00 07
 stream_id    [0x52]  0x02  R_in 00 00, S_in 00 01, S_out 00 02
 raw_data     [0x54]  0xA0
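To show how this layout could be used, here is a minimal Python sketch of a .DAT splitter. It is not the author's program, which was not published. It assumes that every packet, including the file header and the non-voice packets, carries its own size as a big-endian word at offset 0x42, as observed above, and that the layout matches the one router, IOS, and DSPware version examined in this article. The function name split_dat is hypothetical.

```python
import struct
from collections import defaultdict

STREAM_NAMES = {0: "R_in", 1: "S_in", 2: "S_out"}


def split_dat(data: bytes) -> dict:
    """Return {(ch_id, stream_name): big-endian 16-bit PCM bytes}."""
    streams = defaultdict(bytearray)
    pos = 0
    while pos + 0x44 <= len(data):
        # packet_size: big-endian word at offset 0x42 of every packet.
        (packet_size,) = struct.unpack_from(">H", data, pos + 0x42)
        if packet_size < 0x44 or pos + packet_size > len(data):
            break                      # malformed or truncated packet, stop
        if packet_size >= 0x54:
            # len, ch_id, pak_id, proc_id, the 00 07 marker and stream_id.
            _length, ch_id, _pak_id, _proc_id, marker, stream_id = struct.unpack_from(
                ">6H", data, pos + 0x48
            )
            if marker == 0x0007 and stream_id in STREAM_NAMES:
                key = (ch_id, STREAM_NAMES[stream_id])
                # raw_data runs from offset 0x54 to the end of the packet.
                streams[key] += data[pos + 0x54 : pos + packet_size]
        pos += packet_size
    return {key: bytes(value) for key, value in streams.items()}
```

Grouping by both ch_id and stream keeps simultaneous conversations apart. Each resulting byte string is signed 16-bit big-endian mono PCM at 8000 Hz, as in the console experiment. Other codecs or DSPware versions may well use a different layout.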

Armed with the above, anyone can write a program in their favorite language that produces voice files from the DAT file: RAW at the very least, for later import into an audio editor and further processing, or even WAV.

Conclusion

Of course, this is only a particular solution that does not claim to be complete. For example, with codecs other than G.711 it will be considerably harder to identify and unpack the audio data. Other side effects will certainly turn up. For these reasons, I am not publishing the code I used to automate the processing of the input data.

These assumptions made it easy to split a PCM data file obtained via the dial-peer method and containing several simultaneous conversations with different ch_id values. Thanks to this, we managed to track down and identify the problematic DTMF tone that started it all.

In addition, it turned out that a Chinese GSM gateway was connected to the router via SIP. PBX subscribers heard the tones only when calling through this gateway. It was established that the gateway, which has RFC2833 enabled, sends strange packets (it is not quite clear where they originate: they are either generated by the gateway itself or come from the operator’s network, but not from the remote subscriber) that the router (which also has RFC2833 enabled) interprets as RTP NTE, after which it injects into the stream a full tone that PBX subscribers hear, and for whose detection all the manipulations described in this article were performed.

Since the gateway, being extremely cheap, does not allow debugging information to be collected, and the extent of its fault could not be established, its firmware was updated to the latest version just in case. That did not help; the signal kept appearing. The gateway was then switched to SIP INFO mode (on Cisco ISR routers this mode is always enabled). No complaints so far.

Source: https://habr.com/ru/post/322856/
