
The new Codec2 700C codec compresses speech to 700 bps



The new codec will soon be available for testing in the FreeDV digital radio program.



The author of the free voice codec Codec2, designed for very low-bit-rate speech coding on voice channels, has released a new version, Codec2 700C, which encodes intelligible human speech at just 700 bps. At that rate, a three-second voice transmission takes only about 260 bytes.
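The size quoted above follows from simple arithmetic: 700 bits per second for 3 seconds is 2100 bits, or 262.5 bytes, which rounds to the roughly 260 bytes mentioned. A minimal sketch, where the helper name `payload_bytes` is hypothetical and framing or protocol overhead is ignored:

```python
def payload_bytes(bit_rate_bps: float, duration_s: float) -> float:
    """Size in bytes of a constant-bit-rate stream, ignoring any framing overhead."""
    return bit_rate_bps * duration_s / 8


print(payload_bytes(700, 3))  # 262.5
```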



Of course, such technology is entirely unsuitable for compressing music or other multimedia content, but it is indispensable for communication over channels with severely restricted bandwidth, for example when transmitting sound digitally from Mars.



Such extreme compression can be useful not only in space applications but also in amateur radio, various military tasks, satellite communications, and encrypted devices. For example, the US Army currently uses the MELP (Mixed Excitation Linear Prediction) coding standard, but it is the intellectual property of Texas Instruments (the 2400 bps MELP algorithm and the codec source code), Microsoft (the 1200 bps transcoder), and AT&T (the noise preprocessor). The same proprietary MELP standard is used in satellite communications, secure voice communications, and secure radio transmitters. MELP was standardized and developed with the support of the NSA and NASA.


Clearly, there is a need for a codec with a similar purpose that is free of MELP's patent encumbrances.



The developer of Codec2 is David Rowe, who has led the project for several years. The first alpha version of Codec2 was released in September 2010. Rowe also had a hand in creating Speex, a free audio format for speech coding whose development was discontinued in favor of the free Opus format. He then set himself the goal of transmitting communications-quality voice at 2400 bps and below, that is, of creating a free alternative to MELP.



"I continue to work on a digital voice mode that can compete with single-sideband (SSB) modulation," writes David Rowe. "For most of 2016, I was distracted from this work by a paid project, a commercial high-frequency (HF) modem. But since December, I have been working on the 700 bps codec again. The goal is quality roughly equal to that of the current 1300 bps mode. It could be used with a coherent PSK modem, and perhaps with a 4FSK modem, when tested on HF channels."



For clarity, a PSK modem is a device for a relatively new digital mode of transmission that uses narrow-band binary (two-state) phase-shift keying.
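To illustrate what two-state phase modulation means, here is a minimal sketch of binary PSK: each bit selects a carrier phase of 0 or π, and a coherent receiver recovers the bit by correlating with a reference carrier. This is a toy baseband model, not the FreeDV modem; the function names and the `samples_per_symbol` and `cycles_per_symbol` parameters are hypothetical, and noise, filtering, and synchronization are all left out.

```python
import math


def bpsk_modulate(bits, samples_per_symbol=8, cycles_per_symbol=2):
    """Map each bit to a carrier burst with phase 0 (bit 0) or pi (bit 1)."""
    signal = []
    for bit in bits:
        phase = math.pi if bit else 0.0
        for n in range(samples_per_symbol):
            angle = 2.0 * math.pi * cycles_per_symbol * n / samples_per_symbol
            signal.append(math.cos(angle + phase))
    return signal


def bpsk_demodulate(signal, samples_per_symbol=8, cycles_per_symbol=2):
    """Coherent detection: correlate each symbol with the reference carrier."""
    bits = []
    for start in range(0, len(signal), samples_per_symbol):
        corr = 0.0
        for n in range(samples_per_symbol):
            angle = 2.0 * math.pi * cycles_per_symbol * n / samples_per_symbol
            corr += signal[start + n] * math.cos(angle)
        # A negative correlation means the carrier was inverted, i.e. bit 1.
        bits.append(1 if corr < 0 else 0)
    return bits


bits = [1, 0, 1, 1, 0]
print(bpsk_demodulate(bpsk_modulate(bits)) == bits)  # True
```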



The author has put a great deal of work into optimizing the codec. The block diagram of the signal processing in the new codec is shown below. David Rowe writes that the key stage of the algorithm is resampling, in which the time-varying number of harmonic amplitudes is converted into a fixed number (K = 20) of samples. More samples are taken at low frequencies than at high frequencies, which matches the logarithmic frequency perception of the human ear. Rowe arrived at K = 20 by experiment.
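The resampling step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Rowe's implementation: it assumes the non-uniform spacing follows the mel scale, a sampling range of 200 to 3700 Hz, and plain linear interpolation between harmonics. All function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale formula (assumed here; the article only says 'logarithmic')."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def sample_frequencies(k=20, f_lo=200.0, f_hi=3700.0):
    """K frequencies evenly spaced in mel, hence denser at low frequencies in Hz."""
    m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    step = (m_hi - m_lo) / (k - 1)
    return [mel_to_hz(m_lo + i * step) for i in range(k)]


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def resample_harmonics(f0_hz, amps_db, k=20):
    """Turn a pitch-dependent number of harmonic amplitudes into exactly k samples."""
    harmonic_hz = [f0_hz * (m + 1) for m in range(len(amps_db))]
    return [interp(f, harmonic_hz, amps_db) for f in sample_frequencies(k)]


# Two frames with different pitch, and so different harmonic counts (synthetic data).
frame_a = resample_harmonics(100.0, [60.0 - 0.5 * m for m in range(39)])
frame_b = resample_harmonics(200.0, [60.0 - 1.0 * m for m in range(19)])
print(len(frame_a), len(frame_b))  # 20 20
```

Whatever the pitch of a frame, the output vector always has K entries, which is what makes it convenient to quantize at a fixed bit rate.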







The bulk of David Rowe's work concerns perceptual compression of sound, that is, matching the logarithmic characteristics of human hearing, in which sensitivity to different frequencies follows a logarithmic law.





3D plot of amplitude in dB over time (300 frames) for the frequency vectors resampled with K = 20, for the hts1a sound sample (available for listening in the table below). It shows how the signal changes over time, and the low values at high frequencies, which the human ear perceives less well.



More generally, the lack of free codecs below 5 kbps was raised by Bruce Perens in 2009. He contacted the Speex developers and invited them to look into the situation. Codec2 is based on scientific papers from the 60s and 80s and does not appear to be covered by existing patents. Sinusoidal speech coding was first mentioned in 1984, and Rowe himself described harmonic sinusoidal coding techniques in detail in his 1997 scientific work. The codec is published under the free LGPL2 license.



The samples below let you compare the previous version of the codec at 1300 bps with the new version at 700 bps.

Sample          1300     700C
hts1a           Listen   Listen
hts2a           Listen   Listen
forig           Listen   Listen
ve9qrp_10s      Listen   Listen
mmt1            Listen   Listen
vk5qi           Listen   Listen
vk5qi 1% BER    Listen   Listen
cq_ref          Listen   Listen

Everyone's hearing is different, so the author asks for feedback: how intelligible do you find the 700 bps samples compared with the 1300 bps ones? David Rowe believes they are about the same: some samples are slightly better (cq_ref), while others are slightly worse (ve9qrp_10s, mmt1). The artifacts differ from sample to sample. Either way, this is almost a twofold reduction in bit rate!



For comparison, here are samples of the Codec2 v0.1 alpha version (2550 bps) from 2010 and the proprietary MELP codec (2400 bps).



Male voice:

→ Original

→ Codec2 v0.1 (2550 bps)

→ MELP (2400 bps)



Female voice:

→ Original

→ Codec2 v0.1 (2550 bps)

→ MELP (2400 bps)



In the coming weeks, David Rowe plans to make the 700C codec available through the interfaces of the FreeDV digital radio program and to conduct the first on-air tests.

Source: https://habr.com/ru/post/373067/


