A group of researchers from the Massachusetts Institute of Technology, Google, the University of North Carolina and Johns Hopkins University has published the final version of a report describing a method for recognizing key phrases in an encrypted VoIP stream with a variable bitrate. They report an average recognition accuracy of 50%, rising to 90% for some phrases.
Recognition works by analyzing the bitrate of the encrypted stream, which remains observable because different sounds are encoded at different bitrates. For example, vowels are encoded at a high bitrate, while whistling and hissing sounds are essentially noise, for which the minimum bitrate is sufficient.
If the VBR codec uses four bitrates, human speech becomes a "four-symbol" stream, in which each of the four symbols corresponds to a class of sounds. At first glance, this is a fairly simple cipher. In theory, one could run a statistical analysis of this stream, comparing the probabilities of letter combinations against a database of all existing words and phrases in the English language. Google spent several years building such a database and has made it openly available. In practice, however, this is very difficult, although solvable.
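The idea of turning an encrypted stream into a string of symbols can be illustrated with a minimal Python sketch. It assumes that encryption adds only a constant overhead, so each packet's length still reveals which bitrate the codec chose. The four frame sizes, the symbol names and the `to_symbols` function are hypothetical values picked for illustration and are not taken from the paper.

```python
# Hypothetical payload sizes (in bytes) for the four VBR modes.
# These numbers are illustrative only; real codecs use other values.
FRAME_SIZES = [10, 20, 38, 62]
SYMBOLS = "ABCD"  # one symbol per bitrate, lowest to highest


def to_symbols(packet_sizes):
    """Map each observed packet size to the symbol of the nearest known frame size."""
    out = []
    for size in packet_sizes:
        nearest = min(
            range(len(FRAME_SIZES)),
            key=lambda i: abs(FRAME_SIZES[i] - size),
        )
        out.append(SYMBOLS[nearest])
    return "".join(out)


if __name__ == "__main__":
    # Packet lengths as an eavesdropper might record them (made-up data).
    observed = [10, 62, 38, 20, 11]
    print(to_symbols(observed))
```

An eavesdropper never needs to decrypt anything here: the packet lengths alone produce the symbol string.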
In this paper, the researchers do not use cryptanalysis and do not look up the resulting "code" in a dictionary. They simply demonstrate that it is possible in principle to recognize phrases regardless of who is speaking. To do this, they had a group of volunteers speak 122 sentences of the same length over a VoIP channel with VBR. Another group of people then spoke the same sentences, also over VoIP, and the system picked the correct phrase out of the 122 options in 50% of cases on average.
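Choosing one phrase out of a fixed set of candidates can be sketched as nearest-neighbor matching over symbol strings. The sketch below uses plain edit distance as a stand-in similarity measure. That is an assumption made for illustration, not the researchers' actual model, and the phrase labels and symbol strings are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a symbol from a
                curr[j - 1] + 1,           # insert a symbol into a
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def identify(observed, references):
    """Return the label whose reference sequence is closest to the observed one."""
    return min(
        references,
        key=lambda label: edit_distance(observed, references[label]),
    )


# Invented reference sequences, standing in for recordings by the first group.
references = {
    "phrase_1": "AADDCCBA",
    "phrase_2": "DDCCAABB",
    "phrase_3": "ABABCDCD",
}

if __name__ == "__main__":
    # A slightly different rendition of phrase_1 by another speaker (made up).
    print(identify("AADDCBBA", references))
```

With only a handful of candidates such matching is easy. The difficulty grows quickly as the candidate set approaches a full language.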
This method can hardly be used in practice: the recognition accuracy is too low, and the system is unlikely to work at all on a full language rather than on a sample of 122 sentences. Still, it is a wonderful example of how cryptographic analysis finds leaks in information systems that, at first glance, are protected by good ciphers.
The paper was published in the paid section of the ACM Transactions journal (doi: 10.1145/1880022.1880029), but preliminary versions are available free of charge (PDF1, PDF2 with listings): they were presented at specialized conferences in 2008 and 2009. The method itself was discussed on Habr in the summer of 2008.