
Measurement of speech intelligibility: subjective methods

What's the matter?



We talk on the phone and listen to speeches and performances in concert halls. Some of us try to overhear someone else's conversation; others try to make eavesdropping impossible. In all of these cases there are situations where extraneous noise makes it so hard to hear what is needed that the words become simply unintelligible. To avoid such problems, voice communication channels undergo an acoustic examination before they are put into operation.



A voice communication channel or path (also called a voice information channel) is the physical medium through which sound travels from the sender to the receiver. It may be an air, electro-acoustic, vibration, parametric, or opto-electronic channel, but we will not dwell on these, since our goal is to measure the most important quality criterion of the path: speech intelligibility.



Methods for measuring speech intelligibility can be briefly presented as a list:



Of course, there are also the Soviet methods of Pokrovsky, Bykov, and Sapozhkov, but we will not touch on them, at least for now, since the methods above give the best results.



Of course, we cannot cover everything at once, so to begin with we will look at how objective methods differ from subjective ones, and then dwell on the latter in more detail.


Pure subjectivism



In a purely subjective assessment of speech intelligibility, a speaker-auditor pair takes part. Their work is conveniently illustrated by testing a radio station according to the recommendations of the CCIR (International Radio Consultative Committee): on the transmitting side of the radio channel the announcer reads a text, while the auditor on the receiving side rates the path on a five-point (or some other) scale. The big drawback of this approach is hard to miss: the result is inevitably influenced by the speech and hearing characteristics of the people doing the testing.



The solution to this problem is as obvious as the problem itself.



Objectification



The most common objectivized method is the articulation method. Before measurements begin, normal acoustic conditions (noise levels) are created in the channel under test. Several auditors take part, and the announcer reads specially compiled tables of syllables (articulation tables) instead of plain text. The auditors write down what they hear and, at the end of the transmission session, check their tables against what was dictated. The ratio of correctly heard syllables to their total number is the speech intelligibility score, expressed as a percentage or as a fraction of one.
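The scoring step can be sketched in a few lines of Python. This is only an illustration: the function names and the sample syllables are made up for the example, and averaging the auditors' scores with equal weight is an assumption rather than something the method description prescribes.

```python
def articulation_score(dictated, recorded):
    """Fraction of dictated syllables that one auditor wrote down correctly."""
    if len(dictated) != len(recorded):
        raise ValueError("both tables must have the same number of entries")
    if not dictated:
        raise ValueError("tables must not be empty")
    # Compare the tables position by position and count exact matches.
    correct = sum(d == r for d, r in zip(dictated, recorded))
    return correct / len(dictated)


def session_score(dictated, recorded_by_auditor):
    """Equal-weight average of the auditors' scores (an assumption)."""
    scores = [articulation_score(dictated, rec) for rec in recorded_by_auditor]
    return sum(scores) / len(scores)


# Hypothetical four-syllable table and two auditors' records.
dictated = ["ba", "ko", "ti", "mu"]
auditor_1 = ["ba", "ko", "di", "mu"]  # one syllable misheard
auditor_2 = ["ba", "go", "di", "mu"]  # two syllables misheard

print(articulation_score(dictated, auditor_1))        # 0.75
print(session_score(dictated, [auditor_1, auditor_2]))  # 0.625
```

Multiply the result by 100 to express it in percent.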





Note that the more syllables are dictated and received, the more the influence of individual factors averages out. It averages out even further if different groups of announcers and auditors take part in the tests. This is where the objectivity of the articulation method comes from, but not only this. Reading out meaningless sound combinations also helps to obtain objective results, because when receiving words or phrases a listener can guess and restore an element that the path has distorted.



As for the auditors, there is an opinion that they should be specially trained teams; GOST R 50840-95, however, requires quite the opposite. Personally, I am more inclined to the latter.

Articulation table example
aleboovyrsleeponsaariracenyay
incito sowcifmaybezheycheatpamearth
stropanyukafsheyareaidevrazhas
zymlyakhthe unnot Xdisalatblah


Advantages:



Disadvantages:



Objectification. Part 2



Consider another objectivized subjective method, the tone method, in which the speaker is replaced by a pure-tone generator. This artificial voice is in fact an ordinary loudspeaker without a diffuser; it generates signals so that the sound pressure level produced at each frequency follows the formant spectrum curve. The auditors stay where they are. Their task now is simply to determine whether a signal at a given frequency is audible or not.

Frequencies at which measurements are taken (Hz)
250   500   650   800   990   1125  1300  1500  1700  1875
2050  2225  2425  2725  3100  3500  3850  4550  6150  8600


The sensation level of the formant is measured by smoothly increasing the attenuation until the sound is no longer audible, then reducing the attenuation until the sound reappears. The two attenuation values are averaged, and this average is the measurement result.
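As a small illustration of this averaging step, here is a Python sketch. The function and parameter names are invented for the example, and both readings are assumed to be attenuator settings in dB.

```python
def sensation_level_db(attenuation_at_disappearance_db, attenuation_at_reappearance_db):
    """Average of the two attenuator readings taken at one frequency.

    The first reading is where the tone stops being audible while the
    attenuation is increased; the second is where it becomes audible
    again while the attenuation is reduced.
    """
    return (attenuation_at_disappearance_db + attenuation_at_reappearance_db) / 2


# Hypothetical readings for one tone: lost at 32 dB, heard again at 28 dB.
print(sensation_level_db(32, 28))  # 30.0
```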



Formant speech intelligibility is determined by the table:

dB   %      dB   %      dB   %      dB   %      dB   %      dB   %
1    0.04   10   0.65   19   1.92   28   3.22   37   4.28   46   4.75
2    0.09   11   0.76   20   2.07   29   3.37   38   4.37   47   4.78
3    0.14   12   0.89   21   2.2    30   3.51   39   4.46   48   4.8
4    0.19   13   1.03   22   2.36   31   3.64   40   4.52   49   4.82
5    0.24   14   1.18   23   2.5    32   3.75   41   4.57   50   4.85
6    0.3    15   1.32   24   2.65   33   3.87   42   4.62   51   4.88
7    0.37   16   1.47   25   2.79   34   3.97   43   4.66   52   4.95
8    0.46   17   1.62   26   2.93   35   4.08   44   4.69
9    0.55   18   1.77   27   3.08   36   4.18   45   4.72
dB - level of sensation of tone; % - formant speech intelligibility


The total formant intelligibility is defined as the sum of the components:
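Written out, and assuming one component per measurement frequency (20 in the frequency list above), each read from the table at the sensation level measured for that frequency, the sum would look like this. The symbols $A$, $A_i$ and $L_i$ are notation chosen here for illustration.

```latex
A = \sum_{i=1}^{20} A_i , \qquad A_i = A_{\text{table}}(L_i)
```

Here $L_i$ is the sensation level in dB at the $i$-th frequency and $A_{\text{table}}$ is the dB-to-percent table above.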





To complete the measurement of speech intelligibility, it is sufficient to determine the syllable intelligibility:

A    S      A    S      A    S      A    S      A    S
5    5      25   46.2   45   75     65   90     85   98
10   15     30   55     50   80     70   92.5   90   99
15   26     35   62.5   55   81     75   95.2   95   99.5
20   36     40   69     60   87.2   80   96.2   100  100
A - formant speech intelligibility; S - syllabic speech intelligibility
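The two table lookups can be chained in a short Python sketch. The table values are transcribed from the text. Everything else is an assumption made for the example: the function names, rounding the sensation level to the nearest whole dB, treating levels below 1 dB as zero contribution, clamping levels above 52 dB to the last table entry, and interpolating linearly between rows of the A-to-S table.

```python
from bisect import bisect_right

# Formant intelligibility per tone (%), for sensation levels 1..52 dB.
BAND_PERCENT = [
    0.04, 0.09, 0.14, 0.19, 0.24, 0.30, 0.37, 0.46, 0.55,
    0.65, 0.76, 0.89, 1.03, 1.18, 1.32, 1.47, 1.62, 1.77,
    1.92, 2.07, 2.20, 2.36, 2.50, 2.65, 2.79, 2.93, 3.08,
    3.22, 3.37, 3.51, 3.64, 3.75, 3.87, 3.97, 4.08, 4.18,
    4.28, 4.37, 4.46, 4.52, 4.57, 4.62, 4.66, 4.69, 4.72,
    4.75, 4.78, 4.80, 4.82, 4.85, 4.88, 4.95,
]

# Formant intelligibility A (%) and the matching syllable intelligibility S (%).
A_GRID = list(range(5, 105, 5))
S_VALUES = [5, 15, 26, 36, 46.2, 55, 62.5, 69, 75, 80,
            81, 87.2, 90, 92.5, 95.2, 96.2, 98, 99, 99.5, 100]


def band_percent(level_db):
    """Table lookup for one tone; out-of-table handling is an assumption."""
    level = round(level_db)          # table is given in whole dB
    if level < 1:
        return 0.0                   # assumed: inaudible tone contributes nothing
    return BAND_PERCENT[min(level, 52) - 1]  # assumed: clamp above 52 dB


def formant_intelligibility(levels_db):
    """Sum of the per-tone contributions, in percent."""
    return sum(band_percent(level) for level in levels_db)


def syllable_from_formant(a_percent):
    """A -> S with linear interpolation between table rows (an assumption)."""
    if not A_GRID[0] <= a_percent <= A_GRID[-1]:
        raise ValueError("A is outside the range covered by the table")
    i = bisect_right(A_GRID, a_percent) - 1
    if i == len(A_GRID) - 1:
        return float(S_VALUES[-1])
    frac = (a_percent - A_GRID[i]) / (A_GRID[i + 1] - A_GRID[i])
    return S_VALUES[i] + frac * (S_VALUES[i + 1] - S_VALUES[i])


# Hypothetical measurement: every one of the 20 tones has a 10 dB sensation level.
a = formant_intelligibility([10] * 20)
print(a, syllable_from_formant(a))
```

With these made-up inputs A comes out at about 13% and S at about 21.6%, which sits between the table rows for A = 10 and A = 15.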


Advantages:



Disadvantages:



But what about ...



... the differences between objective and subjective methods? I think you have already guessed that it all comes down to the human factor, or more precisely its absence: objective methods use an artificial voice, mouth, and ear for the measurements.



Consider the simplest objective method.



First, a noise level corresponding to the working conditions is created at the receiving end of the path under test. Next, the noise level is measured at the output of the artificial ear in a critical band of hearing whose center frequency equals the frequency of the measuring tone. This noise level must be recorded; we will need it later. After that, in place of the noise, a tone signal is fed to the input of the path. The sound intensity level at the microphone is chosen so that, with the attenuator at its nominal zero, the sound pressure distribution follows the formant spectrum curve. Then the attenuation control is adjusted until the level of the tone signal at the output of the path equals the noise level recorded earlier. The attenuator reading is the measurement result.
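Numerically, the balancing step amounts to a subtraction in dB, which the following Python sketch illustrates. It assumes an ideal attenuator that lowers the tone at the path output dB for dB; the function name and the sample levels are hypothetical.

```python
def attenuator_reading_db(tone_level_at_zero_db, noise_level_db):
    """Attenuation (dB) at which the output tone level equals the noise level.

    tone_level_at_zero_db: tone level at the path output with the
        attenuator at its nominal zero.
    noise_level_db: noise level recorded earlier in the critical band
        centered on the tone frequency.
    """
    reading = tone_level_at_zero_db - noise_level_db
    if reading < 0:
        # The tone is already below the noise with no attenuation at all,
        # and an attenuator cannot raise it, so there is no valid reading.
        raise ValueError("tone is below the noise level at zero attenuation")
    return reading


# Hypothetical levels: tone 72 dB at zero attenuation, band noise 45 dB.
print(attenuator_reading_db(72.0, 45.0))  # 27.0
```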



Formant and syllable intelligibility are then determined in the same way as in the tone method.



Advantages:



Disadvantages:



The end



Usually nothing follows these words except the credits, so I will be very brief: this was “nickname_below”; watch us at any time that suits you. And thank you for your attention!

Source: https://habr.com/ru/post/127064/


