Measurement of speech intelligibility: subjective methods

What's the matter?

We talk on the phone and listen to speeches and performances in concert halls. Some of us try to overhear other people's conversations; others try to make eavesdropping impossible. In every one of these cases there are situations where extraneous noise makes it so hard to hear what is needed that the words become simply unintelligible. To avoid such problems, voice communication channels undergo an acoustic examination before they are put into service.

A voice communication channel or path (also called a voice information channel) is the physical medium through which sound travels from the sender to the receiver. It may be an airborne, electro-acoustic, vibrational, parametric, or opto-electronic channel, but we will not dwell on these, since our goal is to measure the most important quality criterion of a path: speech intelligibility.

Methods for measuring speech intelligibility can be summarized as a list:

Subjective:
- Purely subjective method;
- Objectivized;
- Tonal;
Objective:
- Formant-based:
  - AI (Articulation Index);
  - SII (Speech Intelligibility Index);
- Modulation-based:
  - STI (Speech Transmission Index);
  - RASTI (Rapid STI);
  - STIPA (STI for public address systems);
  - STITEL (STI for telecommunication systems);
- %ALcons (percentage articulation loss of consonants).

There are also the Soviet methods of Pokrovsky, Bykov, and Sapozhkov, but we will not touch on them, at least for now, since the methods listed above give the best results.

We can't cover everything at once, so to begin with we will look at how objective methods differ from subjective ones, and then dwell on the latter in more detail.

Pure subjectivism

In a purely subjective assessment of speech intelligibility, an announcer-auditor pair takes part. Their work is conveniently illustrated by testing a radio link according to the recommendations of the CCIR (International Radio Consultative Committee): on the transmitting side of the radio channel the announcer reads a text, while the auditor on the receiving side rates the path on a five-point (or some other) scale. The major drawback of this approach is hard to miss: the result is inevitably affected by the individual speech and hearing characteristics of the people doing the testing.

The solution to this problem is as obvious as the problem itself.

Objectification

The most common objectivized method is the articulation method. Before measurements begin, the acoustic conditions (noise levels) typical of normal operation are created in the channel under test. Several auditors take part, and instead of ordinary text the announcer reads specially compiled tables of syllables (articulation tables). The auditors write down what they hear and, at the end of the transmission session, check their tables against what was read. The ratio of correctly received syllables to the total number of syllables is the speech intelligibility score, expressed as a percentage or as a fraction of one.
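The scoring step described above can be sketched in a few lines of Python. This is only an illustration: the function names and the rule that a syllable counts as correct when it matches exactly (ignoring case and surrounding spaces) are assumptions, not part of any standard.

```python
def syllable_intelligibility(read_syllables, recorded_syllables):
    """Return the fraction of syllables that the auditor recorded correctly.

    Assumption: both tables list syllables in the same order, and a syllable
    is correct only if it matches exactly (case and outer spaces ignored).
    """
    if len(read_syllables) != len(recorded_syllables):
        raise ValueError("both tables must have the same number of entries")
    if not read_syllables:
        raise ValueError("the articulation table is empty")
    correct = sum(
        1
        for sent, heard in zip(read_syllables, recorded_syllables)
        if sent.strip().lower() == heard.strip().lower()
    )
    return correct / len(read_syllables)


def average_intelligibility(sessions):
    """Average the score over several (read, recorded) pairs, e.g. one per auditor."""
    scores = [syllable_intelligibility(read, heard) for read, heard in sessions]
    return sum(scores) / len(scores)
```

Multiplying the returned fraction by 100 gives the score in percent. Averaging over several auditors and tables is what smooths out individual differences.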

Note that the more syllables are dictated and received, the more the influence of individual factors is averaged out. It is averaged out even further if different groups of announcers and auditors take part in the tests. This is what makes the articulation method objective, but it is not the only thing. Reading out meaningless sound combinations also helps, because when receiving real words or phrases an auditor can guess and restore an element distorted by the path.

As for the auditors, there is an opinion that they should be specially trained teams; however, GOST R 50840-95 requires just the opposite. Personally, I lean toward the latter.

Articulation table example
ale	boo	vyr	sleep	onsa	ari	race	nyay
inci	to sow	cif	maybe	zhey	cheat	pam	earth
stro	panyu	kaf	shey	area	ide	vra	zhas
zym	lyakh	the un	not X	dis	alat	blah

Advantages:

Universality (the method is applicable to any type of path);
Simplicity (the method does not require special technical knowledge from operators)

Disadvantages:

The cumbersome measurement procedure (requires a significant investment of time, material and human resources);
Creating articulation tables (with each new type of table, measurement results are different);
Dependence of results on the degree of operator training;
The impossibility of automating the process;
Human factor (effect on the result of speech and hearing characteristics)

Objectification. Part 2

Consider another objectivized subjective method, the tone method, in which the announcer is replaced by a pure-tone generator. This artificial voice is in fact an ordinary loudspeaker without a diffuser, driven so that the sound pressure level it produces at each frequency follows the formant spectrum curve. The auditors stay where they are. Their task now is simply to determine whether or not a signal is audible at a given frequency.

Frequencies at which measurements are taken, Hz
250	500	650	800	990	1125	1300	1500	1700	1875
2050	2225	2425	2725	3100	3500	3850	4550	6150	8600

The sensation level of a formant is measured by smoothly increasing the attenuation until the sound is no longer audible, and then reducing the attenuation until the sound reappears. The two attenuation values are averaged, and the average is the measurement result.
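The averaging of the two attenuator readings can be written down directly. The function and parameter names below are hypothetical and are used only for illustration.

```python
def sensation_level(attenuation_at_disappearance_db, attenuation_at_reappearance_db):
    """Average the two attenuator readings (in dB) taken at one frequency.

    The first reading is taken when the tone stops being audible while the
    attenuation is increased, the second when the tone becomes audible again
    while the attenuation is reduced.
    """
    return (attenuation_at_disappearance_db + attenuation_at_reappearance_db) / 2.0
```

For example, readings of 32 dB and 28 dB give a result of 30 dB for that frequency.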

Formant speech intelligibility is determined by the table:

dB	%	dB	%	dB	%	dB	%	dB	%	dB	%
1	0.04	10	0.65	19	1.92	28	3.22	37	4.28	46	4.75
2	0.09	11	0.76	20	2.07	29	3.37	38	4.37	47	4.78
3	0.14	12	0.89	21	2.2	30	3.51	39	4.46	48	4.8
4	0.19	13	1.03	22	2.36	31	3.64	40	4.52	49	4.82
5	0.24	14	1.18	23	2.5	32	3.75	41	4.57	50	4.85
6	0.3	15	1.32	24	2.65	33	3.87	42	4.62	51	4.88
7	0.37	16	1.47	25	2.79	34	3.97	43	4.66	52	4.95
8	0.46	17	1.62	26	2.93	35	4.08	44	4.69
9	0.55	18	1.77	27	3.08	36	4.18	45	4.72
dB - level of sensation of tone; % - formant speech intelligibility
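The table lookup can be sketched as follows. The values are copied from the table above. Rounding to the nearest whole decibel, returning zero for levels below 1 dB, and capping levels above 52 dB at the last table entry are assumptions made for this sketch, since the text does not say how such cases are handled.

```python
import math

# Formant intelligibility (%) for sensation levels of 1..52 dB, from the table.
FORMANT_TABLE = [
    0.04, 0.09, 0.14, 0.19, 0.24, 0.30, 0.37, 0.46, 0.55,
    0.65, 0.76, 0.89, 1.03, 1.18, 1.32, 1.47, 1.62, 1.77,
    1.92, 2.07, 2.20, 2.36, 2.50, 2.65, 2.79, 2.93, 3.08,
    3.22, 3.37, 3.51, 3.64, 3.75, 3.87, 3.97, 4.08, 4.18,
    4.28, 4.37, 4.46, 4.52, 4.57, 4.62, 4.66, 4.69, 4.72,
    4.75, 4.78, 4.80, 4.82, 4.85, 4.88, 4.95,
]


def band_formant_intelligibility(level_db):
    """Look up the formant intelligibility (%) for one measurement frequency."""
    # Assumption: round half up to the nearest whole dB before the lookup.
    level = math.floor(level_db + 0.5)
    if level < 1:
        return 0.0  # assumption: an inaudible tone contributes nothing
    if level > len(FORMANT_TABLE):
        return FORMANT_TABLE[-1]  # assumption: cap at the last table entry
    return FORMANT_TABLE[level - 1]
```

A sensation level of 30 dB, for instance, maps to 3.51 %.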

The total formant intelligibility A is defined as the sum of the components A_i obtained at the 20 measurement frequencies:

A = A_1 + A_2 + ... + A_20

To complete the measurement of speech intelligibility, it remains only to determine the syllable intelligibility from the table:

A	S	A	S	A	S	A	S	A	S
5	5	25	46.2	45	75	65	90	85	98
10	15	30	55	50	80	70	92.5	90	99
15	26	35	62.5	55	81	75	95.2	95	99.5
20	36	40	69	60	87.2	80	96.2	100	100
A - formant speech intelligibility, %; S - syllabic speech intelligibility, %
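The last two steps, summing the per-frequency components and converting the total to syllable intelligibility, might look like this in Python. The table points are taken from the table above. Linear interpolation between the points and rejecting totals outside the 5..100 range are assumptions of this sketch; the text only gives the tabulated values.

```python
from bisect import bisect_left

# (A, S) pairs from the table, both in percent.
A_POINTS = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
            55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
S_POINTS = [5, 15, 26, 36, 46.2, 55, 62.5, 69, 75, 80,
            81, 87.2, 90, 92.5, 95.2, 96.2, 98, 99, 99.5, 100]


def total_formant_intelligibility(band_values):
    """Sum the per-frequency formant intelligibility components (%)."""
    return sum(band_values)


def syllable_from_formant(a):
    """Convert formant intelligibility A (%) to syllable intelligibility S (%)."""
    if a < A_POINTS[0] or a > A_POINTS[-1]:
        # Assumption: the table is not extrapolated outside its range.
        raise ValueError("A must lie between 5 and 100 percent")
    i = bisect_left(A_POINTS, a)
    if A_POINTS[i] == a:
        return float(S_POINTS[i])
    # Assumption: linear interpolation between neighbouring table points.
    a0, a1 = A_POINTS[i - 1], A_POINTS[i]
    s0, s1 = S_POINTS[i - 1], S_POINTS[i]
    return s0 + (s1 - s0) * (a - a0) / (a1 - a0)
```

If all 20 components were 2.5 %, the total A would be 50 %, which the table maps to a syllable intelligibility of 80 %.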

Advantages:

No team of announcers is needed;
The measurement time is significantly reduced;
No articulation tables needed

Disadvantages:

Increased requirements for technical education of measuring personnel;
The impossibility of automating the process;
Human factor

But what about ...

... the differences between objective and subjective methods? You have probably guessed already that it all comes down to the human factor, or more precisely to its absence, since objective methods use an artificial voice, mouth, and ear for the measurements.

Consider the simplest objective method.

First, a noise level corresponding to the operating conditions is created at the receiving end of the path under test. Next, the noise level is measured at the output of the artificial ear within a critical band of hearing whose center frequency equals the frequency of the measuring tone. This noise level must be recorded, since it will be needed later. Then, instead of noise, a tone signal is fed to the input of the path. The sound intensity level at the microphone is set so that, with the attenuator at its nominal zero, the sound pressure distribution follows the formant spectrum curve. Finally, the attenuator is adjusted until the level of the tone signal at the output of the path equals the previously recorded noise level. The attenuator reading is the measurement result.
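Because no human judgment is involved, this procedure lends itself to automation. The following is a minimal sketch of the final adjustment step for one frequency, assuming an idealized attenuator that lowers the output tone level by exactly the attenuation set. The function name, the step size, and the upper limit are hypothetical.

```python
def attenuator_reading(tone_level_db, noise_level_db, step_db=1.0, max_attenuation_db=100.0):
    """Raise the attenuation until the tone level no longer exceeds the noise level.

    tone_level_db:  tone level at the path output with the attenuator at zero.
    noise_level_db: noise level recorded earlier in the critical band around
                    the tone frequency.
    Assumption: each dB of attenuation lowers the output tone by exactly 1 dB.
    """
    attenuation = 0.0
    while (tone_level_db - attenuation > noise_level_db
           and attenuation < max_attenuation_db):
        attenuation += step_db
    return attenuation
```

With a 70 dB tone and 42 dB of recorded noise, the loop stops at a reading of 28 dB. If the tone is already at or below the noise level, the reading stays at zero.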

Formant and syllable intelligibility are then determined in the same way as in the tone method.

Advantages:

Accuracy and speed;
Speakers and auditors are not required;
Ability to fully automate the measurement procedure

Disadvantages:

Increased requirements for technical education of measuring staff

The end

Usually nothing should follow these words except the credits, so I will be very brief: this was “nickname_below”, watch us at any time convenient for you. And thank you for your attention!

Source: https://habr.com/ru/post/127064/
