Speech acoustics

akutek.info

The www center for search, research and free sharing in acoustics

Adequate speech transmission and satisfactory perception of speech are important in many different fields. Voice acoustics also matters for speech, in particular for the production of vowel sounds.

Performance-related topics include, to mention a few:

· Theatre (drama) performance acoustics

· Opera acoustics

· Lecture halls and teaching facility acoustics

The acoustical issues often relate to the following aims:

· Adequate speech level (with or without amplification)

· Sufficient speech intelligibility

· Evenness of level and speech intelligibility (over time and space)

· Acceptable background noise level

· Reverberation: Early reflected sound may reinforce speech in a constructive manner, while late reflected (reverberant) sound often acts detrimentally by adding signal-related noise

· Disturbing echoes should be eliminated

It is common to use the single-number parameter Speech Transmission Index (STI) to measure and/or predict speech intelligibility in a room. Subjective quality is commonly associated with the following STI values:

0.00-0.30 Unintelligible

0.30-0.45 Poor

0.45-0.60 Fair

0.60-0.75 Good

>0.75 Excellent
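The rating bands above can be expressed as a small lookup. The following is a minimal Python sketch, not part of any standard: the function name `sti_rating` is hypothetical, and the treatment of values that fall exactly on a shared band edge (assigned here to the lower band, consistent with ">0.75 Excellent") is an assumption, since the table does not say which band owns the edge.

```python
def sti_rating(sti: float) -> str:
    """Map an STI value in [0, 1] to the subjective rating bands listed above."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    # Assumption: a value exactly on a shared edge (0.30, 0.45, 0.60, 0.75)
    # is assigned to the lower band, so only values above 0.75 are "Excellent".
    if sti <= 0.30:
        return "Unintelligible"
    if sti <= 0.45:
        return "Poor"
    if sti <= 0.60:
        return "Fair"
    if sti <= 0.75:
        return "Good"
    return "Excellent"


if __name__ == "__main__":
    for value in (0.10, 0.50, 0.75, 0.80):
        print(value, sti_rating(value))
```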

In Figure 1, computed STI values are indicated by colors: higher values are shown in warmer colors (yellow to red) and lower values in colder colors (green towards blue and black). Note how the distribution of STI values in the auditorium depends on the direction of speech, as the speaker speaks to the right (top), forward (middle), and to the left (bottom). The computation assumes normal speech level and 30 dBA background noise, and takes speech spectrum, directivity and room acoustical properties into account.


Fig 1. STI in computer model (ODEON 8.5) of a theatre.

D50 and U50

When background noise is not disturbing, the subjective speech intelligibility is often described by Definition or Deutlichkeit, denoted D or D50, defined as the ratio of the early received sound energy (0-50 ms after direct sound arrival) to the total received energy. Values from different frequency bands must be combined in some way to give a single number. ISO 3382 suggests obtaining the single number D from the average of the 500 Hz and 1000 Hz octave bands.
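The definition of D50 can be illustrated with a minimal Python sketch. It assumes that a room impulse response is available as a list of samples, that it has already been filtered to the octave band of interest, and that the index of the direct sound arrival is known; the function names `d50` and `single_number_d` and their parameters are hypothetical.

```python
def d50(impulse_response, sample_rate, direct_index=0):
    """Ratio of the energy in the first 50 ms after the direct sound
    to the total energy from the direct sound onwards."""
    # Discard everything before the direct sound arrival.
    samples = impulse_response[direct_index:]
    # Number of samples in the 0-50 ms early window.
    n_early = int(round(0.050 * sample_rate))
    early = sum(x * x for x in samples[:n_early])
    total = sum(x * x for x in samples)
    if total == 0:
        raise ValueError("impulse response contains no energy")
    return early / total


def single_number_d(d_500, d_1000):
    """Average of the 500 Hz and 1000 Hz octave band values."""
    return (d_500 + d_1000) / 2


if __name__ == "__main__":
    # Toy response at 1 kHz sampling: direct sound at 0 ms, one reflection at 50 ms.
    ir = [1.0] + [0.0] * 49 + [1.0] + [0.0] * 10
    print(d50(ir, sample_rate=1000))
```

In the toy example the reflection arrives exactly at 50 ms and is counted as late energy, so half of the total energy is early. Whether the window edge itself counts as early or late is a choice made in this sketch.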

Whenever background noise is an issue, the Useful-to-Detrimental index U50, suggested by Bradley (1986), may be an alternative to STI. U50 is obtained from D50 and the signal-to-noise ratio s/n by a formula based on the assumption that late reflections (delayed by more than 50 ms) together with the background noise are detrimental, while the early energy portion (within 50 ms) is useful.
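One way to write such a formula is sketched below in Python. It is a reconstruction from the stated assumption (useful = early energy, detrimental = late energy plus noise), not necessarily the exact form given by Bradley. It assumes that s/n is the ratio of total speech energy to noise energy, expressed in dB; the function name `u50` and its parameters are hypothetical.

```python
import math


def u50(d50, snr_db):
    """Useful-to-detrimental ratio in dB, from D50 and the signal-to-noise ratio.

    Assumption: snr_db is 10*log10(total speech energy / noise energy).
    With total speech energy normalised to 1:
      useful      = early energy        = d50
      detrimental = late energy + noise = (1 - d50) + 10**(-snr_db / 10)
    """
    if not 0.0 < d50 <= 1.0:
        raise ValueError("D50 must be greater than 0 and at most 1")
    noise = 10 ** (-snr_db / 10)
    return 10 * math.log10(d50 / ((1.0 - d50) + noise))


if __name__ == "__main__":
    # Same room (D50 = 0.5) with decreasing signal-to-noise ratio.
    for snr in (30, 10, 0):
        print(snr, round(u50(0.5, snr), 2))
```

With a very high signal-to-noise ratio the noise term vanishes and the result approaches the early-to-late ratio alone; as noise increases, U50 falls even though D50 is unchanged.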

Direct sound, D-R ratio and our brain

Griesinger points to the audibility of direct sound as a key to understanding the clarity of music and speech. He demonstrates with audio examples that speech transmission can be poor even when common speech parameters (such as C and STI) show very good values.

Regarding the importance of good speech transmission in lecture hall and classroom acoustics, Griesinger refers to results by SanSoucie (2010):

“When the brain must devote working memory to decoding speech, there is not enough memory left over to store the information.”

Contributed paper: Speech Intelligibility Measurements in Auditorium by K Leo

More computational acoustics on www.akutek.info/research_files/computational_acoustics.htm

External resources on Speech Acoustics:

Center for Speech Technology

Vocal tract study

