Hartmut Traunmüller (1994) "Conventional, biological, and environmental factors in speech communication: A modulation theory" Phonetica 51: 170 - 183.

Also in PERILUS XVIII (1994): 1 - 19 (Dept. of Linguistics, Stockholm University)

ABSTRACT. Speech signals contain various types of information that can be grouped under the headings phonetic, affective, personal, and transmittal. Listeners are capable of distinguishing these. Previous theories of speech perception have not considered this fully. They have mainly been concerned with problems relating to phonetic quality alone. The theory presented in this paper considers speech signals as the result of allowing conventional gestures to modulate a carrier signal that has the personal characteristics of the speaker. This implies that in general the conventional information can only be retrieved by demodulation. In order to perceive the phonetic quality of a speech signal, listeners evaluate the deviations of the properties of the signal (F0, formant frequencies, etc.) from those they expect of a neutral vocalization produced by the speaker with properties given by his age, sex, vocal effort, speech rate, etc. In degraded speech signals, this is shown to result in a perceptual bias towards neutral vowels. It is also argued that speech is perceived on the basis of compatibility testing (and not by optimal matching), so that listeners will hear what they expect to hear as long as they do not notice any counter evidence in the signal.
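The demodulation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure. It assumes that the "neutral vocalization" is approximated by the resonances of a uniform tube closed at one end, F_n = (2n - 1)c / (4L), and that deviations are measured as log2 frequency ratios. The function names, the speed-of-sound value, the vocal tract lengths, and the formant values are all hypothetical choices made for illustration.

```python
import math

# Assumption: approximate speed of sound in the vocal tract, in cm/s.
SPEED_OF_SOUND_CM_S = 35000.0


def neutral_formants(vocal_tract_length_cm, n_formants=3):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L).

    Used here as a stand-in for the speaker's expected neutral vocalization.
    """
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * vocal_tract_length_cm)
        for n in range(1, n_formants + 1)
    ]


def demodulate(observed_hz, neutral_hz):
    """Deviation of each observed formant from the neutral carrier.

    Assumption: deviations are expressed as log2 frequency ratios (octaves),
    so a uniform scaling of all frequencies cancels out.
    """
    return [math.log2(obs / neu) for obs, neu in zip(observed_hz, neutral_hz)]


# Two hypothetical speakers with different vocal tract lengths.
long_tract = neutral_formants(17.5)   # 500, 1500, 2500 Hz
short_tract = neutral_formants(14.0)  # 625, 1875, 3125 Hz

# Hypothetical formants for the same vowel; the second speaker's values are
# the first speaker's scaled by 1.25 (a deliberate simplification).
vowel_long = [300.0, 2300.0, 3000.0]
vowel_short = [375.0, 2875.0, 3750.0]

dev_long = demodulate(vowel_long, long_tract)
dev_short = demodulate(vowel_short, short_tract)

for a, b in zip(dev_long, dev_short):
    print(f"{a:+.3f} {b:+.3f}")
```

Under these assumptions the raw formant frequencies of the two speakers differ, but their deviations from the respective neutral values coincide. This is the sense in which conventional information is separated from the speaker-dependent carrier. Real speakers do not differ by a uniform scale factor, so the sketch shows the principle only.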

Note: The terms "expressive" and "organic" (quality, information, properties) are much more adequate and should be substituted for "affective" and "personal" as used in this paper.

