Paralinguistic Variation in Speech

and How to Handle it in Speech Technology

Project director
Hartmut Traunmüller, Department of Linguistics, Stockholm University.

Project Period
1/7/1992 ...
With extramural financial support 1/7/1992 - 30/6/1993 and 1/7/1994 - 30/6/1996.

Funding agencies
HSFR, the Swedish Council for Research in the Humanities and Social Sciences,
NUTEK, the Swedish National Board for Technical and Industrial Development,
both providing support within the framework of the Swedish language technology programme.

Abstract
The physical properties of speech sounds vary as a function of the speaker's age, sex, vocal effort, speech rate, and several additional factors that carry no linguistic information but do carry information about the speaker's person, state, and attitude. Present-day speech technology products cope with such variation only to a limited extent, which is one of the major factors restricting their usefulness. In speech synthesis, there is a vast range of applications in which it is sufficient to simulate one speaker and one way of speaking, but in some cases, e.g. when synthetic speech is to be used by vocally impaired persons, freedom in the choice of paralinguistic quality would be a clear advantage. In automatic recognition, present-day systems are not capable of distinguishing the paralinguistic information in the speech signal from the linguistic information. That is why they are narrowly restricted with regard to the vocabulary, the speaker, and the speaker's way of speaking. This is mainly due to a lack of knowledge. Only when we have detailed knowledge about the effects of paralinguistic factors will it be possible to attempt automatic speech recognition with true tolerance of paralinguistic variation. The aim of the project is to provide that type of knowledge. The project aims to contribute to the solution of the following practical tasks:
1) Synthesis of speech with a desired type of paralinguistic quality.
2) Conversion of the paralinguistic quality of speech.
3) Automatic recognition of linguistic information in speech in spite of paralinguistic variation.
4) Automatic recognition of the paralinguistic information in speech.

Staff
Anders Eriksson, Department of Phonetics, Umeå University.
Anita Andersson, Department of Linguistics, Stockholm University.
Ingegerd Eklund, Department of Linguistics, Stockholm University.
Jessika Rundlöf, Department of Linguistics, Stockholm University.

Publications
Regular papers
  • Hartmut Traunmüller and Anders Eriksson (1995) "The perceptual evaluation of F0 excursions in speech as evidenced in liveliness estimations" J. Acoust. Soc. Am. 97: 1905 - 1915.
  • Ingegerd Eklund and Hartmut Traunmüller (1997) "Comparative study of male and female whispered and phonated versions of the long vowels of Swedish" Phonetica 54: 1 - 21.
Conference contributions
  • Anders Eriksson (1993) "Liveliness in speech as a function of fundamental frequency (F0) variation and speech rate" In Ana Garriga-Trillo et al. (eds.) Fechner Day 93, Proceedings of the Ninth Annual Meeting of the International Society for Psychophysics, pp. 77 - 82. Madrid: UNED.
  • Hartmut Traunmüller and Anders Eriksson (1994) "The size of F0-excursions in speech production and perception" Working Papers 43: 136 - 139 (Lund University, Department of Linguistics).
  • Anita Andersson, Anders Eriksson and Hartmut Traunmüller (1996) "Cries and whispers: Acoustic effects of variations in vocal effort" TMH-QPSR 2/1996: 127 - 130 (Royal Institute of Technology, Department of Speech, Music and Hearing, Stockholm).
  • Ingegerd Eklund and Hartmut Traunmüller (1996) "A comparative study of male and female whispered and phonated versions of the long vowels of Swedish" (Summary of paper listed above) TMH-QPSR 2/1996: 131 - 134 (Royal Institute of Technology, Department of Speech, Music and Hearing, Stockholm).
  • Hartmut Traunmüller (1997) "Perception of speaker sex, age, and vocal effort" Phonum 4: 183 - 186 (Umeå University, Department of Phonetics).
  • Hartmut Traunmüller and Anders Eriksson (1997) "A method of measuring formant frequencies at high fundamental frequencies" Proceedings of EuroSpeech'97, vol. 1: 477 - 480.
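Student theses
  • Ingegerd Eklund, "En studie av de akustiska skillnaderna mellan fonerade och viskade långa svenska vokaler i isolering" [A study of the acoustic differences between phonated and whispered long Swedish vowels in isolation], D-level thesis in phonetics with a focus on speech technology, Department of Linguistics, Stockholm University.
  • Anita Andersson, "Viskningar och rop: segmentdurationer som funktion av talstyrka i viskat och fonerat tal" [Whispers and cries: segment durations as a function of vocal effort in whispered and phonated speech], C-level thesis in phonetics, autumn term 1996, Department of Linguistics, Stockholm University.
  • Jessika Rundlöf, "Perceptuella ledtrådar vid auditiv bedömning av avståndet mellan talare och lyssnare" [Perceptual cues in the auditory judgement of the distance between speaker and listener], D-level thesis in phonetics, autumn term 1996, Department of Linguistics, Stockholm University.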
Last updated in April 1997