Paralinguistic Variation in Speech
and How to Handle it in Speech Technology

- Project director
- Hartmut Traunmüller, Department of Linguistics,
Stockholm University.
- Project Period
- 1/7/1992 ...
- With extramural financial support 1/7/1992 - 30/6/1993 and 1/7/1994 - 30/6/1996.
- Funding agencies
- HSFR, the Swedish Council for Research in the Humanities and Social Sciences,
- NUTEK, the Swedish National Board for Technical and Industrial Development,
- Both agencies provided support within the framework of the Swedish language technology programme.
- Abstract
- The physical properties of speech sounds vary as a function of the
speaker's age, sex, vocal effort, speech rate, and several additional factors
which do not carry linguistic information but information about the speaker's
person, state, and attitude. Present-day products of speech technology
cope with such variation only to a rather limited extent. This is one of
the major factors that restrict their usefulness. As for speech synthesis,
there is a vast range of applications in which it is sufficient to simulate
one speaker and one way of speaking, but in some cases, e.g. when synthetic
speech is to be used by vocally impaired persons, freedom in the choice
of paralinguistic quality would be a clear advantage. As for automatic
recognition, present-day systems are not capable of distinguishing the paralinguistic
information in the speech signal from the linguistic information. That
is why they impose narrow restrictions on the vocabulary, the
speaker, and the speaker's way of speaking. This is mainly due to a lack of knowledge.
Only when we have detailed knowledge about the effects of paralinguistic
factors will it be possible to attempt automatic speech recognition with
true tolerance of paralinguistic variation. It is the aim of the project
to provide that type of knowledge. The project is intended to contribute to
the solution of the following practical tasks: 1) Synthesis of speech with
a desired type of paralinguistic quality. 2) Conversion of the paralinguistic
quality of speech. 3) Automatic recognition of linguistic information in
speech in spite of paralinguistic variation. 4) Automatic recognition of
the paralinguistic information in speech.
- Staff
- Anders Eriksson, Department of Phonetics, Umeå University.
- Anita Andersson, Department of Linguistics, Stockholm University.
- Ingegerd Eklund, Department of Linguistics, Stockholm University.
- Jessika Rundlöf, Department of Linguistics, Stockholm University.
- Publications
Regular papers
- Hartmut Traunmüller and Anders Eriksson (1995) "The perceptual
evaluation of F0 excursions in speech as evidenced
in liveliness estimations" J. Acoust. Soc. Am. 97:
1905 - 1915. (Abstract)
Ingegerd Eklund and Hartmut Traunmüller (1997) "Comparative
study of male and female whispered and phonated versions of the long vowels
of Swedish" Phonetica 54: 1 - 21. (Abstract)
Conference contributions
Anders Eriksson (1993) "Liveliness in speech as a function of fundamental frequency (F0) variation and speech rate" In Ana Garriga-Trillo et al. (eds.) Fechner Day 93, Proceedings of the Ninth Annual Meeting of the International Society for Psychophysics, pp. 77 - 82. Madrid: UNED.
Hartmut Traunmüller and Anders Eriksson (1994) "The size of F0-excursions in speech production and perception" Working Papers 43: 136 - 139 (Lund University, Department of Linguistics).
Anita Andersson, Anders Eriksson and Hartmut Traunmüller (1996)
"Cries and whispers: Acoustic effects of variations in vocal effort" TMH-QPSR
2/1996: 127 - 130 (Royal Institute of Technology, Department of Speech,
Music and Hearing, Stockholm). (Abstract)
Ingegerd Eklund and Hartmut Traunmüller (1996) "A comparative
study of male and female whispered and phonated versions of the long vowels
of Swedish" (Summary of paper listed above) TMH-QPSR 2/1996:
131 - 134 (Royal Institute of Technology, Department of Speech, Music and
Hearing, Stockholm).
Hartmut Traunmüller (1997) "Perception of speaker sex, age, and vocal effort" Phonum 4: 183 - 186 (Umeå University, Department of Phonetics). (Abstract)
Hartmut Traunmüller and Anders Eriksson (1997) "A method of measuring formant frequencies at high fundamental frequencies" Proceedings of EuroSpeech'97, vol.1: 477 - 480. (Abstract)
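Student theses
Ingegerd Eklund, "En studie av de akustiska skillnaderna mellan
fonerade och viskade långa svenska vokaler i isolering" [A study of the
acoustic differences between phonated and whispered long Swedish vowels in
isolation], D-level thesis in phonetics with a focus on speech technology,
Department of Linguistics, Stockholm University.
Anita Andersson, "Viskningar och rop: segmentdurationer som funktion
av talstyrka i viskat och fonerat tal" [Whispers and cries: segment durations
as a function of vocal effort in whispered and phonated speech], C-level
thesis in phonetics, autumn term 1996, Department of Linguistics, Stockholm
University.
Jessika Rundlöf, "Perceptuella ledtrådar vid auditiv
bedömning av avståndet mellan talare och lyssnare" [Perceptual cues in the
auditory judgement of the distance between speaker and listener], D-level
thesis in phonetics, autumn term 1996, Department of Linguistics, Stockholm
University.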
Last updated in April 1997