Manipulations of speaker age and sex

The appended transformations of speaker age and sex were obtained by analyzing a natural utterance with linear predictive coding and re-synthesizing it after recalculating the parameter values that describe the speech signal [1]. The recalculations of F0 and the formant frequencies were based on the values listed in [2], Table 2, with one correction to that table: the value of kF0 for the transformation from woman to girl, 12-14 years, should be 1.03 (instead of 1.27). Speech rate has also been modified [3]. Q-values have been conserved.
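The recalculation step can be illustrated with a minimal sketch in Python. It assumes that conserving Q means keeping the ratio of formant frequency to bandwidth constant, so a formant scaled by a factor k has its bandwidth scaled by the same k. The function names, the sampling rate and the example formant values are hypothetical and are not taken from [1] or [2]. Only the F0 factor 1.03 comes from the text above; the formant factor 1.1 is a placeholder, not a value from Table 2.

```python
import cmath
import math


def scale_formant_keep_q(freq_hz, bandwidth_hz, k):
    """Scale a formant frequency by k while keeping Q = freq / bandwidth fixed."""
    q = freq_hz / bandwidth_hz
    new_freq = k * freq_hz
    return new_freq, new_freq / q


def formant_to_pole(freq_hz, bandwidth_hz, fs_hz):
    """Map a formant (frequency, bandwidth) to a z-plane pole of an all-pole filter."""
    radius = math.exp(-math.pi * bandwidth_hz / fs_hz)
    angle = 2.0 * math.pi * freq_hz / fs_hz
    return cmath.rect(radius, angle)


def pole_to_formant(pole, fs_hz):
    """Inverse of formant_to_pole for a pole in the upper half-plane."""
    freq_hz = cmath.phase(pole) * fs_hz / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * fs_hz / math.pi
    return freq_hz, bandwidth_hz


def transform_frame(f0_hz, formants, k_f0, k_formant):
    """Apply the scale factors to one analysis frame (F0 plus formant list)."""
    new_formants = [scale_formant_keep_q(f, b, k_formant) for f, b in formants]
    return k_f0 * f0_hz, new_formants


# Hypothetical frame: F0 in Hz and (frequency, bandwidth) pairs in Hz.
FS_HZ = 16000.0
frame_formants = [(700.0, 70.0), (1200.0, 100.0), (2600.0, 150.0)]

# k_f0 = 1.03 is the corrected woman-to-girl value quoted in the text;
# k_formant = 1.1 is an assumed placeholder.
new_f0, new_formants = transform_frame(220.0, frame_formants, k_f0=1.03, k_formant=1.1)
new_poles = [formant_to_pole(f, b, FS_HZ) for f, b in new_formants]
```

Under this assumption, scaling a formant by k is equivalent to raising its pole to the power k, as long as the scaled frequency stays below half the sampling rate.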

Manipulation of a speaker's age
5 - 12 - 14 - 21 years

Age rating experiments with speech manipulated in this way [4] show some bias towards the original age of the speaker. This can be attributed to the conserved 'verbal maturity' in the transformed versions.

In order to transform phonated into whispered speech, a noise source with the right spectrum has to be substituted for the buzz source, and the formant frequencies have to be increased [5].
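The source substitution can be sketched as follows, in Python. This is only an illustration under stated assumptions: the buzz source is approximated by an impulse train, the noise source is plain white Gaussian noise (the spectral shaping that would give it "the right spectrum" is not specified here and is left out), and the vocal tract is reduced to a single hypothetical resonance. The increase of the formant frequencies is not shown, since no factor is given in the text. All function names and numeric values are made up for the example.

```python
import math
import random


def buzz_source(n_samples, f0_hz, fs_hz):
    """Impulse train at F0: a crude stand-in for the voiced (buzz) excitation."""
    period = max(1, round(fs_hz / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def noise_source(n_samples, seed=0):
    """White Gaussian noise: an assumed, unshaped stand-in for the whisper source."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """Denominator coefficients [a1, a2] of a two-pole resonance."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)
    theta = 2.0 * math.pi * freq_hz / fs_hz
    return [-2.0 * r * math.cos(theta), r * r]


def all_pole_filter(excitation, a):
    """Compute y[n] = x[n] - sum over k of a[k-1] * y[n-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                acc -= coeff * y[n - k]
        y.append(acc)
    return y


# Hypothetical values: 16 kHz sampling, one resonance at 700 Hz, F0 of 220 Hz.
FS_HZ = 16000.0
coeffs = resonator_coeffs(700.0, 70.0, FS_HZ)
phonated = all_pole_filter(buzz_source(1600, 220.0, FS_HZ), coeffs)
whispered = all_pole_filter(noise_source(1600), coeffs)
```

The same filter is driven by both excitations, so the two outputs differ only in their source, which is the part of the transformation this sketch covers.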

It is also possible to modify the speaker's sex.

female - transformed into male

male - transformed into female

Not quite convincing? With better knowledge of the female-male differences in the acoustic properties of speech, it could probably be done in a more convincing way.

Now you may listen to a table conversation by a synthetic Swedish family.

Ett bordssamtal (A table conversation)

The parts

The adult woman . . . . . Eva Öberg
The adolescent girl . . . . . Eva Öberg
The kindergarten girl . . . Eva Öberg
The adult man . . . . . . . Eva Öberg


