How far are vowel formants from computed vocal tract resonances?

Daniel Aalto, Antti Huhtala, Atle Kivelä, Jarmo Malinen, Pertti Palo, Jani Saunavaara, Martti Vainio
DOI: https://doi.org/10.48550/arXiv.1208.5963
2012-08-29
Dynamical Systems
Abstract: We compare numerically computed resonances of the human vocal tract with formants that have been extracted from speech during vowel pronunciation. The geometry of the vocal tract has been obtained by MRI from a male subject, and the corresponding speech has been recorded simultaneously. The resonances are computed by solving the Helmholtz partial differential equation with the Finite Element Method (FEM). Despite a rudimentary exterior space acoustics model, i.e., the Dirichlet boundary condition at the mouth opening, the computed resonance structure differs from the measured formant structure by $\approx$ 0.7 semitones for [i] and [u], which have a small mouth opening area, and by $\approx$ 3 semitones for the vowels [a] and [ae], which have a larger mouth opening. The contribution of the possibly open velar port has not been taken into consideration at all, which adds to the discrepancy for [a] in the present data set. We conclude that by improving the exterior space model and properly treating the velar port opening, it is possible to computationally attain the four lowest vowel formants with an error of less than a semitone. The corresponding wave equation model on MRI-produced vocal tract geometries is expected to have a comparable accuracy.
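The semitone discrepancies quoted in the abstract are ratios of frequencies on a logarithmic scale. As a minimal sketch, not the authors' FEM computation, the following Python snippet shows how such a discrepancy could be evaluated. The function names, the uniform-tube stand-in for the vocal tract, the sound speed of 350 m/s, the tube length of 0.175 m, and the "measured" value of 530 Hz are all illustrative assumptions and do not come from the paper. The tube is closed at the glottis end and has the Dirichlet condition at the mouth end, which gives the familiar quarter-wavelength resonances.

```python
import math


def semitone_distance(f_measured_hz, f_computed_hz):
    """Signed interval from the computed to the measured frequency, in semitones.

    One semitone is a frequency ratio of 2**(1/12), so an octave is 12 semitones.
    """
    return 12.0 * math.log2(f_measured_hz / f_computed_hz)


def uniform_tube_resonances(length_m, n_modes=4, sound_speed=350.0):
    """Resonances (Hz) of a uniform tube, closed at one end, with p = 0 at the other.

    This is a crude stand-in for a real vocal tract geometry (assumption):
    f_k = (2k - 1) * c / (4 L) for k = 1, 2, ...
    """
    return [(2 * k - 1) * sound_speed / (4.0 * length_m)
            for k in range(1, n_modes + 1)]


# Hypothetical values for illustration only; not data from the paper.
computed = uniform_tube_resonances(0.175)   # four lowest tube resonances
measured_f1 = 530.0                         # made-up "formant" in Hz
discrepancy = semitone_distance(measured_f1, computed[0])
print(f"lowest resonance: {computed[0]:.1f} Hz, "
      f"discrepancy: {discrepancy:.2f} semitones")
```

With these made-up numbers the lowest resonance is about 500 Hz and the discrepancy is roughly one semitone, which is the order of magnitude the abstract reports as attainable.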