Assessing accuracy of resonances obtained with reassigned spectrograms from the “ground truth” of physical vocal tract models

Christine H. Shadle,Sean A. Fulop,Wei-Rong Chen,D. H. Whalen
DOI: https://doi.org/10.1121/10.0024548
2024-02-01
The Journal of the Acoustical Society of America
Abstract:The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). “Comparing measurement errors for formants in synthetic and natural vowels,” J. Acoust. Soc. Am. 139(2), 713–727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
acoustics,audiology & speech-language pathology
What problem does this paper attempt to address?