Polynomial Regression Model for Duration Prediction in Mandarin.

孙璐,胡郁,王仁华
DOI: https://doi.org/10.21437/interspeech.2004-292
2005-01-01
Abstract: Duration information is an essential part of speech prosody and plays a critical role in the naturalness and intelligibility of synthesized speech. Duration modeling establishes a mapping between the prosodic environment and the segmental duration realized in natural speech. In this paper, we first measure the effect of each prosodic feature on segmental duration using a statistical measure, eta squared. We then select the most influential prosodic features and design an algorithm to quantify the interactions among them. Finally, we propose modeling duration with a polynomial equation whose coefficients are obtained through non-linear regression. Our results indicate that 5 or 6 prosodic features are largely sufficient for a close and accurate mapping between prosodic environment and perceived duration. Compared with the Wagon tree method, the proposed method shows clear advantages.
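The two statistical ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-feature setup, and the degree-2 polynomial form are assumptions made for illustration, and ordinary least squares stands in for the non-linear regression mentioned in the abstract. Eta squared is computed in its standard form, the between-group sum of squares divided by the total sum of squares.

```python
import numpy as np


def eta_squared(levels, durations):
    """Share of duration variance explained by one categorical feature.

    Standard definition: SS_between / SS_total.
    """
    levels = np.asarray(levels)
    durations = np.asarray(durations, dtype=float)
    grand_mean = durations.mean()
    ss_total = ((durations - grand_mean) ** 2).sum()
    ss_between = 0.0
    for level in np.unique(levels):
        group = durations[levels == level]
        ss_between += group.size * (group.mean() - grand_mean) ** 2
    return ss_between / ss_total


def design_matrix(x1, x2):
    """Degree-2 polynomial terms for two numerically coded features.

    Assumed form (hypothetical): 1, x1, x2, x1*x2, x1^2, x2^2.
    The x1*x2 column models the interaction between the two features.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])


def fit_polynomial(x1, x2, durations):
    """Least-squares estimate of the polynomial coefficients."""
    target = np.asarray(durations, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design_matrix(x1, x2), target, rcond=None)
    return coeffs


def predict(coeffs, x1, x2):
    """Predicted durations for new feature values."""
    return design_matrix(x1, x2) @ coeffs
```

In such a workflow, one would presumably rank candidate prosodic features by `eta_squared`, keep the highest-scoring ones, and fit the polynomial on those; the actual feature set and interaction algorithm are described in the paper itself, not here.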