Analysis and Synthesis of Continuous Voice Pitch Contour for Improving Chinese Synthetic Speech Naturalness

田岚,陆小珊,杨霓清
DOI: https://doi.org/10.3969/j.issn.1672-3961.2003.04.017
2003-01-01
Abstract: The F0 contour of continuous speech plays a key role in the naturalness and emotional expressiveness of text-to-speech (TTS) conversion systems. Using statistical methods and clustering by the sequential position of each syllable, we systematically analyzed a large number of pitch contours from continuous Chinese speech. On this basis, we introduce a hierarchical prosody analysis and synthesis model that takes the characteristics of Mandarin fully into account: it incorporates all tone patterns and the dynamic trend of the phrase, and it defines the corresponding control command parameters and tone sandhi rules. The model quantitatively describes the relationship between prosodic features and the multi-layer linguistic information of Chinese. Simulation tests on some typical natural utterances show that the synthetic F0 contours correspond well to the target samples, and the model is expected to markedly improve the naturalness of TTS synthetic speech.
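To make the idea of layering tone patterns on a phrase-level trend more concrete, the following is a minimal sketch and not the model from the paper. It assumes a simple superposition in the semitone domain: each syllable contributes a tone shape taken from the conventional five-level (Chao) description of the four Mandarin tones, a linear declination stands in for the phrase dynamic trend, and the only sandhi rule applied is the well-known third-tone rule in its simplest form. All function names, parameter names, and default values (`base_hz`, `range_semitones`, `decline_semitones`, `frames_per_syllable`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; the paper's actual parameters and rules are not given in the abstract.

# Conventional five-level (Chao) targets for the four Mandarin tones: 55, 35, 214, 51.
# The middle point for tones 2 and 4 is an assumed linear midpoint.
CHAO_TARGETS = {1: (5, 5, 5), 2: (3, 4, 5), 3: (2, 1, 4), 4: (5, 3, 1)}


def apply_third_tone_sandhi(tones):
    """Simplified rule: a third tone directly before another third tone becomes a second tone."""
    out = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            out[i] = 2
    return out


def interpolate(points, n):
    """Resample a short list of target levels to n frames by piecewise-linear interpolation (n >= 2)."""
    result = []
    for k in range(n):
        pos = k * (len(points) - 1) / (n - 1)
        lo = min(int(pos), len(points) - 2)
        frac = pos - lo
        result.append(points[lo] * (1 - frac) + points[lo + 1] * frac)
    return result


def synthesize_f0(tones, base_hz=100.0, range_semitones=12.0,
                  decline_semitones=3.0, frames_per_syllable=10):
    """Return an F0 contour in Hz: phrase trend plus per-syllable tone shape, summed in semitones."""
    tones = apply_third_tone_sandhi(tones)
    total = len(tones) * frames_per_syllable
    contour = []
    for s, tone in enumerate(tones):
        shape = interpolate(CHAO_TARGETS[tone], frames_per_syllable)
        for k, level in enumerate(shape):
            frame = s * frames_per_syllable + k
            # Phrase layer (assumed): linear declination across the whole phrase.
            trend = -decline_semitones * frame / (total - 1)
            # Tone layer (assumed): map Chao levels 1..5 onto 0..range_semitones.
            local = (level - 1) / 4 * range_semitones
            contour.append(base_hz * 2 ** ((trend + local) / 12))
    return contour


if __name__ == "__main__":
    # "ni3 hao3" is realised as tone 2 followed by tone 3 under the sandhi rule.
    print(apply_third_tone_sandhi([3, 3]))
    print([round(f, 1) for f in synthesize_f0([3, 3])])
```

A real system would replace the fixed tone shapes and the linear declination with parameters estimated from data, as the statistical analysis described in the abstract suggests; the sketch only shows how the layers could combine.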