Modeling prosody pattern of Chinese expressive speech and its application in personalized speech conversion

Zhang Zhang, Zhiyong Wu, Jia Jia, Lianhong Cai
2012-01-01
Abstract: This paper proposes an approach for modeling the prosody patterns of acoustic features in Chinese expressive speech. In a Chinese multi-syllabic prosodic word, one syllable is identified as the core syllable, based on the observation that speakers usually place more emphasis on it. The variations in acoustic features from neutral to expressive speech are then analyzed for both core and non-core syllables. The analysis finds that the acoustic variations of the core syllable are the most significant, that the variations of the non-core syllables are influenced by the core syllable, and that this influence decreases as a non-core syllable lies farther from the core syllable. A double-layer perturbation model is then proposed to capture these prosody patterns, and it is further applied to generate personalized prosody patterns for personalized speech conversion. Experimental results indicate that our model can catch …