Prosodic Word Boundaries Prediction for Mandarin Text-to-Speech

YanQiu Shao, JiQing Han, Ting Liu, YongZhen Zhao
2004-01-01
Abstract: In Mandarin speech, the prosodic word (PW), rather than the lexical word (LW), is the basic rhythmic unit, and the naturalness of text-to-speech (TTS) output is directly affected by PW segmentation. Most PWs are combinations of several LWs. In this paper, three models are designed to combine lexical words into prosodic words: a directed acyclic graph (DAG) model, a segmentation model, and a Markov Model (MM) combined with the Transformation-Based Error-Driven (TBED) learning algorithm. Because some long LWs should be broken into two or more PWs, a long word break model is also applied to those LWs. Experimental results show that the MM combined with TBED plus the long word break model performs best among the three methods, achieving 93.00% precision and 93.23% recall.
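The precision and recall figures above can be read as boundary-level scores. The sketch below shows one common way to compute them, assuming (the abstract does not say so) that a predicted PW boundary counts as correct when it falls at the same character offset as a reference boundary. The function names and the placeholder strings are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def boundary_offsets(words: List[str]) -> Set[int]:
    """Return the character offsets at which each prosodic word ends."""
    offsets: Set[int] = set()
    position = 0
    for word in words:
        position += len(word)
        offsets.add(position)
    return offsets


def precision_recall(predicted: List[str], reference: List[str]) -> Tuple[float, float]:
    """Boundary-level precision and recall (an assumed scoring scheme)."""
    pred = boundary_offsets(predicted)
    ref = boundary_offsets(reference)
    correct = len(pred & ref)  # boundaries present in both segmentations
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


# Placeholder segmentations of the same 7-character string.
reference = ["ab", "cde", "fg"]
predicted = ["ab", "cd", "efg"]
p, r = precision_recall(predicted, reference)
```

Under this scheme the utterance-final boundary always matches, so an evaluation may prefer to exclude it; whether the paper does so is not stated in the abstract.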