Mongolian prosodic phrase prediction using suffix segmentation

Rui Liu, Feilong Bao, Guanglai Gao, Weihua Wang
DOI: https://doi.org/10.1109/IALP.2016.7875979
2016-01-01
Abstract: Accurate prosodic phrase prediction can improve the naturalness of speech synthesis. Prosodic phrase prediction can be treated as a sequence labeling problem, and the Conditional Random Field (CRF) is typically used to solve it. Mongolian is an agglutinative language in which a very large number of words can be formed by concatenating stems and suffixes. This property makes it difficult to build a high-performance CRF-based Mongolian prosodic phrase prediction system. We introduce a new method that segments each Mongolian word into stem and suffix, each treated as an individual token. The proposed method integrates multiple features according to the characteristics of Mongolian word formation. We conduct comparative experiments with the following features: word, multi-level Part-of-Speech (POS), multi-level lexical features for the suffix, and suffix existence. The experimental results show that our method significantly improves the performance of the Mongolian prosodic phrase prediction system compared with the conventional method, which treats each Mongolian word directly as a token. The word feature, the level-one lexical feature for the suffix, and the suffix-existence feature are effective. The best result, measured by F1-measure, is 82.49%.
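The tokenization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that words arrive already segmented, with a hypothetical "-" marker at each stem/suffix boundary, and it emits one feature dictionary per stem or suffix token in the shape commonly fed to CRF toolkits. The names `SUFFIX_MARK`, `segment`, and `token_features`, the feature keys, and the handling of more than one suffix per word are all assumptions made for illustration. The input strings are placeholders, not real Mongolian words.

```python
from typing import Dict, List

# Hypothetical boundary marker between a stem and its suffixes.
# The abstract does not specify how boundaries are encoded.
SUFFIX_MARK = "-"


def segment(word: str) -> List[str]:
    """Split 'stem-sfx' into ['stem', '-sfx']; a bare stem is returned unchanged."""
    stem, *suffixes = word.split(SUFFIX_MARK)
    # Keep the marker on each suffix so suffix tokens stay distinguishable from stems.
    return [stem] + [SUFFIX_MARK + s for s in suffixes if s]


def token_features(words: List[str]) -> List[Dict[str, str]]:
    """Turn a word sequence into one feature dict per stem or suffix token."""
    rows: List[Dict[str, str]] = []
    for word in words:
        pieces = segment(word)
        has_suffix = len(pieces) > 1
        for piece in pieces:
            rows.append({
                "token": piece,                                   # surface form of the token
                "is_suffix": str(piece.startswith(SUFFIX_MARK)),  # token-level suffix flag
                "word_has_suffix": str(has_suffix),               # suffix-existence feature
            })
    return rows


if __name__ == "__main__":
    # Placeholder input: one suffixed word and one bare stem.
    for row in token_features(["stem1-sfxA", "stem2"]):
        print(row)
```

In a full system, each feature dictionary would be paired with a prosodic boundary label and passed to a CRF trainer. POS and suffix lexical-class features would be added as further keys in the same dictionaries.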