Prosodic Structure Prediction Based on Conditional Random Field Model

董远,周涛,董乘宇,王海拉
DOI: https://doi.org/10.3969/j.issn.1007-5321.2009.05.009
2009-01-01
Abstract: Prosodic structure prediction is an important component of Mandarin text-to-speech (TTS) systems. This paper proposes a prosodic structure prediction method based on the conditional random field (CRF) algorithm. The prosodic word model and the prosodic phrase model are trained with CRFs on automatically segmented and tagged features, together with hierarchical prosodic structure information extracted from a large-scale, manually labeled speech corpus. The approach achieves an F-score of 90.67% in prosodic word prediction and 80.05% in prosodic phrase prediction, which are 3.62% and 5.65% higher, respectively, than those of a maximum entropy (ME) based method. Experimental results show that the CRF-based approach considerably improves prosodic structure prediction and works well in a real Mandarin TTS system.
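The abstract reports results as F-scores but does not say how they are computed. As a minimal sketch, assuming the standard definition (the harmonic mean of precision and recall over predicted boundary labels), the metric could be calculated as below. The function name `boundary_f_score`, the tag names "B" and "N", and the toy sequences are hypothetical illustrations, not taken from the paper.

```python
def boundary_f_score(reference, predicted, boundary="B"):
    """Return (precision, recall, F-score) for one boundary label.

    Assumption: each token carries one tag, and `boundary` marks a
    prosodic boundary after that token. This is an illustrative setup,
    not the paper's documented evaluation protocol.
    """
    if len(reference) != len(predicted):
        raise ValueError("sequences must have the same length")

    pairs = list(zip(reference, predicted))
    # True positives: boundary in both reference and prediction.
    tp = sum(1 for r, p in pairs if r == boundary and p == boundary)
    # False positives: predicted boundary that the reference lacks.
    fp = sum(1 for r, p in pairs if r != boundary and p == boundary)
    # False negatives: reference boundary that the prediction missed.
    fn = sum(1 for r, p in pairs if r == boundary and p != boundary)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): "B" = boundary after the token, "N" = none.
reference = ["N", "B", "N", "B", "N", "B"]
predicted = ["N", "B", "B", "B", "N", "N"]
precision, recall, f_score = boundary_f_score(reference, predicted)
```

Under this definition, a model is penalized both for inserting spurious boundaries and for missing true ones, which is why a single F-score is a common summary for boundary prediction tasks.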