Automatic Emphasis Labeling for Emotional Speech by Measuring Prosody Generation Error

Jun Xu, Lian-Hong Cai
DOI: https://doi.org/10.1007/978-3-642-04070-2_20
2009-01-01
Abstract: Emotion helps humans express their feelings and intentions clearly, and the emphasis labels of speech are key to speech emotion analysis and synthesis. To label emotional emphasis in speech samples from a corpus that has only phonetic and prosodic information, this paper introduces an automatic labeling algorithm that measures the prosody generation error (PGE) of the output of a statistical synthesizer. Classification and Regression Tree (CART) and Maximum Entropy (ME) modeling are adopted for automatic labeling. Experiments show that both models are helpful for labeling.
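The idea of measuring a prosody generation error can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define how PGE is computed, and the paper uses CART and ME models for labeling rather than a fixed threshold. The sketch assumes PGE is the per-syllable absolute difference between an observed prosodic value (here F0 in Hz) and the value a synthesizer would predict, and it flags syllables whose error is unusually large. The function names, the z-score threshold, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def prosody_generation_error(observed, predicted):
    """Per-syllable absolute difference between observed and predicted values.

    Assumption: PGE is modeled here as a simple absolute error on one
    prosodic feature; the paper may define it differently.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [abs(o - p) for o, p in zip(observed, predicted)]


def label_emphasis(errors, z_threshold=1.0):
    """Flag syllables whose error is more than z_threshold std devs above the mean.

    Assumption: a z-score threshold stands in for the CART / ME
    classifiers named in the abstract.
    """
    mu = mean(errors)
    sigma = pstdev(errors)
    if sigma == 0:
        # All errors are identical, so nothing stands out.
        return [False] * len(errors)
    return [(e - mu) / sigma > z_threshold for e in errors]


# Hypothetical F0 values (Hz) for five syllables.
observed_f0 = [200.0, 210.0, 290.0, 205.0, 198.0]
predicted_f0 = [202.0, 208.0, 215.0, 207.0, 200.0]

errors = prosody_generation_error(observed_f0, predicted_f0)
labels = label_emphasis(errors)
print(list(zip(errors, labels)))
```

In this toy input only the third syllable deviates strongly from the predicted contour, so it is the only one flagged. A real system would combine several prosodic features and learn the decision boundary from labeled data.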