Prosody Model for Mandarin Text-to-Speech System

WANG Zhi-wei, SHAO Yan-qiu, ZHAO Yong-zhen, LIU Ting
DOI: https://doi.org/10.3969/j.issn.1001-3695.2006.06.024
2006-01-01
Abstract: The prosody model is an essential part of a text-to-speech system and plays an important role in the naturalness of synthesized speech. This paper integrates artificial neural networks with unit selection in the prosody model and applies them to the generation of duration and pitch. It presents a three-layer back-propagation neural network for the duration model, and an algorithm for the pitch model based on minimizing the summed distance over a whole utterance.
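
The abstract does not say how the distance is defined or how the minimum is searched for. As a rough illustration only, the sketch below assumes each syllable has a predicted target pitch and a small set of candidate units, each reduced to a single pitch value, and uses dynamic programming to pick one candidate per syllable so that the summed distance over the whole utterance is smallest. The function name `select_units`, the absolute-difference distances, and the `join_weight` parameter are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def select_units(
    targets: Sequence[float],
    candidates: Sequence[Sequence[float]],
    join_weight: float = 1.0,
) -> List[int]:
    """Return one candidate index per position, minimizing the summed distance.

    Assumed cost (not from the paper): |candidate - target| at each position,
    plus join_weight * |candidate - previous candidate| between neighbours.
    """
    n = len(targets)
    if n == 0:
        return []

    # cost[i][j]: smallest total distance of any path ending at candidate j of position i
    cost = [[abs(c - targets[0]) for c in candidates[0]]]
    # back[i][j]: index of the previous candidate on that best path
    back = [[-1] * len(candidates[0])]

    for i in range(1, n):
        row_cost, row_back = [], []
        prev = candidates[i - 1]
        for c in candidates[i]:
            # Best predecessor for this candidate, including the join distance.
            best_k = min(
                range(len(prev)),
                key=lambda k: cost[i - 1][k] + join_weight * abs(c - prev[k]),
            )
            total = (
                cost[i - 1][best_k]
                + join_weight * abs(c - prev[best_k])
                + abs(c - targets[i])
            )
            row_cost.append(total)
            row_back.append(best_k)
        cost.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path
```

With `join_weight=0.0` the search reduces to picking the candidate closest to each target independently; a larger weight trades per-syllable accuracy for a smoother contour across the utterance.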