Design of Speech Corpus for Mandarin Text to Speech

J. Tao, Fangzhou Liu, Meng Zhang, Huibin Jia
2008-01-01
Abstract: This paper introduces the CASIA Mandarin corpus, designed for Mandarin speech synthesis research. It was carefully recorded by a professional female speaker under studio conditions. The corpus contains 5000 sentences balanced for phonetic context, totaling about 7 hours of speech. Text transcriptions with word boundaries, POS tags, and pronunciations are also included. The final corpus was delivered to Blizzard Challenge 2008 as the common corpus for Mandarin speech synthesis evaluation among all participants.