Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech
Zhaojie Liu, Jian Shao, Pengyuan Zhang, Qingwei Zhao, Yonghong Yan, Ji Feng
DOI: https://doi.org/10.1109/icnc.2007.595
2007-01-01
Abstract: This paper presents an approach to tone recognition in Mandarin conversational telephone speech (CTS) based on a real context model. The real context model is a new concept designed around the fact that Mandarin CTS exhibits complicated tone behavior caused by physiological articulation. Because pitch is a suprasegmental feature, the pitch value of the current tone is influenced by its context, especially in CTS, where the speaking rate is fast. A real context model covers not only the current tone but also the relative pitch level of the preceding tone. The training data, annotated with real contexts, are then clustered into a few subsets to generate a more refined tone model. Gaussian mixture models (GMMs) are used for tone modeling. In addition, a similarity measure is employed to compute the distance between two tones; it is intended to reflect both the similarity of their pitch contour shapes and the difference in their pitch height. All experiments are based on the Mandarin CTS database Train04. The proposed methods improve tone recognition accuracy by 4.7%.
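The abstract does not give the formula for the similarity measure, so the following is only a minimal sketch of one way a distance could combine contour shape and pitch height. The function name `tone_distance`, the `height_weight` parameter, the use of an RMS shape term, and the example contours are all assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def tone_distance(contour_a, contour_b, height_weight=0.5):
    """Hypothetical distance between two equal-length pitch contours.

    Assumption: each contour is a list of pitch samples (for example,
    normalized log-F0 values) taken at the same number of points.
    """
    if len(contour_a) != len(contour_b) or not contour_a:
        raise ValueError("contours must be non-empty and of equal length")

    n = len(contour_a)
    mean_a = sum(contour_a) / n
    mean_b = sum(contour_b) / n

    # Shape term: RMS difference after removing each contour's mean height,
    # so two contours with the same shape at different heights score 0 here.
    shape = sqrt(
        sum(((a - mean_a) - (b - mean_b)) ** 2
            for a, b in zip(contour_a, contour_b)) / n
    )

    # Height term: absolute difference between the mean pitch levels.
    height = abs(mean_a - mean_b)

    return shape + height_weight * height


# Illustrative contours (made-up values, not data from the paper).
high_level = [5.0, 5.0, 5.0, 5.0]
low_level = [3.0, 3.0, 3.0, 3.0]
rising = [3.0, 3.5, 4.5, 5.0]

same_shape_different_height = tone_distance(high_level, low_level)
different_shape = tone_distance(rising, high_level)
```

In this sketch, two level contours at different heights differ only through the height term, while a rising contour compared with a level one also contributes a shape term. Setting `height_weight` to zero would make the measure blind to pitch height, which is the case the abstract says the measure should avoid.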