Multi-level Linguistic Knowledge Based Chinese Grapheme-to-Phoneme Conversion.

Yi Liu, Xiaojun Chen, Caixia Gong, Xihong Wu
DOI: https://doi.org/10.1007/978-3-642-42057-3_61
2013-01-01
Abstract: This paper proposes a novel method that integrates multi-level linguistic knowledge for Chinese grapheme-to-phoneme (G2P) conversion. Pronunciation prediction for non-standard words (NSWs) and disambiguation of polyphonic characters are two important issues in Chinese G2P conversion. To exploit linguistic knowledge, multi-level linguistic cues, including word form, part-of-speech (POS), named entity, collocation, and syntactic structure, are extracted under a unified syntactic parsing framework and integrated with a maximum entropy approach to disambiguate polyphonic characters. In addition, text normalization is incorporated into this framework to help predict the pronunciation of non-standard words. Experimental results show that the proposed method improves performance from 95.64% to 99.23%. © 2013 Springer-Verlag Berlin Heidelberg.
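To make the idea of integrating several linguistic cues with a maximum entropy model concrete, the following is a minimal sketch, not the authors' implementation. It trains a tiny maximum entropy (multinomial logistic) classifier with plain gradient updates to choose between two readings of the polyphonic character 行 (hang2 vs. xing2). The feature template, the POS tags, the "head" cue standing in for syntactic structure, and the four training samples are all hypothetical and exist only for illustration; the paper's actual features come from a syntactic parser.

```python
import math
from collections import defaultdict


def extract_features(word, pos, is_named_entity, head_word):
    """Map hypothetical multi-level cues to sparse binary feature names."""
    return [
        f"word={word}",          # word form
        f"pos={pos}",            # part-of-speech tag
        f"ne={is_named_entity}", # named-entity flag
        f"head={head_word}",     # stand-in for a syntactic cue
    ]


def predict_proba(weights, labels, feats):
    """Softmax over the summed feature weights for each candidate reading."""
    scores = [sum(weights[(f, y)] for f in feats) for y in labels]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {y: e / total for y, e in zip(labels, exps)}


def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in samples:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                # Gradient: observed indicator minus expected probability.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


def predict(weights, labels, feats):
    """Return the most probable reading."""
    probs = predict_proba(weights, labels, feats)
    return max(probs, key=probs.get)


# Toy, hand-made training data (an assumption, not from the paper).
LABELS = ["hang2", "xing2"]
TOY_SAMPLES = [
    (extract_features("银行", "NN", False, "去"), "hang2"),
    (extract_features("行业", "NN", False, "发展"), "hang2"),
    (extract_features("行走", "VV", False, "他"), "xing2"),
    (extract_features("旅行", "VV", False, "去"), "xing2"),
]

weights = train_maxent(TOY_SAMPLES, LABELS)

# An unseen word: only the POS and named-entity cues carry learned weight.
unseen = extract_features("执行", "VV", False, "计划")
print(predict(weights, LABELS, unseen))
```

In this sketch each cue contributes an additive weight per candidate reading, so evidence from different linguistic levels is combined in a single probabilistic decision, which is the general property that makes maximum entropy models convenient for mixing heterogeneous features.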