Integrating Pinyin to Improve Spelling Errors Detection for Chinese Language

Peng Jin, Xingyuan Chen, Zhaoyi Guo, Pengyuan Liu
DOI: https://doi.org/10.1109/wi-iat.2014.71
2014-01-01
Abstract: Most Chinese text is entered by keyboard using one of two input methods, Pinyin or Wubi, with Pinyin the more common. In this paper, we exploit this user habit to detect spelling errors automatically. We first train a character-level n-gram language model on a large-scale Chinese corpus in the traditional way. To improve on this character-based model, we convert the whole corpus into Pinyin and train a Pinyin-based language model. Furthermore, we take tone into account to obtain a third model. By integrating these three models, we improve the performance of the spelling error checking system. Experimental results demonstrate the effectiveness of our model.
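The abstract does not say how the three models are combined, so the following is only a minimal sketch under stated assumptions: bigram models with add-one smoothing, a weighted sum of log-probabilities as the integration step, and a tiny hand-written character-to-Pinyin table (`TONED`) standing in for a real conversion tool. The corpus, the weights, and all names here are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: character -> Pinyin with the tone as a trailing digit.
TONED = {"我": "wo3", "在": "zai4", "再": "zai4",
         "家": "jia1", "见": "jian4", "们": "men5"}

def to_toned(chars):
    """Map a character sequence to toned Pinyin syllables."""
    return [TONED[c] for c in chars]

def to_plain(chars):
    """Map a character sequence to Pinyin syllables with the tone digit removed."""
    return [TONED[c].rstrip("12345") for c in chars]

class BigramModel:
    """Bigram language model with add-one smoothing (an assumed choice)."""

    def __init__(self, sequences):
        self.context = Counter()   # how often each token appears as a left context
        self.bigrams = Counter()
        self.vocab = set()
        for seq in sequences:
            tokens = ["<s>"] + list(seq)
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def logprob(self, prev, cur):
        v = len(self.vocab)
        return math.log((self.bigrams[(prev, cur)] + 1) / (self.context[prev] + v))

    def sequence_logprob(self, seq):
        tokens = ["<s>"] + list(seq)
        return sum(self.logprob(p, c) for p, c in zip(tokens, tokens[1:]))

def sentence_score(chars, char_lm, plain_lm, toned_lm, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three models' log-probabilities (weights are hypothetical)."""
    views = [
        (char_lm, list(chars)),        # character form
        (plain_lm, to_plain(chars)),   # Pinyin without tone
        (toned_lm, to_toned(chars)),   # Pinyin with tone
    ]
    return sum(w * lm.sequence_logprob(tokens)
               for w, (lm, tokens) in zip(weights, views))

# Toy corpus; the paper uses a large-scale corpus instead.
corpus = ["我在家", "再见", "我们在家"]
char_lm = BigramModel(corpus)
plain_lm = BigramModel(to_plain(s) for s in corpus)
toned_lm = BigramModel(to_toned(s) for s in corpus)

good = sentence_score("我在家", char_lm, plain_lm, toned_lm)
bad = sentence_score("我再家", char_lm, plain_lm, toned_lm)  # 再 typed in place of 在
print(good > bad)
```

In this toy setup 在 and 再 share the same Pinyin, so the two Pinyin models score both sentences identically and only the character model separates them. A sentence whose Pinyin sequence is plausible while its character sequence is not is the kind of signal one might expect a Pinyin-input error to leave, though how the paper actually uses that signal is not stated in the abstract.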