Input Chinese Sentences Using Digits

Fang Zheng, Jian Wu, Wenhu Wu
DOI: https://doi.org/10.21437/icslp.2000-494
2000-01-01
Abstract: Chinese character input has always been a key issue in a variety of Chinese-based applications, especially when only a small numeric keypad is available. Although many Chinese character encoding schemes have been proposed based on character characteristics such as shape, they are not straightforward and take users a long time to learn. An easier way is to input via pinyin. In this paper, we establish the mapping between digit strings and pinyin as well as the mapping between pinyin strings and words, referred to as the Syllable-Digit search Tree (SDT) and the Word-Syllable search Tree (WST), respectively. Using these two search trees together with a word N-gram language model and the syllable-synchronous network search (SSNS) algorithm, any digit string can be easily converted into a Chinese word sequence or sentence. Without users selecting from candidates, the character error rate (CER) of digit-to-character (D/C) conversion is 6.6% on a test text consisting of 22,083 characters.
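To make the digit-to-pinyin mapping concrete, here is a minimal sketch of a digit-keyed syllable trie in the spirit of the SDT. It is not the authors' implementation. It assumes the standard telephone keypad letter layout (the abstract does not state which layout is used), and the names `KEYPAD`, `SYLLABLES`, `encode`, `build_sdt`, and `segmentations` are hypothetical, as is the toy syllable list.

```python
# Assumption: standard telephone keypad layout; the abstract does not specify one.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical toy inventory; a real system would load the full pinyin syllable table.
SYLLABLES = ["ni", "mi", "hao", "zhong", "guo"]

END = "$"  # marker key holding the syllables that end at a trie node


def encode(syllable):
    """Map a pinyin syllable to its keypad digit string."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in syllable)


def build_sdt(syllables):
    """Build a trie keyed by digits; each node lists the syllables ending there."""
    root = {}
    for syl in syllables:
        node = root
        for d in encode(syl):
            node = node.setdefault(d, {})
        node.setdefault(END, []).append(syl)
    return root


def segmentations(digits, sdt):
    """Enumerate every split of `digits` into syllables found in the trie."""
    if not digits:
        return [[]]
    results = []
    node = sdt
    for i, d in enumerate(digits):
        node = node.get(d)
        if node is None:
            break  # no syllable continues along this digit path
        for syl in node.get(END, []):
            for rest in segmentations(digits[i + 1:], sdt):
                results.append([syl] + rest)
    return results


sdt = build_sdt(SYLLABLES)
candidates = segmentations("64426", sdt)
print(candidates)
```

With this toy inventory, the digit string `64426` yields two candidate syllable sequences, because `ni` and `mi` share the digits `64`. This ambiguity is the reason a ranking stage is needed: in the paper, that role is played by the WST, the word N-gram language model, and the SSNS search, none of which are modeled here.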