3 Method
[Figure: method flowchart. Recoverable box labels: Start, Input Text, Shape Character Conversion, Dictionary Matching, Multi-tokens, Polyphone Identification, OOV in Text, OOV Correction, Correct Words, End.]

Min Lu, Feilong Bao, Guanglai Gao
2019-01-01
Abstract: Spelling errors in classical Mongolian text are mainly caused by the misuse of polyphonic letters, which take the same shape in certain positions of a word. About half to three-quarters of classical Mongolian words are misspellings that have the correct appearance but the wrong codes. In this paper, we encode Mongolian words with glyph codes so that each word maps one-to-one to its shape. In addition, we propose a method for correcting out-of-vocabulary (OOV) words based on the Evolved Transformer, formalizing the correction task as a translation from misspellings to target spellings. The experimental results show that this approach achieves new state-of-the-art performance.
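The glyph-code idea in the abstract can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' implementation: Latin letters stand in for Mongolian code points, the `SAME_SHAPE` table is invented for illustration, and the names `glyph_key`, `build_index`, and `lookup` are hypothetical. It shows only how words that look identical but are coded differently collapse to one key for dictionary matching. The Evolved Transformer step for OOV words is not modeled.

```python
from collections import defaultdict

# Toy assumption: letters in the same set are drawn identically in that position.
# These groups are placeholders, not a real Mongolian glyph table.
SAME_SHAPE = {
    "initial": [{"o", "u"}],
    "medial": [{"o", "u"}, {"t", "d"}],
    "final": [{"o", "u"}],
}


def position(i, n):
    """Classify index i in a word of length n as initial, medial, or final."""
    if i == 0:
        return "initial"
    if i == n - 1:
        return "final"
    return "medial"


def glyph_key(word):
    """Replace each letter by a canonical representative of its shape group."""
    key = []
    for i, ch in enumerate(word):
        for group in SAME_SHAPE[position(i, len(word))]:
            if ch in group:
                ch = min(group)  # one fixed representative per shape group
                break
        key.append(ch)
    return "".join(key)


def build_index(dictionary):
    """Map each glyph key to the correctly coded dictionary words sharing it."""
    index = defaultdict(list)
    for word in dictionary:
        index[glyph_key(word)].append(word)
    return index


def lookup(index, word):
    """Return dictionary words with the same shape; an empty list means OOV."""
    return index.get(glyph_key(word), [])


index = build_index(["modu", "ta"])
print(lookup(index, "mudo"))  # same shape as "modu" under the toy table
print(lookup(index, "da"))    # no shape match, so treated as OOV
```

In this sketch a word whose key is missing from the index would be passed on to a separate OOV correction step, which the abstract describes as a translation model.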