Research and Implementation of Sentence Error Correction Method in Thangka Field

Tiejun Wang, Sujie Cheng, Yu Wang
2022-01-01
Abstract: Text error correction is a research area of natural language processing with applications in search engines, intelligent question answering, associative input, and similar systems. Most existing work targets the general domain, and little research addresses error correction for Thangka sentences. This paper proposes an error correction method for Thangka sentences. First, the Thangka data is pre-processed and used to train a language model, which improves the efficiency and accuracy of text recognition. Next, suspected error positions in a sentence are detected using a domain confusion set and word-granularity n-gram models. Finally, correction candidates are recalled by edit distance and by phonetically or visually similar words, and the candidates are ranked by their degree of confusion (perplexity). The proposed method has good application value for error correction in the Thangka domain.
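The detect, recall, and rank pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English corpus, the function names, the out-of-vocabulary detection rule, and the add-one smoothed bigram model are all assumptions made for illustration. The domain confusion set and the phonetic/shape-similar recall are omitted.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for pre-processed Thangka text.
CORPUS = [
    "thangka painting uses mineral pigment",
    "thangka painting depicts buddhist deity",
    "mineral pigment keeps thangka colour bright",
]


def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(tokens, unigrams, bigrams):
    """Perplexity of a token list under an add-one smoothed bigram model."""
    padded = ["<s>"] + tokens + ["</s>"]
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correct(sentence, unigrams, bigrams, max_dist=2):
    """Replace each suspected token with its lowest-perplexity candidate."""
    tokens = sentence.split()
    vocab = [w for w in unigrams if w not in ("<s>", "</s>")]
    for i, tok in enumerate(tokens):
        # Detection (simplified assumption): only unseen tokens are suspected.
        if tok in unigrams:
            continue
        # Recall: vocabulary words within max_dist edits of the suspect.
        candidates = [w for w in vocab if edit_distance(tok, w) <= max_dist]
        if not candidates:
            continue

        def score(word):
            return perplexity(tokens[:i] + [word] + tokens[i + 1:],
                              unigrams, bigrams)

        # Ranking: keep the candidate that makes the sentence least perplexing.
        tokens[i] = min(candidates, key=score)
    return " ".join(tokens)


unigrams, bigrams = train_bigram(CORPUS)
print(correct("thangka paintng uses mineral pigmant", unigrams, bigrams))
```

Ranking by whole-sentence perplexity, rather than by edit distance alone, lets the surrounding context choose among candidates that are equally close in spelling.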