Mongolian Morphological Segmentation by Phrase Based Statistical Machine Translation Method

Wen LI, Miao LI, Qing LIANG, Hai ZHU, Yulong YING, Wudabala
DOI: https://doi.org/10.3969/j.issn.1003-0077.2011.04.024
2011-01-01
Abstract: This paper presents a Mongolian morphological segmentation approach that combines phrase-based statistical machine translation with a minimum constituent-context cost model. The phrase-based statistical machine translation model handles the segmentation of in-vocabulary words, and the minimum constituent-context cost model handles out-of-vocabulary words. Three features commonly used in phrase-based statistical machine translation are selected for segmentation: the phrase translation probability, the lexical translation probability, and the language model score. The minimum constituent-context cost model considers the unigram morpheme context and the N-gram suffix context. Experiments show that the morphological segmentation system achieves a precision of 96.94% and that the translation results of the statistical machine translation system are clearly improved.
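As a rough illustration of how three such features might be combined, the sketch below ranks candidate segmentations with a log-linear score, the usual way features are combined in phrase-based statistical machine translation. It is not the authors' implementation: the feature names, the uniform weights, the placeholder morpheme strings, and the probability values are all hypothetical and chosen only to make the example run.

```python
import math

# Hypothetical feature weights; in a real system these would be tuned on held-out data.
WEIGHTS = {"phrase_prob": 1.0, "lexical_prob": 1.0, "lm_prob": 1.0}

# Hypothetical candidate segmentations of one word, with made-up feature values.
CANDIDATES = [
    {
        "segmentation": ["stem", "+suffixA"],
        "features": {"phrase_prob": 0.6, "lexical_prob": 0.5, "lm_prob": 0.4},
    },
    {
        "segmentation": ["stem", "+suf", "+fixA"],
        "features": {"phrase_prob": 0.3, "lexical_prob": 0.4, "lm_prob": 0.1},
    },
]


def loglinear_score(features, weights):
    """Return the weighted sum of log feature values (log-linear model score)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_segmentation(candidates, weights):
    """Pick the candidate whose log-linear score is highest."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))


best = best_segmentation(CANDIDATES, WEIGHTS)
print(best["segmentation"])
```

With these toy values the first candidate wins because each of its three feature values is higher. Out-of-vocabulary words, which the abstract assigns to the minimum constituent-context cost model, are outside the scope of this sketch.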