A Hybrid Model for Word Alignment with Bilingual Corpus

Chen Liang, Xu Jinan, Zhang Yujie
DOI: https://doi.org/10.1109/csae.2012.6272917
2012-01-01
Abstract: Word alignment is a key research problem in text information processing. In this paper, we propose a hybrid word alignment model that combines the five IBM models, a Word Entropy model, and a Support Information model [1]. The Support Information model has two sub-models: the Minimum Intersection model and the Minimum Difference model. Prior research indicates that the IBM models align words with high recall but low precision, whereas the Support Information model yields low recall but high precision, and the Word Entropy model effectively reduces the noise introduced by other words. We therefore combine these models to exploit their complementary strengths. In our experiments, the hybrid model achieves an F-measure of 89.19%, a recall of 88.74%, and a precision of 89.66%.
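The three reported scores can be related to one another with a short calculation. The abstract does not say which F-measure variant was used, so the sketch below assumes the balanced F-measure (the harmonic mean of precision and recall). The function name `f_measure` is illustrative and does not come from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the paper's F-measure is the balanced (beta = 1) variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_precision = 0.8966
reported_recall = 0.8874

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.2%}")
```

Under this assumption the computed value agrees with the reported 89.19% to within about 0.01 percentage points, a gap that rounding of the published precision and recall could account for.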