AUTOMATIC EXTRACTION OF CHINESE-ENGLISH PHRASE TRANSLATION PAIRS
Liu Ying, Tie Zheng, Yu Chang
DOI: https://doi.org/10.3969/j.issn.1000-386X.2012.07.022
2012-01-01
Abstract: This paper studies how to extract bilingual phrase translation pairs from a parallel corpus. Firstly, it uses a statistical model, normalized expectation (NE), to extract Chinese phrases from a Chinese patent corpus. The extracted phrases are filtered using statistical and linguistic knowledge, so that the filtered Chinese phrases have high precision. Secondly, it uses GIZA++, a word alignment tool, to align the words of a Chinese-English parallel corpus; once word alignment completes, it uses Moses, an open-source tool, to extract and align Chinese-English phrases. Combining the phrase alignments with the extracted high-quality Chinese phrases yields candidate Chinese-English translation pairs for the source-language phrases. Thirdly, it uses stop words, the log-likelihood ratio (LLR) and context entropy (CE) to filter the English phrase translations. Experimental results show that, after extraction, the precision of the Chinese phrases is 97.6% and that of the Chinese-English phrase pairs is 92.4%.
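The abstract does not give the formulas used in the filtering step, so the following is only a minimal sketch of the two statistics it names, using their common textbook definitions: context entropy as the Shannon entropy of the words adjacent to a phrase, and LLR as Dunning's log-likelihood ratio over a 2x2 co-occurrence table. The function names and the table layout are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def context_entropy(neighbors):
    """Shannon entropy (in bits) of the words seen next to a phrase.

    `neighbors` is a list of left (or right) context words collected from
    a corpus. A low value means the phrase almost always appears in the
    same context, which may indicate an incomplete fragment.
    """
    counts = Counter(neighbors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    Assumed layout (hypothetical, for illustration):
      k11: sentence pairs containing both the source and target phrase
      k12: pairs containing the source phrase only
      k21: pairs containing the target phrase only
      k22: pairs containing neither
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    observed = [k11, k12, k21, k22]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    # Cells with a zero observed count contribute nothing to the sum.
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)
```

A candidate pair could then be kept only when both scores exceed thresholds tuned on held-out data; the abstract does not state which thresholds, if any, the authors used.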