A Phrase Combination Approach to Patent SMT

Junguo Zhu, Muyun Yang, Tiejun Zhao, Sheng Li, Haoliang Qi
DOI: https://doi.org/10.2991/jcis.2008.99
2008-01-01
Abstract: This paper presents a phrase combination approach to patent SMT (Statistical Machine Translation) for Japanese to English. To minimize the segmentation problems caused by the many OOV (out-of-vocabulary) words in patent texts, character-based translation phrases are first introduced to avoid segmentation errors in translation modeling. Then word-based translation phrases, which are built to exploit the dependent word-level information, are combined with the character-based translation table by linearly integrating their probabilities. Our experiments on the NTCIR corpus indicate that the proposed method significantly outperformed the original word-based approach.
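The abstract does not spell out the combination formula, so the following is only a minimal sketch of what linearly integrating two phrase tables could look like. It assumes each table maps a (source phrase, target phrase) pair to a translation probability, that a pair missing from one table contributes zero from that table, and that a single weight `lam` balances the two. The function name `interpolate_tables`, the weight `lam`, and the toy entries are all hypothetical and are not taken from the paper.

```python
def interpolate_tables(word_table, char_table, lam=0.5):
    """Linearly interpolate two phrase tables (illustrative assumption).

    Each table maps (source_phrase, target_phrase) -> probability.
    combined = lam * p_word + (1 - lam) * p_char, where a pair absent
    from a table is treated as having probability 0.0 in that table.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    combined = {}
    # Take the union so phrases known to only one table are kept.
    for pair in set(word_table) | set(char_table):
        p_word = word_table.get(pair, 0.0)
        p_char = char_table.get(pair, 0.0)
        combined[pair] = lam * p_word + (1.0 - lam) * p_char
    return combined


# Toy tables with made-up probabilities (romanized source side).
word_table = {("tokkyo", "patent"): 0.8, ("tokkyo", "license"): 0.2}
char_table = {("tokkyo", "patent"): 0.6, ("tokkyo", "patent right"): 0.4}

combined = interpolate_tables(word_table, char_table, lam=0.5)
for pair, prob in sorted(combined.items()):
    print(pair, round(prob, 3))
```

If both input tables are normalized per source phrase, the interpolated table stays normalized as well, which is one reason a convex weighting is a natural reading of "linearly integrating their probability". How the weight was actually chosen in the paper is not stated in the abstract.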