Phrasal Syntactic Category Sequence Model for Phrase-Based MT

Hailong Cao, Eiichiro Sumita, Tiejun Zhao, Sheng Li
DOI: https://doi.org/10.1007/978-3-642-28601-8_5
2012-01-01
Abstract: Incorporating target syntax into phrase-based machine translation (PBMT) can generate syntactically well-formed translations. We propose a novel phrasal syntactic category sequence (PSCS) model that allows a PBMT decoder to prefer more grammatical translations. We parse all sentences on the target side of the bilingual training corpus. During the standard phrase pair extraction procedure, we assign a syntactic category to each phrase pair and build a PSCS model from the parallel training data. We then integrate the PSCS model log-linearly into a standard PBMT system. Our method is simple and yields a 0.7 BLEU point improvement over the baseline PBMT system.
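The abstract does not give the form of the PSCS model, so the following is only a minimal sketch of the general idea. It assumes a bigram model with add-one smoothing over the syntactic category labels of the phrases in a translation hypothesis, and adds that model's log-probability as one more weighted feature in a log-linear score. The function names, the category labels, and the feature weights are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count category bigrams with <s> and </s> padding (assumed model form)."""
    history, bigram, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded[1:])  # every symbol that can be predicted
        for prev, cur in zip(padded, padded[1:]):
            history[prev] += 1
            bigram[(prev, cur)] += 1
    return history, bigram, len(vocab)


def pscs_logprob(seq, model):
    """Log-probability of a category sequence under the add-one smoothed bigram model."""
    history, bigram, vocab_size = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    return sum(
        math.log((bigram[(prev, cur)] + 1) / (history[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def loglinear_score(features, weights):
    """Weighted sum of log-domain feature values, as in a log-linear model."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical category sequences read off target-side phrases.
model = train_bigram([["NP", "VP"], ["NP", "VP", "PP"], ["NP", "VP"]])
seen_order = pscs_logprob(["NP", "VP"], model)
unseen_order = pscs_logprob(["VP", "NP"], model)

# Two hypotheses with the same baseline score; the PSCS feature separates them.
weights = {"baseline": 1.0, "pscs": 0.5}  # illustrative weights, not tuned
score_a = loglinear_score({"baseline": -2.0, "pscs": seen_order}, weights)
score_b = loglinear_score({"baseline": -2.0, "pscs": unseen_order}, weights)
```

In this toy setting the hypothesis whose category order was observed in training receives the higher combined score, which is the kind of preference for more grammatical output that the abstract describes. In a real system the feature weight would be tuned together with the other decoder features.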