Improving syntactic rule extraction through deleting spurious links with translation span alignment.

Jingbo Zhu, Qiang Li, Tong Xiao
DOI: https://doi.org/10.1017/S1351324913000260
IF: 1.841
2015-01-01
Natural Language Engineering
Abstract: Most statistical machine translation systems rely on word alignments to extract translation rules. This approach has a practical weakness: even one spurious word alignment link can prevent desirable translation rules from being extracted. To address this issue, this paper presents two approaches, sub-tree alignment and phrase-based forced decoding, that automatically learn translation span alignments from parallel data. Translation rule extraction is then improved by deleting spurious links and inserting new links based on the bilingual translation span correspondences. Comparison experiments are designed to demonstrate the effectiveness of the proposed approaches.
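To illustrate why a single spurious link matters, the following is a minimal sketch, not the authors' implementation. It assumes that a word alignment is a set of (source index, target index) pairs, that a translation span alignment is a pair of inclusive index ranges, and that a link counts as spurious when exactly one of its endpoints falls inside the aligned spans. The names `is_consistent` and `delete_spurious_links` are hypothetical, and only link deletion is shown; link insertion is omitted.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)
Span = Tuple[int, int]  # inclusive (start, end) word indices


def in_span(idx: int, span: Span) -> bool:
    """Return True if idx lies inside the inclusive span."""
    return span[0] <= idx <= span[1]


def is_consistent(links: Set[Link], src: Span, tgt: Span) -> bool:
    """Standard consistency check used in rule extraction: no link may
    cross the boundary of the span pair, and at least one link must lie
    inside it."""
    has_inside_link = False
    for i, j in links:
        src_in, tgt_in = in_span(i, src), in_span(j, tgt)
        if src_in != tgt_in:  # link crosses the span-pair boundary
            return False
        has_inside_link = has_inside_link or src_in
    return has_inside_link


def delete_spurious_links(links: Set[Link], src: Span, tgt: Span) -> Set[Link]:
    """Assumed criterion: drop every link that conflicts with the given
    span alignment, i.e. has exactly one endpoint inside the spans."""
    return {(i, j) for (i, j) in links if in_span(i, src) == in_span(j, tgt)}


# Toy alignment: a diagonal alignment plus one spurious link (1, 3).
links = {(0, 0), (1, 1), (2, 2), (3, 3), (1, 3)}
# Hypothetical learned span alignment: source words 0-1 <-> target words 0-1.
src_span, tgt_span = (0, 1), (0, 1)

blocked_before = not is_consistent(links, src_span, tgt_span)
cleaned = delete_spurious_links(links, src_span, tgt_span)
extractable_after = is_consistent(cleaned, src_span, tgt_span)
```

In this toy case the link (1, 3) crosses the boundary of the span pair, so the rule covering source words 0-1 and target words 0-1 cannot be extracted until that link is removed.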