Aligning a Parallel English-Chinese Corpus Statistically with Lexical Criteria

Dekai Wu
DOI: https://doi.org/10.48550/arXiv.cmp-lg/9406007
1994-06-02
Computation and Language
Abstract: We describe our experience with automatic alignment of sentences in parallel English-Chinese texts. Our report concerns three related topics: (1) progress on the HKUST English-Chinese Parallel Bilingual Corpus; (2) experiments addressing the applicability of Gale & Church's length-based statistical method to the task of alignment involving a non-Indo-European language; and (3) an improved statistical method that also incorporates domain-specific lexical cues.