Development of Translation Database based on Chinese-English parallel corpora

He Lianzhen
DOI: https://doi.org/10.3969/j.issn.1003-6105.2007.02.009
2007-01-01
Abstract: This paper reports on a Sino-British joint project that aims to create a Chinese-English Translation Database listing English translation units together with their Chinese equivalents, and vice versa. For this purpose, the bilingual texts were first aligned at sentence level, and the Chinese and English texts were then annotated separately. Chinese and English multi-word units were then identified independently on each side of the aligned corpus. Next, computer software sought to establish the correspondence between Chinese and English translation units, producing a list of bilingual Translation Equivalent Pairs, which were manually validated and entered into the Translation Database. This approach is content-oriented: it takes unambiguous words or multi-word units as the basic translation units and builds a collection of bilingual translation units, i.e., translation units in one language paired with their translation equivalents in the other. The approach has been shown to help improve the efficiency and precision of machine translation.
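The abstract does not say how the software establishes correspondence between Chinese and English units. As a rough illustration only, the sketch below scores candidate pairs by how often two units co-occur in aligned sentence pairs, using the Dice coefficient, a common heuristic that is swapped in here and is not claimed to be the project's method. The function name `candidate_pairs`, the `min_dice` threshold, and the toy data are all hypothetical, and the input is assumed to be already sentence-aligned with translation units identified on each side.

```python
from collections import Counter
from itertools import product


def candidate_pairs(aligned, min_dice=0.5):
    """Propose candidate translation equivalent pairs (hypothetical helper).

    aligned: list of (zh_units, en_units) tuples, one per aligned sentence
             pair, where each element is a list of translation units.
    Returns (zh_unit, en_unit, dice) tuples with dice >= min_dice,
    sorted by descending score.
    """
    zh_count, en_count, joint = Counter(), Counter(), Counter()
    for zh_units, en_units in aligned:
        # Count each unit at most once per sentence pair.
        zh_set, en_set = set(zh_units), set(en_units)
        zh_count.update(zh_set)
        en_count.update(en_set)
        # Every cross-language combination in the pair is a co-occurrence.
        joint.update(product(zh_set, en_set))

    scored = []
    for (zh, en), n in joint.items():
        # Dice coefficient: 2 * joint frequency / sum of unit frequencies.
        dice = 2 * n / (zh_count[zh] + en_count[en])
        if dice >= min_dice:
            scored.append((zh, en, round(dice, 3)))
    return sorted(scored, key=lambda t: (-t[2], t[0], t[1]))


# Toy data, invented for illustration; not taken from the project corpus.
aligned = [
    (["机器翻译", "效率"], ["machine translation", "efficiency"]),
    (["机器翻译", "语料库"], ["machine translation", "corpus"]),
    (["平行语料库", "语料库"], ["parallel corpus", "corpus"]),
]
pairs = candidate_pairs(aligned, min_dice=0.8)
for zh, en, score in pairs:
    print(zh, "<->", en, score)
```

In a workflow like the one described, the output of such a step would only be a list of candidates; the manual validation stage decides which pairs actually enter the database.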