Construction and Processing of a Parallel Corpus for Tang Poetry and Song Lyrics

Lei Wang, Houfeng Wang
DOI: https://doi.org/10.1109/mlise62164.2024.10674185
2024-01-01
Abstract: Since the 1980s, machine translation technology has developed rapidly. Aligned parallel corpora, whether used as a resource for computer-assisted translation or as training data for statistical machine translation, show great potential for application. Example-based machine translation (EBMT), proposed by Nagao Makoto, offers an opening for the automatic translation of poetry, which has long been considered a thorny problem in translation. This paper introduces the construction and processing of a Parallel Corpus for Tang Poetry and Song Lyrics (PCTS). Bilingual parallel corpora serve as fundamental resources for language teaching and research. With a concordancer, language teachers and learners can gain a better understanding of a given line in its various contexts, from which its rhetorical devices and rhyming pattern can also be observed.