Comparative Analysis of Language Models for Linguistic Examination of Ancient Chinese Classics: A Case Study of Zuozhuan Corpus
Yiqin Zhang, Sanhong Deng, Qi Zhang, Dongbo Wang, Hongcun Gong
DOI: https://doi.org/10.1109/IALP61005.2023.10337146
2023-01-01
Abstract: Comparing translation styles across languages is important for capturing the essence of ancient Chinese classics. This article analyzes language models for the linguistic examination of ancient Chinese classics, using the cross-language Zuozhuan corpus as a case study. We compare five pre-trained language models against the deep learning model Bi-LSTM-CRF on word segmentation and part-of-speech tagging. The best-performing trained models were then used to segment and tag the entire corpus of ancient Chinese classics. Building on this annotated corpus, we examine at the lexical level the linguistic style of the ancient Chinese classics and their corresponding English translations. Compared with the original Chinese text, the contemporary Chinese translation is semantically clearer, with phrases that serve relatively singular functions and a greater diversity of vocabulary combinations. The English translation, by contrast, tends toward simplification. The analysis covers part-of-speech distribution, word length variation, lexical richness, and textual density, offering insight into the cross-linguistic differences between these classics and their translations.
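The four lexical measures named in the abstract can be illustrated with a minimal Python sketch over a segmented, POS-tagged text. The abstract does not give the paper's formulas, so the definitions here are assumptions: lexical richness is taken as the type-token ratio, and textual density as the share of content words. The tag labels, the `CONTENT_TAGS` set, the function names, and the toy sample are all hypothetical and are not drawn from the Zuozhuan corpus.

```python
from collections import Counter

# Hypothetical content-word tags; the paper's actual tag set is not
# given in the abstract.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def pos_distribution(tagged):
    """Relative frequency of each POS tag in a non-empty tagged text."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def mean_word_length(tagged):
    """Average word length in characters."""
    return sum(len(word) for word, _ in tagged) / len(tagged)


def type_token_ratio(tagged):
    """Distinct word forms divided by total tokens (assumed richness measure)."""
    words = [word.lower() for word, _ in tagged]
    return len(set(words)) / len(words)


def lexical_density(tagged, content_tags=CONTENT_TAGS):
    """Share of tokens tagged as content words (assumed density measure)."""
    return sum(1 for _, tag in tagged if tag in content_tags) / len(tagged)


# Toy input: (word, tag) pairs as a segmenter and tagger might emit them.
sample = [
    ("the", "DET"),
    ("duke", "NOUN"),
    ("attacked", "VERB"),
    ("the", "DET"),
    ("city", "NOUN"),
]

print(pos_distribution(sample))
print(mean_word_length(sample))
print(type_token_ratio(sample))
print(lexical_density(sample))
```

Running the same functions over the original text, the contemporary Chinese translation, and the English translation would give directly comparable figures. Note that the type-token ratio is sensitive to text length, so texts of different sizes may need to be sampled to a common length first.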