Cross-Lingual Leveled Reading Based on Language-Invariant Features

Simin Rao, Hua Zheng, Sujian Li
DOI: https://doi.org/10.18653/v1/2021.findings-emnlp.227
2021-01-01
Abstract: Leveled reading (LR) aims to automatically classify texts according to different reading capabilities and provide appropriate reading materials to readers. However, most state-of-the-art LR methods rely on the availability of copious annotated resources, which prevents their adaptation to low-resource languages like Chinese. To tackle Chinese LR, we explore different language transfer methods for English-Chinese LR. Specifically, we focus on adversarial training and cross-lingual pre-training to transfer the LR knowledge learned from annotated data in the resource-rich English language to Chinese. For evaluation, we introduce an age-based standard to align datasets with different leveling standards, and conduct experiments in both zero-shot and few-shot settings. Experiments show that the cross-lingual pre-training method captures language-invariant features more effectively than adversarial training. We also analyze the results to propose further improvements in cross-lingual LR.
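The abstract does not spell out how the age-based standard is applied, so the following is only a minimal sketch of one way such an alignment might work: each dataset's own level is mapped to a typical reader age, and that age is mapped to a shared age band. The band boundaries, the two level-to-age tables, and the function names are all hypothetical and are not taken from the paper.

```python
# Hypothetical shared age bands (lower bound inclusive, upper bound exclusive).
# These boundaries are illustrative assumptions, not values from the paper.
AGE_BANDS = [(6, 9), (9, 12), (12, 15), (15, 18)]

# Hypothetical level-to-typical-age tables for two datasets that use
# different leveling standards (letter levels vs. numeric grades).
ENGLISH_LEVEL_TO_AGE = {"A": 7, "B": 10, "C": 13, "D": 16}
CHINESE_LEVEL_TO_AGE = {1: 6, 2: 8, 3: 9, 4: 11, 5: 12, 6: 14}


def age_to_band(age):
    """Return the index of the shared age band that contains `age`."""
    for index, (low, high) in enumerate(AGE_BANDS):
        if low <= age < high:
            return index
    raise ValueError(f"age {age} falls outside every band")


def align_labels(levels, level_to_age):
    """Convert dataset-specific levels into shared age-band labels."""
    return [age_to_band(level_to_age[level]) for level in levels]


english_labels = align_labels(["A", "C"], ENGLISH_LEVEL_TO_AGE)
chinese_labels = align_labels([1, 3, 6], CHINESE_LEVEL_TO_AGE)
```

Once both datasets share one label space, a classifier trained on the English labels could in principle be evaluated on the Chinese labels without any further relabeling, which is what a zero-shot comparison needs.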