Effective Strategies for Low-Resource Reading Comprehension

Yimin Jing, Deyi Xiong
DOI: https://doi.org/10.1109/ialp51396.2020.9310502
2020-01-01
Abstract: Machine reading comprehension (MRC) has recently reached human-level accuracy in resource-rich languages (e.g., English). However, for low-resource MRC, a large gap remains between machine and human performance because annotated data are limited. To narrow this gap, we investigate three strategies for improving low-resource MRC via knowledge transfer: data augmentation via translation, multilingual training, and cross-lingual fine-tuning. Experiments on a small Chinese MRC dataset (CMRC2018) demonstrate that these strategies can leverage knowledge from the English SQuAD dataset. Furthermore, combining the three strategies achieves significant improvements on the Delta Reading Comprehension Dataset (DRCD).
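The abstract names the three strategies without describing how they are carried out. The sketch below is only an illustration of how they could differ in arranging training data, not the authors' implementation. The `Example` schema, the function names, the `translate` callable, and the rule that drops translated examples whose answer no longer appears in the translated context are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Example:
    """One extractive MRC example (hypothetical minimal schema)."""
    lang: str
    context: str
    question: str
    answer: str


def augment_via_translation(
    source: List[Example],
    translate: Callable[[str], str],
    target_lang: str,
) -> List[Example]:
    """Strategy 1 (sketch): translate source-language examples into the target language.

    `translate` is a caller-supplied placeholder for any translation system.
    Assumption: a translated example is kept only if the translated answer
    still occurs verbatim in the translated context.
    """
    augmented = []
    for ex in source:
        context = translate(ex.context)
        question = translate(ex.question)
        answer = translate(ex.answer)
        if answer in context:
            augmented.append(Example(target_lang, context, question, answer))
    return augmented


def multilingual_training_set(
    source: List[Example], target: List[Example]
) -> List[Example]:
    """Strategy 2 (sketch): a single training set that mixes both languages."""
    return list(source) + list(target)


def cross_lingual_schedule(
    source: List[Example], target: List[Example]
) -> List[Tuple[str, List[Example]]]:
    """Strategy 3 (sketch): train on the source language, then fine-tune on the target."""
    return [
        ("train_on_source", list(source)),
        ("finetune_on_target", list(target)),
    ]
```

In this sketch the three strategies differ only in which examples reach the model and in what order. The model and its training loop are deliberately left out, since the abstract does not specify them.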