Abstract: Span-extraction reading comprehension models have advanced rapidly, enabled by the availability of large-scale, high-quality training datasets. Despite this progress and widespread application, extractive reading comprehension datasets in languages other than English remain scarce, and creating a sufficient amount of training data for each language is costly and in some cases infeasible. An alternative to building large-scale, high-quality monolingual span-extraction training datasets is to develop multilingual modeling approaches and systems that transfer to the target language without requiring training data in that language. To address the scarcity of extractive reading comprehension training data in the target language, we propose XLRC, a multilingual extractive reading comprehension approach that jointly models existing extractive reading comprehension training data in a multilingual environment using self-adaptive attention and multilingual attention. Specifically, we first construct multilingual parallel corpora by translating an existing extractive reading comprehension dataset (i.e., CMRC 2018) from the target language (i.e., Chinese) into languages from different language families (i.e., English). Second, to enhance the final target representation, we adopt self-adaptive attention (SAA), which combines self-attention and inter-attention to extract the semantic relations between each pair of target and source languages. Furthermore, we propose multilingual attention (MLA) to learn the rich knowledge available from various language families. Experimental results show that our model outperforms the state-of-the-art baseline (i.e., RoBERTa_Large) on the CMRC 2018 task, which demonstrates the effectiveness of the proposed multilingual modeling approach and shows its potential for multilingual NLP tasks.
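
As a rough illustration of how the two attention components described in the abstract could fit together, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not give the formulas, so everything concrete here is an assumption: the function names (`attend`, `self_adaptive_attention`, `multilingual_attention`), the parameter-free sigmoid gate that mixes self-attention and inter-attention, and the per-token softmax over languages. Random matrices stand in for the encoder outputs of the target passage and its translations.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attend(queries, keys):
    """Scaled dot-product attention in which the keys also serve as values."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ keys


def self_adaptive_attention(target, source):
    """Fuse self-attention (target over target) with inter-attention
    (target over one source language).

    ASSUMPTION: the sigmoid gate below is illustrative only; the abstract
    says the two attentions are combined but does not say how.
    """
    self_ctx = attend(target, target)    # (T, d)
    inter_ctx = attend(target, source)   # (T, d)
    agreement = (self_ctx * inter_ctx).sum(-1, keepdims=True)
    agreement = agreement / np.sqrt(target.shape[-1])
    gate = 1.0 / (1.0 + np.exp(-agreement))  # (T, 1), values in (0, 1)
    return gate * self_ctx + (1.0 - gate) * inter_ctx


def multilingual_attention(target, fused_per_language):
    """Weight the per-language fused representations for each target token.

    ASSUMPTION: a softmax over languages scored by dot product with the
    target token; the abstract does not specify the scoring function.
    """
    stacked = np.stack(fused_per_language, axis=0)            # (L, T, d)
    scores = (stacked * target[None, :, :]).sum(-1)           # (L, T)
    weights = softmax(scores / np.sqrt(target.shape[-1]), axis=0)
    return (weights[:, :, None] * stacked).sum(axis=0)        # (T, d)


# Hypothetical stand-ins for encoder outputs: 6 target tokens and two
# translated versions of the passage, all with hidden size 8.
rng = np.random.default_rng(0)
target = rng.normal(size=(6, 8))
sources = [rng.normal(size=(7, 8)), rng.normal(size=(5, 8))]

fused = [self_adaptive_attention(target, s) for s in sources]
enhanced = multilingual_attention(target, fused)
print(enhanced.shape)  # (6, 8)
```

In a span-extraction model, a representation like `enhanced` would typically feed the start and end position classifiers; with a single source language, the multilingual step in this sketch reduces to the fused representation for that language.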

Cross-Lingual Leveled Reading Based on Language-Invariant Features

Domain-specific Cross-Language Relevant Question Retrieval

Improving Cross-Lingual Reading Comprehension with Self-Training

Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation

Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking

From Good to Best: Two-Stage Training for Cross-Lingual Machine Reading Comprehension

Cross-Lingual Adaptation using Structural Correspondence Learning

A Simple and Effective Method to Improve Zero-Shot Cross-Lingual Transfer Learning

A Multilingual Modeling Method for Span-Extraction Reading Comprehension

Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model

Effective Transfer Learning for Low-Resource Natural Language Understanding

XCMRC: Evaluating Cross-lingual Machine Reading Comprehension

Cross-Lingual Training with Dense Retrieval for Document Retrieval

Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling

Zero-shot Cross-lingual Conversational Semantic Role Labeling

A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

Choosing Transfer Languages for Cross-Lingual Learning

Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference

Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training

Cross-Lingual Transfer Robustness to Lower-Resource Languages on Adversarial Datasets

Attention-Informed Mixed-Language Training for Zero-Shot Cross-Lingual Task-Oriented Dialogue Systems