A novel multi-domain machine reading comprehension model with domain interference mitigation
Chulun Zhou, Zhihao Wang, Shaojie He, Haiying Zhang, Jinsong Su
DOI: https://doi.org/10.1016/j.neucom.2022.05.102
IF: 6
2022-08-21
Neurocomputing
Abstract: Machine reading comprehension (MRC), an important task in natural language processing (NLP), aims to automatically answer a question after reading a passage. Dominant studies in this area mainly focus on domain-specific models. However, domain-specific models trained only on single-domain data often cannot achieve satisfactory performance. Although using data from other domains can bring some improvement, building an MRC model specific to each domain also makes deployment more difficult in practice. In this paper, we propose a multi-domain MRC model based on knowledge distillation (KD) with domain interference mitigation. Specifically, we employ KD to train a joint model by simultaneously using the multi-domain data and the output distributions of all domain-specific models. In this way, our joint model can better exploit multi-domain data while also enabling simpler deployment. Moreover, to deal with the gradient conflicts caused by using data from different domains, we measure domain-level gradient similarity, based on which we propose an improved PCGrad (short for projecting conflicting gradients) algorithm with an adaptive learning rate. The algorithm mitigates domain interference to improve our joint model across domains. Experimental results and in-depth analysis on a set of benchmark datasets demonstrate the effectiveness of our joint model and show that mitigating domain interference further improves its overall performance.
computer science, artificial intelligence
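
The abstract mentions measuring domain-level gradient similarity and projecting conflicting gradients. As a rough illustration only, the sketch below shows the standard PCGrad projection step on plain Python lists, with cosine similarity as the similarity measure. It is not the paper's improved variant: the adaptive learning rate is omitted, and all function names (`cosine_similarity`, `project_conflicting`, `combine`) are hypothetical and chosen for this example.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine similarity between two gradients; 0.0 if either is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


def project_conflicting(grads, seed=0):
    """Standard PCGrad step (assumption: one flattened gradient per domain).

    For each domain gradient g_i, visit the other domains in random order.
    Whenever g_i has a negative inner product with an original gradient g_j,
    subtract from g_i its component along g_j.
    """
    rng = random.Random(seed)
    projected = [list(g) for g in grads]  # work on copies
    for i, g_i in enumerate(projected):
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = grads[j]
            inner = dot(g_i, g_j)
            if inner < 0:  # conflict: the gradients point in opposing directions
                scale = inner / dot(g_j, g_j)
                for k in range(len(g_i)):
                    g_i[k] -= scale * g_j[k]
    return projected


def combine(grads):
    """Sum the per-domain gradients into one update direction."""
    return [sum(column) for column in zip(*grads)]


# Two toy "domain" gradients that conflict (negative cosine similarity).
domain_grads = [[1.0, 0.0], [-1.0, 1.0]]
similarity = cosine_similarity(domain_grads[0], domain_grads[1])
projected = project_conflicting(domain_grads)
update = combine(projected)
```

In this toy case each projected gradient ends up orthogonal to the other domain's original gradient, so the summed update no longer works against either domain. How the paper uses the similarity scores to adapt the learning rate is not specified in the abstract and is not modeled here.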