MA-MRC: A Multi-answer Machine Reading Comprehension Dataset
Zhiang Yue, Jingping Liu, Cong Zhang, Chao Wang, Haiyun Jiang, Yue Zhang, Xianyang Tian, Zhedong Cen, Yanghua Xiao, Tong Ruan
DOI: https://doi.org/10.1145/3539618.3592015
2023-01-01
Abstract: Machine reading comprehension (MRC) is an essential task for many question-answering applications. However, existing MRC datasets mainly focus on questions with a single answer and overlook questions with multiple answers, which are common in the real world. In this paper, we aim to construct an MRC dataset that contains both single-answer and multi-answer data. To this end, we design a novel pipeline consisting of four steps: data collection, data cleaning, question generation, and test set annotation. Using this pipeline, we construct a high-quality multi-answer MRC dataset (MA-MRC) with 129K question-answer-context samples. We implement a series of baselines and carry out extensive experiments on MA-MRC. The experimental results show that MA-MRC is a challenging dataset that can facilitate future research on the multi-answer MRC task.