Abstract: Machine Reading Comprehension (MRC) has achieved impressive answer inference performance in recent years, but work in the area rarely considers the trustworthiness and reliability of deployed systems. In real-world applications, it is crucial to estimate predictive uncertainty, that is, how likely a prediction is to be wrong, so that the system can abstain from low-confidence predictions and remain trustworthy. Prior studies measure predictive uncertainty through post-processing, for example by using heuristic softmax probabilities or by training a calibrator on top of a trained MRC model. However, these methods only calibrate confidence and do not account for domain adaptation. To address these limitations, this paper presents TrustMRC, a non-postprocessing trustworthy MRC system that leverages (1) a conditional calibration strategy to obtain reliable uncertainty estimates, and (2) a conditional adversarial learning strategy to learn transferable representations under domain shift. To estimate predictive uncertainty, a conditional calibration module predicts whether the output of the answer prediction module is correct, and it is combined with an additional expected calibration error (ECE) constraint to make the confidence more reliable. To handle domain shift, TrustMRC uses a conditional adversarial learning strategy that learns transferable representations through a domain discriminator with uncertainty constraints, taking both input alignment and uncertainty alignment into account. In addition, TrustMRC performs answer prediction and uncertainty prediction in a single end-to-end framework, so that the two sub-tasks can benefit from each other via multi-task learning. Instead of the traditional EM and F1 metrics, EM-coverage and F1-coverage curves are used for trustworthiness-aware MRC evaluation. Experimental results on the SQuAD 1.1, Natural Questions, and NewsQA datasets indicate that TrustMRC can make reliable predictions under domain shift.
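The two evaluation quantities named in the abstract, ECE and the EM-coverage curve, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names are hypothetical, the equal-width binning with ten bins is an assumed (though common) choice, and the abstract does not say how the ECE constraint enters training. The sketch only shows how the two quantities could be computed from per-question confidences and correctness labels.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE (assumed binning scheme, not taken from the paper).

    Returns the sample-weighted mean of |accuracy - mean confidence| over bins.
    """
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Assign each prediction to a bin by its confidence in [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes to the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


def em_coverage_curve(
    confidences: Sequence[float], exact_match: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (coverage, EM on the answered subset) points.

    Questions are answered in order of decreasing confidence, so lower
    coverage corresponds to abstaining on the least confident predictions.
    """
    if len(confidences) != len(exact_match) or len(confidences) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    points: List[Tuple[float, float]] = []
    hits = 0
    for k, i in enumerate(order, start=1):
        hits += 1 if exact_match[i] else 0
        points.append((k / len(order), hits / k))
    return points
```

An F1-coverage curve follows the same pattern, with the per-question F1 score accumulated in place of the exact-match indicator. A well-calibrated system should show EM falling as coverage increases, because the predictions it is least sure of are added last.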

ExpMRC: Explainability Evaluation for Machine Reading Comprehension

PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales

Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models

Unsupervised Explanation Generation for Machine Reading Comprehension

Teaching Machines to Read, Answer and Explain

Feeding What You Need by Understanding What You Learned

Understanding Attention in Machine Reading Comprehension

Evidence Sentence Extraction for Machine Reading Comprehension

An In-depth Interactive and Visualized Platform for Evaluating and Analyzing MRC Models

REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training

From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader

A Sentence Quality Evaluation Framework for Machine Reading Comprehension Incorporating Pre-trained Language Model

A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction

mPMR: A Multilingual Pre-trained Machine Reader at Scale

Verification mechanism to obtain an elaborate answer span in machine reading comprehension

Trustworthy machine reading comprehension with conditional adversarial calibration

Teaching Machine Comprehension with Compositional Explanations

An Interpretable Model Using Evidence Information for Multi-Hop Question Answering Over Long Texts

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

SciMRC: Multi-perspective Scientific Machine Reading Comprehension