Dual Head-wise Coattention Network for Machine Comprehension with Multiple-Choice Questions
Zhuang Liu, Kaiyu Huang, Degen Huang, Jun Zhao
DOI: https://doi.org/10.1145/3340531.3412013
2020-01-01
Abstract: Multiple-choice Machine Comprehension (MC) is an important and challenging natural language processing (NLP) task in which a machine must select the best answer from a set of candidate answers, given a passage and a question. Existing approaches either rely solely on powerful pre-trained language models or depend on an overly complicated matching network designed to capture the relationships among the triplet of passage, question, and candidate answers. In this paper, we present a novel architecture, the Dual Head-wise Coattention network (DHC), a simple and efficient attention neural network designed for the multiple-choice MC task. DHC not only supports a powerful pre-trained language model as its encoder, but also models the MC relationship directly as an attention mechanism, using a head-wise matching and aggregation method over multiple layers. This better models the relationship between question and passage and cooperates more efficiently with large pre-trained language models. To evaluate performance, we test the proposed model on five challenging and well-known multiple-choice MC datasets: RACE, DREAM, SemEval-2018 Task 11, OpenBookQA, and TOEFL. Extensive experimental results demonstrate that our proposal achieves a significant increase in accuracy over existing models on all five datasets, and that it consistently outperforms all tested baselines, including state-of-the-art techniques. More remarkably, our proposal is a pluggable and flexible model that can be plugged into any BERT-based pre-trained language model. Ablation studies further demonstrate its state-of-the-art performance and generalization.
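
To make the idea of head-wise matching between passage and question concrete, the following is a minimal NumPy sketch, not the DHC implementation. The abstract does not specify the layer details, so everything here is an assumption made for illustration: the function names (`headwise_attend`, `dual_coattention`), the absence of learned projections, the scaled dot-product scoring, and the max-pooling fusion are all hypothetical choices. It only shows the general pattern of splitting encoder outputs into heads and attending in both directions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(x, num_heads):
    # (length, dim) -> (num_heads, length, dim // num_heads)
    length, dim = x.shape
    return x.reshape(length, num_heads, dim // num_heads).transpose(1, 0, 2)

def merge_heads(x):
    # (num_heads, length, head_dim) -> (length, num_heads * head_dim)
    num_heads, length, head_dim = x.shape
    return x.transpose(1, 0, 2).reshape(length, num_heads * head_dim)

def headwise_attend(queries, keys, num_heads):
    # Each head of `queries` attends over the matching head of `keys`.
    # Assumption: no learned projections; the keys double as values.
    q = split_heads(queries, num_heads)
    k = split_heads(keys, num_heads)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return merge_heads(weights @ k), weights

def dual_coattention(passage, question, num_heads):
    # Attend in both directions, then max-pool and concatenate (assumed fusion).
    p_att, p_w = headwise_attend(passage, question, num_heads)
    q_att, q_w = headwise_attend(question, passage, num_heads)
    fused = np.concatenate([p_att.max(axis=0), q_att.max(axis=0)])
    return fused, p_w, q_w

# Toy inputs standing in for encoder outputs (hypothetical sizes).
rng = np.random.default_rng(0)
passage = rng.standard_normal((12, 16))   # 12 passage tokens, hidden size 16
question = rng.standard_normal((5, 16))   # 5 question tokens, hidden size 16
fused, p_w, q_w = dual_coattention(passage, question, num_heads=4)
```

In a real system the inputs would be hidden states from a pre-trained encoder such as BERT, and the fused vector would feed a classifier that scores each candidate answer; those parts are omitted here.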