Accurate, Diverse and Multiple Distractor Generation with Mixture of Experts.

Fanyi Qu, Che Wang, Yunfang Wu
DOI: https://doi.org/10.1007/978-3-031-44693-1_59
2023-01-01
Abstract: Given a background passage, a question and an answer, Distractor Generation (DG) aims to generate several incorrect options to confuse readers, and is an essential component in building multiple-choice question data. Most existing works apply naive methods to obtain multiple outputs at the expense of generation quality. In this paper, we propose an end-to-end one-to-many generation structure with mixture of experts (MoE) for DG, and explore how different data-to-expert routing strategies of MoE influence the performance of one-to-many generation models. Concretely, the model’s encoder calculates token-level attention vectors to mark important tokens in the source sequence, and the decoder generates multiple results guided by this local attention. Moreover, we propose a minimal loss assignment mechanism and a stable routing strategy for diverse generation. Experimental results demonstrate that our proposed method is able to generate multiple distractors with good interpretability, greatly outperforms the existing state-of-the-art DG models in quality, and achieves satisfactory lexical and semantic diversity.
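The abstract does not spell out how the minimal loss assignment mechanism works. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is a hard assignment in which each training target is routed to whichever expert currently incurs the lowest loss on it, and only that loss contributes to the objective. The function name `minimal_loss_assignment` and the `losses[i][k]` layout (target `i`, expert `k`) are assumptions introduced for this example.

```python
from typing import List, Tuple


def minimal_loss_assignment(losses: List[List[float]]) -> Tuple[List[int], float]:
    """Assign each target to the expert with the lowest loss (assumed scheme).

    losses[i][k] is a hypothetical per-target, per-expert loss, e.g. the
    negative log-likelihood of reference distractor i under expert k.
    Returns the chosen expert index per target and the mean of the chosen losses.
    """
    if not losses or any(not row for row in losses):
        raise ValueError("losses must be a non-empty matrix")

    # For every target, pick the index of the expert with the smallest loss.
    # Ties resolve to the lowest expert index.
    assignments = [min(range(len(row)), key=row.__getitem__) for row in losses]

    # Only the selected expert's loss contributes to the training objective.
    total = sum(row[k] for row, k in zip(losses, assignments))
    return assignments, total / len(losses)


if __name__ == "__main__":
    # Two targets, three experts; the values are made up for illustration.
    toy_losses = [[2.0, 0.5, 1.2], [0.9, 1.4, 0.3]]
    experts, mean_loss = minimal_loss_assignment(toy_losses)
    print(experts, mean_loss)
```

In a real training loop the per-expert losses would come from the decoder, and gradients would flow only through the selected entries; the sketch above covers the selection step alone.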