Abstract: Recent advancements in Natural Language Processing (NLP) have impacted numerous sub-fields such as natural language generation, natural language inference, question answering, and more. However, in the field of question generation, the creation of distractors for multiple-choice questions (MCQ) remains a challenging task. In this work, we present a simple, generic framework for distractor generation using readily available Pre-trained Language Models (PLMs). Unlike previous methods, our framework relies solely on pre-trained language models and does not require additional training on specific datasets. Building upon previous research, we introduce a two-stage framework consisting of candidate generation and candidate selection. Our proposed distractor generation framework outperforms previous methods without the need for training or fine-tuning. Human evaluations confirm that our approach produces more effective and engaging distractors. The related codebase is publicly available at https://github.com/obss/disgem.
What problem does this paper attempt to address?
This paper addresses a problem in natural language processing (NLP): how to generate effective distractors for multiple-choice questions (MCQs). Specifically, it proposes a simple, general framework based on pre-trained language models (PLMs) that generates distractors without additional model training or fine-tuning. The framework aims to improve the quality of generated distractors, making them more challenging and engaging, so that questions better assess students' reading comprehension and reasoning abilities.
### Main contributions of the paper:
1. **Generate diverse distractors similar to the correct answer**: The method generates diverse distractors that are semantically similar but not identical to the correct answer, which increases the difficulty of short-form, extractive multiple-choice questions.
2. **Applicable to different domains and languages without specific training**: Because the method relies on off-the-shelf pre-trained language models, it can be applied to different domains and languages without extensive data collection or model training.
3. **Structured framework design**: This framework provides a structured generation process that can be easily extended and modified to meet different needs, promoting the adaptability and extensibility of the method.
### Specific problems solved:
- **Generate high-quality distractors**: Traditional distractor generation methods rely on techniques such as part-of-speech tagging, thesauri, and semantic analysis, and often require additional data and training. The framework proposed in this paper instead uses the contextual understanding of pre-trained language models to generate more natural and semantically coherent distractors.
- **Avoid generating distractors that are the same as the correct answer**: Using a natural language inference (NLI) model, the framework filters out candidate distractors whose meaning is the same as or similar to the correct answer, so that the final distractors do not invalidate the question.
- **Ensure diversity among distractors**: The framework also uses the NLI model to ensure that the generated distractors are not too similar to each other, which improves the overall quality of the questions.
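The two filtering ideas above can be illustrated with a minimal sketch. It assumes a hypothetical `entails(premise, hypothesis)` predicate standing in for an NLI model, and it assumes that entailment in either direction counts as "too similar"; neither detail is taken from the paper. The toy predicate compares normalized strings only so that the example runs without downloading a model.

```python
from typing import Callable, List


def select_distractors(
    answer: str,
    candidates: List[str],
    entails: Callable[[str, str], bool],
    k: int = 3,
) -> List[str]:
    """Keep up to k candidates that match neither the answer nor each other.

    `entails` is a hypothetical stand-in for an NLI model's entailment decision.
    """

    def equivalent(a: str, b: str) -> bool:
        # Assumption: entailment in either direction means "too similar".
        return entails(a, b) or entails(b, a)

    selected: List[str] = []
    for cand in candidates:  # candidates are assumed to be ranked best-first
        if equivalent(answer, cand):
            continue  # same meaning as the answer: would invalidate the question
        if any(equivalent(cand, kept) for kept in selected):
            continue  # too close to a distractor that was already chosen
        selected.append(cand)
        if len(selected) == k:
            break
    return selected


def toy_entails(premise: str, hypothesis: str) -> bool:
    # Toy stand-in only: a real NLI model judges meaning, not surface form.
    return premise.strip().lower() == hypothesis.strip().lower()
```

In practice `toy_entails` would be swapped for a call into an actual NLI classifier; keeping the predicate as a parameter lets the selection logic stay unchanged when the model changes.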
### Method overview:
1. **Candidate Set Generator (CSG)**: A pre-trained language model generates multiple candidate distractors. The words of the correct answer are replaced with mask tokens, and the model's masked language modeling (MLM) predictions for those positions supply the candidates.
2. **Distractor Selector (DS)**: A two-step screening process selects the final distractors from the candidate set. The first step uses the NLI model to exclude candidates with the same meaning as the correct answer; the second step ensures that the selected distractors are neither repeated nor excessively similar to one another.
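The masking step of the candidate generator can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the `[MASK]` string assumes a BERT-style model, using one mask per whitespace-separated answer word is a simplification, and `fill_mask` is a hypothetical callable that returns ranked replacements for the masked span (here replaced by a canned toy function).

```python
from typing import Callable, List

MASK = "[MASK]"  # assumption: BERT-style mask token; other models use other strings


def mask_answer(context: str, answer: str, n_masks: int = 1) -> str:
    """Replace the first occurrence of the answer span with n mask tokens."""
    start = context.find(answer)
    if start == -1:
        raise ValueError("answer span not found in context")
    masks = " ".join([MASK] * n_masks)
    return context[:start] + masks + context[start + len(answer):]


def generate_candidates(
    context: str,
    answer: str,
    fill_mask: Callable[[str], List[str]],
    top_k: int = 10,
) -> List[str]:
    """Mask the answer in its context and return the top-ranked fill-ins."""
    # Assumption: one mask per whitespace-separated word of the answer.
    masked = mask_answer(context, answer, n_masks=len(answer.split()))
    return fill_mask(masked)[:top_k]


def toy_fill_mask(masked_text: str) -> List[str]:
    # Hypothetical stand-in for a masked language model's ranked predictions.
    return ["London", "Berlin", "Rome"]
```

The candidates returned here are unfiltered; in the framework described above they would then pass through the selector before being used as distractors.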
### Experimental results:
- **LLM evaluation**: Experiments on the SQuAD dataset showed that the DisGeM method proposed in this paper significantly outperforms previous methods (such as CDGP) in generating high-quality distractors.
- **Human evaluation**: In a study with 30 human evaluators, the distractors generated by DisGeM were rated higher in quality and difficulty, and the evaluators' accuracy was relatively low, indicating that the generated distractors are more challenging.
In conclusion, this paper tackles the problem of generating effective distractors for multiple-choice questions by proposing a distractor-generation framework based on pre-trained language models, which makes the questions more challenging and improves their value for assessment.