Abstract: Question generation (QG) is a natural language processing (NLP) problem that aims to generate natural questions from a given sentence or paragraph. QG has many applications, especially in education. For example, QG can complement teachers’ efforts in creating assessment materials by automatically generating many related questions. QG can also be used to generate frequently asked question (FAQ) sets for businesses. Question answering (QA) can also benefit from QG: QG can enrich QA training datasets, improving the learning and performance of QA algorithms. However, most existing work and tools in QG are designed for English text. This paper presents the design of a web-based question generator for Chinese comprehension. The generator provides a user-friendly web interface for users to generate a set of wh-questions (i.e., what, who, when, where, why, and how) from a Chinese text, conditioned on a corresponding set of answer phrases. The web interface allows users to easily refine the answer phrases that the generator produces automatically. The underlying question generation is based on a transformer model trained on a dataset combined from three publicly available Chinese reading comprehension datasets, namely, DRUD, CMRC2017, and CMRC2018. Linguistic features such as part-of-speech (POS) tags and named entities from named-entity recognition (NER) are extracted from the text and, together with the original text and the answer phrases, are fed into a machine learning algorithm based on a pre-trained mT5 model. The generated questions and their answers are displayed in a user-friendly format, supplemented with the source sentence in the text used to generate each question. We expect the design of this web tool to provide insight into how Chinese question generation can be made easily accessible to users with low computer literacy.
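The input preparation described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the prompt layout (`answer: ... context: ... features: ...`), the function names `source_sentence` and `build_qg_input`, and the example tags are all hypothetical and are not taken from the paper. The sketch shows how a text, one answer phrase, and per-token POS/NER tags might be combined into a single string for a sequence-to-sequence model, and how the source sentence for an answer might be located for display.

```python
import re
from typing import List, Tuple


def source_sentence(text: str, answer: str) -> str:
    """Return the first sentence of `text` that contains `answer`, or "" if none does."""
    # Split after Chinese sentence-ending punctuation, keeping the punctuation mark.
    sentences = [s for s in re.split(r"(?<=[。！？])", text) if s]
    for sentence in sentences:
        if answer in sentence:
            return sentence
    return ""


def build_qg_input(text: str, answer: str,
                   tagged_tokens: List[Tuple[str, str, str]]) -> str:
    """Combine text, answer phrase, and (token, POS, NER) triples into one string.

    The layout is a hypothetical convention chosen for this sketch; the paper
    does not specify the exact input format given to the model.
    """
    features = " ".join(f"{tok}/{pos}/{ner}" for tok, pos, ner in tagged_tokens)
    return f"answer: {answer} context: {text} features: {features}"


# Hypothetical example data; the tags would come from a POS/NER tool in practice.
text = "李白是唐代著名诗人。他出生于公元701年。"
answer = "公元701年"
tagged = [("李白", "NR", "PERSON"), ("公元701年", "NT", "DATE")]

model_input = build_qg_input(text, answer, tagged)
evidence = source_sentence(text, answer)
```

In a full system, `model_input` would be tokenized and passed to the fine-tuned model, and `evidence` would be shown next to the generated question so that users can check it against the text.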

CSFQGD: Chinese Sentence Fill-in-the-blank Question Generation Dataset for Examination

SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners

Automatic Generation of Short Answer Questions for Reading Comprehension Assessment

EQG-RACE: Examination-Type Question Generation

CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

Qsnail: A Questionnaire Dataset for Sequential Question Generation

DT-QDC: A Dataset for Question Comprehension in Online Test

CALF: Benchmarking Evaluation of LFQA Using Chinese Examinations

An Automatic Question Generator for Chinese Comprehension

Difficulty Controllable Generation of Reading Comprehension Questions

A Study on the Validity of the Grammar Fill-in-the-blank Questions of the National English Paper for College Entrance Examination in 2020

Large-scale Cloze Test Dataset Created by Teachers

Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset

Towards Explainable Chinese Native Learner Essay Fluency Assessment: Dataset, Tasks, and Method

A Dataset of Open-Domain Question Answering with Multiple-Span Answers

A Method for Generating Course Test Questions Based on Natural Language Processing and Deep Learning

PsyQA: A Chinese Dataset for Generating Long Counseling Text for Mental Health Support

What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams

ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination

Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension

CGCE: A Chinese Generative Chat Evaluation Benchmark for General and Financial Domains