Large-Scale Simple Question Generation by Template-Based Seq2seq Learning.

Tianyu Liu, Bingzhen Wei, Baobao Chang, Zhifang Sui
DOI: https://doi.org/10.1007/978-3-319-73618-1_7
2017-01-01
Abstract: Numerous machine learning tasks have achieved substantial advances over the past decade with the help of large-scale supervised learning corpora. However, there is no large-scale question-answer corpus available for Chinese question answering over knowledge bases. In this paper, we present a 28M Chinese Q&A corpus based on the Chinese knowledge base provided by the NLPCC2017 KBQA challenge. We propose a novel neural network architecture that combines a template-based method with seq2seq learning to generate highly fluent and diverse questions. Both automatic and human evaluation results show that our model achieves outstanding performance (76.8 BLEU and 43.1 ROUGE). We also propose a new statistical metric called DIVERSE to measure the linguistic diversity of generated questions and show that our model generates much more diverse questions than other baselines.