Data-driven Natural Language Generation: Paving the Road to Success

Jekaterina Novikova, Ondřej Dušek, Verena Rieser
DOI: https://doi.org/10.48550/arXiv.1706.09433
2017-06-28
Computation and Language
Abstract: We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) the lack of reliable automatic evaluation metrics for NLG, and (b) the scarcity of high-quality in-domain corpora. We address the first problem by thoroughly analysing current evaluation metrics and motivating the need for a new, more reliable metric. The second problem is addressed by presenting a novel framework for developing and evaluating a high-quality corpus for NLG training.