Problems and Countermeasures in Natural Language Processing Evaluation

Qingxiu Dong, Zhifang Sui, Weidong Zhan, Baobao Chang
DOI: https://doi.org/10.48550/arXiv.2104.09712
2021-04-20
Computation and Language
Abstract: Evaluation in natural language processing guides and promotes research on models and methods. In recent years, new evaluation data sets and evaluation tasks have been continuously proposed. At the same time, a series of problems exposed by existing evaluations has restricted the progress of natural language processing technology. Starting from the concept, composition, development and meaning of natural language evaluation, this article classifies and summarizes the tasks and characteristics of mainstream natural language evaluation, and then summarizes the problems of natural language processing evaluation and their causes. Finally, drawing on standards for evaluating human language ability, this article puts forward the concept of human-like machine language ability evaluation and proposes a series of basic principles and implementation ideas for it from three aspects: reliability, difficulty and validity.