Evaluation Metrics for Text Data Augmentation in NLP

Marcellus Amadeus, William Alberto Cruz Castañeda
2024-02-10
Abstract: Recent surveys on data augmentation for natural language processing have reported different techniques and advancements in the field. Several frameworks, tools, and repositories promote the implementation of text data augmentation pipelines. However, the lack of evaluation criteria and standards for method comparison, caused by differing tasks, metrics, datasets, architectures, and experimental settings, makes comparisons meaningless. Methods also lack unification, and text data augmentation research would benefit from unified metrics to compare different augmentation methods. Thus, academia and industry seek relevant evaluation metrics for text data augmentation techniques. The contribution of this work is to provide a taxonomy of evaluation metrics for text augmentation methods and serve as a direction for a unified benchmark. The proposed taxonomy organizes categories that include tools for implementation and metrics calculation. Finally, with this study, we intend to present opportunities to explore the unification and standardization of text data augmentation metrics.
Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the lack of unified evaluation standards in the field of Text Data Augmentation (TDA). Specifically:

1. **Lack of Evaluation Standards**: Because studies differ in tasks, metrics, datasets, architectures, and experimental settings, current text data augmentation methods are difficult to compare meaningfully.
2. **Insufficient Method Uniformity**: Existing text data augmentation techniques lack unified standards, which makes it hard for researchers to determine which method is more effective.

To address these problems, the paper proposes a taxonomy of evaluation metrics for text data augmentation methods, intended as a direction for a unified benchmark. The taxonomy's categories include tools for implementing the methods and calculating the metrics. With this study, the authors aim to present opportunities to explore the unification and standardization of text data augmentation metrics.