From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis

Stanisław Woźniak, Jan Kocoń
DOI: https://doi.org/10.48550/arXiv.2312.04720
2023-12-08
Abstract: In the era of artificial intelligence, data is gold but costly to annotate. The paper demonstrates a groundbreaking solution to this dilemma using ChatGPT for text augmentation in sentiment analysis. We leverage ChatGPT's generative capabilities to create synthetic training data that significantly improves the performance of smaller models, making them competitive with, or even outperforming, their larger counterparts. This innovation enables models to be both efficient and effective, thereby reducing computational cost, inference time, and memory usage without compromising on quality. Our work marks a key advancement in the cost-effective development and deployment of robust sentiment analysis models.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses four main problems:

1. **High cost of data annotation**: Data is a precious resource in AI, but collecting and annotating it at scale is expensive and time-consuming, which makes it a bottleneck in developing and deploying AI systems. The paper proposes using ChatGPT for text augmentation to generate synthetic training data, reducing the dependence on large amounts of annotated data.
2. **Performance of small-scale models**: The paper explores how synthetic data generated by ChatGPT can improve small-scale models on sentiment analysis tasks, so that they match or even surpass large, resource-intensive models. Because the smaller model is the one deployed, this also reduces computational cost, inference time and memory usage.
3. **Limitations of data augmentation techniques**: Traditional data augmentation usually relies on simple heuristics or rule-based systems, which have a limited scope of application and often yield weak improvements. The paper proposes using large language models such as ChatGPT as an engine for generating high-quality synthetic data, overcoming these limitations.
4. **Balance between model efficiency and performance**: The paper tests experimentally whether ChatGPT-augmented data improves small-scale models, and examines whether the same augmentation benefits models of different scales, from smaller to larger architectures.

Specifically, the paper approaches these problems in the following ways:

- **Using ChatGPT for text augmentation**: Different prompts are designed to generate either paraphrases of the original data or completely new texts, while keeping the original sentiment unchanged.
- **Experimental verification**: Extensive experiments on two sentiment analysis datasets (PerSenT and MultiEmo) with three Transformer models (RoBERTa-small, RoBERTa-base and XtremeDistil) evaluate the effects of different data augmentation strategies.
- **Performance metrics**: Accuracy, macro F1 and gain are used to evaluate how much the models improve when trained on the augmented data.

In conclusion, the paper aims to provide an efficient, economical and effective solution that uses ChatGPT's generative ability to address the high cost of data annotation and the insufficient performance of small-scale models.
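
The prompt-based augmentation and the evaluation metrics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names `build_prompt`, `accuracy`, `macro_f1` and `gain`, and the definition of gain as a percentage-point difference between an augmented and a baseline score are all assumptions made for illustration. The call to a chat model is deliberately left out, so the sketch only shows how a prompt for each augmentation mode might be assembled and how the scores could be compared.

```python
# Hypothetical prompt templates; the paper's actual prompts are not reproduced here.
PROMPTS = {
    # Mode 1: paraphrase the original text, preserving its sentiment.
    "paraphrase": (
        "Rewrite the following text in different words. "
        "Keep its {label} sentiment unchanged.\n\nText: {text}"
    ),
    # Mode 2: write a completely new text with the same sentiment.
    "generate": (
        "Write a new text on a similar topic to the one below. "
        "It must express {label} sentiment.\n\nText: {text}"
    ),
}


def build_prompt(text, label, mode="paraphrase"):
    """Fill the chosen template; the result would be sent to a chat model."""
    return PROMPTS[mode].format(text=text, label=label)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def gain(augmented_score, baseline_score):
    """Assumed definition: improvement over the baseline in percentage points."""
    return 100.0 * (augmented_score - baseline_score)


if __name__ == "__main__":
    print(build_prompt("The service was slow.", "negative"))

    # Toy labels and predictions, invented purely to exercise the functions.
    y_true = ["pos", "neg", "neu", "pos"]
    baseline_pred = ["pos", "pos", "neu", "neg"]
    augmented_pred = ["pos", "neg", "neu", "neg"]
    print(gain(macro_f1(y_true, augmented_pred), macro_f1(y_true, baseline_pred)))
```

In a real experiment the predictions would come from a model fine-tuned once on the original training set and once on the original set plus the ChatGPT-generated texts; the toy lists above only stand in for those outputs.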