Dynamic Adaptive Optimization for Effective Sentiment Analysis Fine-Tuning on Large Language Models

Hongcheng Ding, Xuanze Zhao, Shamsul Nahar Abdullah, Deshinta Arrova Dewi, Zixiao Jiang
2024-08-16
Abstract: Sentiment analysis plays a crucial role in various domains, such as business intelligence and financial forecasting. Large language models (LLMs) have become a popular paradigm for sentiment analysis, leveraging multi-task learning to address specific tasks concurrently. However, LLMs fine-tuned for sentiment analysis often underperform due to the inherent challenges in managing diverse task complexities. Moreover, constant-weight approaches in multi-task learning struggle to adapt to variations in data characteristics, further complicating model effectiveness. To address these issues, we propose a novel multi-task learning framework with a dynamic adaptive optimization (DAO) module. This module is designed as a plug-and-play component that can be seamlessly integrated into existing models, providing an effective and flexible solution for multi-task learning. The key component of the DAO module is dynamic adaptive loss, which dynamically adjusts the weights assigned to different tasks based on their relative importance and data characteristics during training. Sentiment analyses on a standard and customized financial text dataset demonstrate that the proposed framework achieves superior performance. Specifically, this work improves the Mean Squared Error (MSE) and Accuracy (ACC) by 15.58% and 1.24% respectively, compared with previous work.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to address performance issues encountered when fine-tuning large language models (LLMs) for sentiment analysis, particularly when using multi-task learning (MTL) methods. Specifically, the paper focuses on the following key issues:

1. **Task Complexity Management in Multi-Task Learning**:
   - LLMs often perform poorly in sentiment analysis due to varying complexities of different tasks.
   - Common fixed-weight methods in multi-task learning struggle to adapt to changes in data characteristics, further affecting model effectiveness.
2. **Data Distribution Imbalance**:
   - Sample distribution in each training batch can be highly uneven, leading to overfitting or underfitting for certain categories, thus impacting overall performance and generalization ability.
3. **Differences in Task Difficulty**:
   - Loss functions for regression and classification tasks can differ significantly in magnitude, causing one task to dominate the training process and hindering effective learning and generalization across all tasks.

To address these challenges, the authors propose a new multi-task learning framework that introduces a Dynamic Adaptive Optimization (DAO) module. This module dynamically adjusts the weights of different tasks, solving the aforementioned issues and improving model performance in sentiment analysis tasks.

### Main Contributions

1. **Identifying Performance Gaps**:
   - The paper identifies performance gaps when using fixed loss weights for fine-tuning LLMs in the financial domain with multi-task learning.
2. **Designing the DAO Module**:
   - The authors design a plug-and-play DAO module that dynamically adjusts the weights of different tasks based on the relevance and importance of each batch of data and the difficulty of the tasks.
3. **Reducing Fine-Tuning Computational Overhead**:
   - By using LoRA technology to fine-tune the RoBERTa-Large model with different parameters, the computational overhead of fine-tuning is reduced without introducing inference latency.
4. **Performance Improvement**:
   - Experimental results show that the proposed framework improves mean squared error (MSE) and accuracy (ACC) by 15.58% and 1.24%, respectively, outperforming existing methods.

### Motivation

In the task of sentiment polarity analysis of exchange rate texts, the authors found that the RoBERTa-Large model fine-tuned on standard and customized financial text datasets performed poorly. This is mainly because the model lacks exposure to domain-specific texts in the news domain, which contain specialized terminology, implicit sentiments, and subtle variations. Using the Twitter-RoBERTa-Large model fine-tuned on a Twitter news dataset improved performance, but there is still room for improvement.

### Methodology

1. **Traditional Sentiment Analysis**:
   - Using RoBERTa-Large as an embedding generator for text data, combined with task-specific heads for sentiment analysis.
2. **Multi-Task Learning**:
   - Introducing a combination of classification and regression tasks to better capture sentiment tendencies in the text and alleviate data sparsity issues.
3. **Dynamic Adaptive Optimization (DAO)**:
   - Balancing the contributions of different tasks through a gradient-weighting method and addressing data distribution imbalance with regularization terms and category-specific weights.
4. **LoRA Technology**:
   - Injecting low-rank decomposition matrices into each layer to reduce the number of parameters while maintaining performance.

### Evaluation

The paper conducts experiments using standard and customized financial text datasets to validate the effectiveness of the proposed framework. Experimental results demonstrate that the combination of the DAO module and LoRA technology significantly enhances model performance in sentiment analysis tasks.
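
The summary describes dynamic task weighting and per-batch class weights only at a high level, so the following is a minimal sketch of the general idea rather than the paper's actual DAO formulation. It assumes a simple inverse-loss weighting for the two tasks and inverse-frequency class weights computed per batch. The function names (`dynamic_task_weights`, `inverse_frequency_class_weights`, `combined_loss`) are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_class_weights(labels, num_classes):
    """Per-batch class weights: classes that are rare in this batch get larger weights.

    Assumption for illustration; the paper's category-specific weights may differ.
    Classes absent from the batch get weight 0.0 (they contribute no samples anyway).
    """
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross-entropy, normalized by the sum of the sample weights."""
    total, norm = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(max(p[y], 1e-12))
        norm += w
    return total / norm


def mean_squared_error(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def dynamic_task_weights(losses, eps=1e-8):
    """Weights inversely proportional to each task's current loss, summing to 1.

    Each weighted loss w_i * L_i then has (almost) the same magnitude, so a task
    with a numerically large loss cannot dominate. In a real training loop these
    weights would be treated as constants (no gradient flows through them).
    """
    inverse = [1.0 / (loss + eps) for loss in losses]
    total = sum(inverse)
    return [v / total for v in inverse]


def combined_loss(cls_probs, cls_labels, reg_preds, reg_targets, num_classes):
    """Recompute class and task weights for one batch and return the weighted sum."""
    class_weights = inverse_frequency_class_weights(cls_labels, num_classes)
    l_cls = weighted_cross_entropy(cls_probs, cls_labels, class_weights)
    l_reg = mean_squared_error(reg_preds, reg_targets)
    w_cls, w_reg = dynamic_task_weights([l_cls, l_reg])
    return w_cls * l_cls + w_reg * l_reg, (w_cls, w_reg)


if __name__ == "__main__":
    # Toy batch: 3-class sentiment probabilities plus a sentiment-score regression.
    probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1]]
    labels = [0, 1, 0]
    loss, weights = combined_loss(probs, labels, [0.5, -0.5, 0.2], [1.0, -1.0, 0.0], 3)
    print(f"loss={loss:.4f}, task weights={weights}")
```

Because the weights are recomputed for every batch, they follow changes in loss magnitude and label distribution during training, which a fixed-weight scheme cannot do. A gradient-based variant, closer to the gradient-weighting method the summary mentions, would replace the loss values in `dynamic_task_weights` with per-task gradient norms.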
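
The LoRA idea mentioned under Methodology can also be sketched in a few lines. The example below is a toy, pure-Python illustration of a low-rank update `W + (alpha / r) * B A` on a single linear layer, not the configuration used in the paper. The class name `LoRALinear` and the chosen dimensions, rank, and `alpha` are assumptions made for illustration. It shows why merging the update into the frozen weight adds no inference latency.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in left]


class LoRALinear:
    """A frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight                      # frozen pretrained weight
        self.scale = alpha / rank                 # standard LoRA scaling factor
        # A starts as small Gaussian noise and B as zeros, so the update is zero
        # at initialization and the layer initially matches the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        """Compute W x + scale * B (A x) without forming the full update matrix."""
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def merged_weight(self):
        """Fold the update into W; inference then uses one ordinary matrix."""
        delta = matmul(self.B, self.A)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def trainable_parameters(self):
        """Number of entries in A and B, the only parameters that are trained."""
        rank, d_in, d_out = len(self.A), len(self.A[0]), len(self.B)
        return rank * (d_in + d_out)


if __name__ == "__main__":
    d = 8
    rng = random.Random(1)
    weight = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
    layer = LoRALinear(weight, rank=2, alpha=4)
    print("full parameters:", d * d, "trainable with LoRA:", layer.trainable_parameters())
```

With a square layer of size `d` and rank `r`, the trainable parameter count drops from `d * d` to `2 * r * d`, and the saving grows with `d`. For a model at RoBERTa-Large scale, a library implementation would normally be used instead of hand-written matrices.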