Vietnamese Sentiment Analysis: An Overview and Comparative Study of Fine-tuning Pretrained Language Models

Dang Van Thin, Duong Ngoc Hao, Ngan Luu-Thuy Nguyen
DOI: https://doi.org/10.1145/3589131
IF: 1.471
2023-04-04
ACM Transactions on Asian and Low-Resource Language Information Processing
Abstract: Sentiment Analysis (SA) is one of the most active research areas in Natural Language Processing (NLP) because of its potential for business and society. With the development of language representation models, numerous methods that fine-tune pre-trained language models have shown promising results on downstream NLP tasks. For Vietnamese, many pre-trained language models have been released, both monolingual and multilingual. However, these models differ in architecture, pre-training data, and pre-processing steps; consequently, fine-tuning them can be expected to yield different levels of effectiveness. In addition, no study to date has evaluated these models on the same datasets for the SA task. This paper presents a fine-tuning approach to investigate the performance of different pre-trained language models on the Vietnamese SA task. The experimental results show that the monolingual PhoBERT and ViT5 models outperform previous studies and set new state-of-the-art (SOTA) results on five benchmark Vietnamese SA datasets. To the best of our knowledge, this study is the first attempt to investigate the performance of fine-tuning Transformer-based models on five datasets of different domains and sizes for the Vietnamese SA task.
computer science, artificial intelligence