Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework

Ha Le, Bao Tran, Phuong Le, Tan Nguyen, Dac Nguyen, Ngoan Pham, Dang Huynh
2024-01-02
Abstract: Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that solves three sequential sub-tasks: (i) identifying comparative sentences, i.e., whether a sentence has a comparative meaning; (ii) extracting comparative elements, i.e., the comparison subjects, objects, aspects, and predicates; and (iii) classifying comparison types, which contributes to a deeper comprehension of user sentiments in Vietnamese product reviews. Our method ranked fifth at the Vietnamese Language and Speech Processing (VLSP) 2023 challenge on Comparative Opinion Mining (ComOM) from Vietnamese Product Reviews.
Computation and Language
What problem does this paper attempt to address?
The paper addresses comparative opinion mining in Vietnamese product reviews. The research team proposes a sequential classification framework with three sub-tasks:

1. **Identifying Comparative Sentences**: Determining whether a sentence has a comparative meaning.
2. **Extracting Comparative Elements**: Extracting the subject, object, aspect, and predicate of the comparison from the sentence.
3. **Classifying Comparison Types**: Classifying the type of each comparative sentence to better understand the sentiment users express in product reviews.

The method builds on pre-trained models such as PhoBERT, Electra, and multilingual BERT, and uses data augmentation techniques to improve data quality. Because Vietnamese is a low-resource language for natural language processing, the method draws on several pre-trained models and supports model ensemble techniques. The research team placed 5th in the comparative opinion mining task at the 2023 Vietnamese Language and Speech Processing (VLSP) Challenge.

Experimental results show that, across different versions of the dataset, upsampling and bootstrapping significantly improved the model's accuracy on minority classes. Future work includes improving the named entity recognition (NER) input tensors to better handle cases where a word serves as both subject and object in the same sentence.
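
The summary does not spell out how upsampling and bootstrapping were applied, so the following is only a minimal sketch of one common interpretation: each minority class is resampled with replacement until it matches the size of the largest class. The function name `upsample_minority`, the toy sentences, and the label names are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def upsample_minority(examples, labels, seed=0):
    """Bootstrap-resample each class up to the size of the largest class.

    Assumption: this balancing rule is illustrative only; the paper's exact
    resampling procedure is not described in the summary.
    """
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_label = {}
    for text, label in zip(examples, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_examples, out_labels = [], []
    for label, items in by_label.items():
        # Keep all originals, then draw the shortfall with replacement.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_examples.append(text)
            out_labels.append(label)
    return out_examples, out_labels


# Hypothetical toy data: four non-comparative sentences, one comparative.
sentences = ["s1", "s2", "s3", "s4", "s5"]
labels = ["non-comparative"] * 4 + ["comparative"]

balanced_x, balanced_y = upsample_minority(sentences, labels)
counts = Counter(balanced_y)
```

In this toy run both classes end up with the same number of examples, and the extra comparative examples are duplicates of the single original one. Resampling of this kind should be applied to the training split only, so that duplicated sentences do not leak into evaluation data.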