FMCF: Few-shot Multimodal Aspect-Based Sentiment Analysis Framework Based on Contrastive Finetuning

Yongping Du, Runfeng Xie, Bochao Zhang, Zihao Yin
DOI: https://doi.org/10.1007/s10489-024-05841-z
IF: 5.3
2024-01-01
Applied Intelligence
Abstract: Multimodal aspect-based sentiment analysis (MABSA) aims to predict the sentiment of an aspect by fusing different modalities such as image and text. However, the availability of high-quality multimodal data remains limited, which makes few-shot MABSA a new challenge. Previous works are rarely able to cope with low-resource and few-shot scenarios. To address these problems, we design a Few-shot Multimodal aspect-based sentiment analysis framework based on Contrastive Finetuning (FMCF). First, the image modality is transformed into a corresponding textual caption to obtain the semantic information it entails, and a contrastive dataset is constructed by similarity retrieval for finetuning in the following stage. Then, a sentence encoder based on SBERT is trained, combining supervised contrastive learning and sentence-level multi-feature fusion to complete MABSA. Experiments demonstrate that our framework achieves excellent performance in few-shot scenarios. Importantly, with only 256 training samples and limited computational resources, the proposed method outperforms fine-tuned models that use all available data on the Twitter dataset.
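The abstract does not spell out how the contrastive dataset is built, so the following is only a minimal sketch of one plausible reading of "constructed by similarity retrieval", not the authors' implementation. It assumes each sample (text, image caption and aspect) has already been embedded as a vector, and it uses a hypothetical rule: for every anchor, the most similar same-label sample becomes a positive pair and the most similar different-label sample becomes a hard negative pair. The function names and the toy vectors are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def build_contrastive_pairs(
    embeddings: List[Sequence[float]], labels: List[str]
) -> List[Tuple[int, int, int]]:
    """Return (anchor, other, target) index triples.

    target is 1 when both samples share a sentiment label, 0 otherwise.
    Assumed rule (not taken from the paper): per anchor, keep the nearest
    same-label sample and the nearest different-label sample.
    """
    pairs = []
    for i, (emb, lab) in enumerate(zip(embeddings, labels)):
        # Rank every other sample by similarity to the anchor, highest first.
        ranked = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine(emb, embeddings[j]),
            reverse=True,
        )
        pos = next((j for j in ranked if labels[j] == lab), None)
        neg = next((j for j in ranked if labels[j] != lab), None)
        if pos is not None:
            pairs.append((i, pos, 1))
        if neg is not None:
            pairs.append((i, neg, 0))
    return pairs


# Toy 2-D vectors standing in for real sentence embeddings (assumption).
toy_embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
toy_labels = ["positive", "positive", "negative", "negative"]
pairs = build_contrastive_pairs(toy_embeddings, toy_labels)
```

Pairs of this form could then serve as training examples for a contrastive objective on a sentence encoder. In a few-shot setting, generating pairs from a small labelled set multiplies the number of training signals available from the same samples.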
What problem does this paper attempt to address?