Dual-Perspective Fusion Network for Aspect-Based Multimodal Sentiment Analysis

Di Wang, Changning Tian, Xiao Liang, Lin Zhao, Lihuo He, Quan Wang
DOI: https://doi.org/10.1109/tmm.2023.3321435
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Aspect-based multimodal sentiment analysis (ABMSA) is an important sentiment analysis task that analyzes aspect-specific sentiment in data with different modalities (usually text and images). Previous works usually ignore the overall sentiment tendency when analyzing the sentiment of each aspect term, even though the overall sentiment tendency is highly correlated with aspect-specific sentiment. In addition, existing methods do not fully explore or exploit the fine-grained multimodal information closely related to aspect terms. To address these limitations, we propose a dual-perspective fusion network (DPFN) that considers both global and local fine-grained sentiment information in multimodal data. From the global perspective, we use text-image caption pairs to obtain a global representation containing information about the overall sentiment tendency. From the local fine-grained perspective, we construct two graph structures to explore the fine-grained information in texts and images. Finally, aspect-level sentiment polarities are obtained by analyzing the combination of global and local fine-grained sentiment information. Experimental results on two multimodal Twitter datasets show that the proposed DPFN model outperforms state-of-the-art methods. The source code is publicly available at https://github.com/cntian0/DPFN.
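The abstract describes combining a global representation with local, graph-derived features before predicting an aspect's polarity. The toy sketch below illustrates that general idea only and is not the authors' DPFN implementation: the function names (`graph_aggregate`, `fuse_and_classify`), the single round of neighbor averaging, the concatenation-based fusion, the three-class output, and all numeric values are assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of logits.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def graph_aggregate(node_feats, adjacency):
    # Assumed stand-in for a graph layer: each node becomes the mean of
    # itself and its neighbors (one round of message passing).
    out = []
    for i, row in enumerate(adjacency):
        neighbors = [j for j, linked in enumerate(row) if linked or j == i]
        dim = len(node_feats[i])
        out.append([
            sum(node_feats[j][d] for j in neighbors) / len(neighbors)
            for d in range(dim)
        ])
    return out

def fuse_and_classify(global_vec, local_vec, weights, bias):
    # Assumed fusion: concatenate the global and local vectors,
    # then apply a linear layer followed by softmax.
    fused = global_vec + local_vec
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)

# Hypothetical inputs: three graph nodes (e.g. tokens), aspect term at node 1.
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
aspect_index = 1

local_feats = graph_aggregate(node_feats, adjacency)
local_vec = local_feats[aspect_index]

# Hypothetical global vector, e.g. from a text plus image-caption encoder.
global_vec = [0.2, -0.4]

# Hypothetical classifier parameters for three polarities
# (negative, neutral, positive); a trained model would learn these.
weights = [
    [0.5, -0.3, 0.1, 0.2],
    [0.0, 0.1, 0.0, 0.1],
    [-0.5, 0.3, 0.4, 0.6],
]
bias = [0.0, 0.0, 0.0]

probs = fuse_and_classify(global_vec, local_vec, weights, bias)
print(dict(zip(["negative", "neutral", "positive"], probs)))
```

The sketch keeps the two perspectives separate until the final step, so either side could be swapped for a real encoder without changing the fusion code. The linked repository holds the actual model.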