Co-attention Guided Local-Global Feature Fusion for Aspect-Level Multimodal Sentiment Analysis.

Guoyong Cai, Shunjie Wang, Guangrui Lv
DOI: https://doi.org/10.1007/978-981-99-8429-9_30
2024-01-01
Abstract: Aspect-level multimodal sentiment analysis is a target-oriented, fine-grained sentiment analysis task that aims to determine the sentiment polarity of a given aspect in a sentence, using the accompanying multimodal data. Multimodal alignment and fusion remain a challenge for this task, and this paper addresses it by modeling local inter-modal interactions. To that end, it proposes a co-attention guided local-global feature fusion (CLGFF) method. CLGFF mines both aspect-guided global multimodal features and local fine-grained alignments between modalities, then fuses them to better exploit global-local semantic correlation. Extensive experiments on two aspect-level multimodal sentiment datasets compare CLGFF with a series of other methods. The results show that CLGFF better captures the local semantic correlation within each modality and the fine-grained consistency between modalities, thereby improving performance on aspect-level multimodal sentiment analysis.
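The abstract gives no architectural details, so the following is only a minimal sketch of the general idea it describes: an aspect-guided global branch, a co-attention local branch, and a fusion step. All function names, the scaled dot-product scoring, the mean pooling, and the concatenation-based fusion are assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def attend(query, keys):
    # Scaled dot-product attention of one query over a list of key vectors.
    # Returns the attended vector and the attention weights.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return weighted_sum(weights, keys), weights

def mean_pool(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def fuse_local_global(aspect, text_tokens, image_regions):
    # Global branch (assumed form): the aspect vector attends over each modality.
    global_text, _ = attend(aspect, text_tokens)
    global_image, _ = attend(aspect, image_regions)

    # Local branch (assumed form): co-attention, where every text token attends
    # over image regions and every image region attends over text tokens.
    text_aligned = [attend(t, image_regions)[0] for t in text_tokens]
    image_aligned = [attend(r, text_tokens)[0] for r in image_regions]
    local_text = mean_pool(text_aligned)
    local_image = mean_pool(image_aligned)

    # Fusion (assumed form): plain concatenation of global and local features.
    return global_text + global_image + local_text + local_image

# Toy inputs: 2-dimensional features for one aspect, two tokens, three regions.
aspect = [1.0, 0.0]
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_regions = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused = fuse_local_global(aspect, text_tokens, image_regions)
```

In this sketch the fused vector would feed a sentiment classifier; a trained model would use learned projections rather than raw dot products.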