Interactive Fusion Network with Recurrent Attention for Multimodal Aspect-based Sentiment Analysis.

Jun Wang, Qianlong Wang, Zhiyuan Wen, Xingwei Liang, Ruifeng Xu
DOI: https://doi.org/10.1007/978-3-031-20503-3_24
2022-01-01
Abstract: The goal of multimodal aspect-based sentiment analysis is to comprehensively utilize data from different modalities (e.g., text and image) to identify aspect-specific sentiment polarity. Existing works have proposed many methods for fusing text and image information and have achieved satisfactory results. However, they fail to filter noise in the image information and ignore the progressive learning process of sentiment features. To solve these problems, we propose an interactive fusion network with recurrent attention. Specifically, we first use two encoders to encode the text and image data, respectively. Then we use an attention mechanism to obtain the semantic information of the image at the token level. Next, we employ a GRU to filter out the noise in the image and fuse information from the different modalities. Finally, we design a decoder with recurrent attention to progressively learn aspect-specific sentiment features for classification. The results on two Twitter datasets show that our method outperforms all baselines.
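The pipeline outlined in the abstract (token-level attention over the image, gated fusion, then recurrent attention) can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the abstract gives no equations, so every function name here (`attend`, `gated_fuse`, `recurrent_attention`) is hypothetical, the random vectors stand in for encoder outputs, the parameter-free sigmoid gate is only a stand-in for the learned GRU, and the averaging update in the decoder loop is an assumed simplification.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Scaled dot-product attention: weighted average of `keys` for one query."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return [sum(w * k[d] for w, k in zip(weights, keys))
            for d in range(len(keys[0]))]


def gated_fuse(token, image_ctx):
    """Assumed GRU-style update gate without learned weights.

    z in (0, 1) controls how much image context is admitted, so image
    context that disagrees with the token is down-weighted.
    """
    z = 1.0 / (1.0 + math.exp(-dot(token, image_ctx)))
    return [(1.0 - z) * t + z * c for t, c in zip(token, image_ctx)]


def recurrent_attention(aspect, fused, steps=3):
    """Repeatedly re-attend over the fused sequence, refining the aspect state."""
    state = list(aspect)
    for _ in range(steps):
        ctx = attend(state, fused)
        # Assumed update rule: average the old state with the new context.
        state = [0.5 * s + 0.5 * c for s, c in zip(state, ctx)]
    return state


# Random stand-ins for encoder outputs: 5 text tokens, 3 image regions.
random.seed(0)
dim = 4
tokens = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
regions = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]
aspect = tokens[1]  # pretend the second token is the aspect term

image_ctx = [attend(t, regions) for t in tokens]            # token-level image semantics
fused = [gated_fuse(t, c) for t, c in zip(tokens, image_ctx)]
features = recurrent_attention(aspect, fused)               # input to a classifier
```

In a trained model the gate and the state update would carry learned parameters, and `features` would feed a softmax classifier over the sentiment polarities; the sketch only shows how the three stages connect.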