Multimodal Sentiment Analysis Based on Information Bottleneck and Attention Mechanisms

Zhendong Wu, Rong Han, Yu Tang, Xiangrui Chen
DOI: https://doi.org/10.1109/CBASE60015.2023.10439124
2023-11-03
Abstract: Existing multimodal sentiment analysis models lack efficient methods for handling data noise and information redundancy, so a large amount of task-irrelevant noise arises downstream during modal fusion. In addition, information from the individual modalities is insufficiently fused, and there is little intermodal interaction. To address these issues, this paper proposes a multimodal sentiment analysis network model based on information bottleneck and cross-attention mechanisms (IBCAM). Building on the information bottleneck principle, the model introduces a mutual information optimizer that filters the noise and redundant information present in the various modal data, including text, image, and audio. A cross-attention mechanism and a gated weight adjustment module then guide the multimodal feature input to capture common information, strengthen data association, align sentiment features, and optimize the multimodal representation vectors, thereby improving the accuracy of sentiment analysis. Comparative experiments on two open datasets, CMU-MOSI and CMU-MOSEI, show that the proposed model outperforms several current baseline models.
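The abstract names a cross-attention mechanism and a gated weight adjustment module but does not give their exact form. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention between two modalities followed by a sigmoid-gated blend. It is not the IBCAM implementation: the function names, the scalar gate, the absence of learned projection matrices, and the toy feature values are all assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row (one modality)
    attends over all key/value rows (another modality)."""
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended


def gated_fusion(own, attended, gate_weights, gate_bias=0.0):
    """Hypothetical gate: a scalar sigmoid per time step decides how much of
    the modality's own feature versus the attended feature to keep."""
    fused = []
    for x, a in zip(own, attended):
        # x + a concatenates the two feature lists before the linear gate
        z = sum(w * f for w, f in zip(gate_weights, x + a)) + gate_bias
        g = 1.0 / (1.0 + math.exp(-z))
        fused.append([g * xi + (1.0 - g) * ai for xi, ai in zip(x, a)])
    return fused


# Toy features (made up): 2 text steps and 3 audio steps, 2 dimensions each
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]

attended = cross_attention(text, audio, audio)  # text queries attend to audio
fused = gated_fusion(text, attended, gate_weights=[0.0, 0.0, 0.0, 0.0])
```

With all gate weights set to zero the gate value is 0.5, so each fused vector is simply the average of the text feature and its audio-attended counterpart. In a trained model the gate weights would be learned, letting the network shift weight toward whichever source is more informative.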
Computer Science