A Graph Convolution-Based Heterogeneous Fusion Network for Multimodal Sentiment Analysis

Tong Zhao, Junjie Peng, Yansong Huang, Lan Wang, Huiran Zhang, Zesu Cai
DOI: https://doi.org/10.1007/s10489-023-05151-w
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: Multimodal sentiment analysis leverages various modalities, including text, audio, and video, to determine human sentiment tendencies, which is significant in fields such as intention understanding and opinion analysis. However, multimodal sentiment analysis faces two critical challenges: one is how to effectively extract and integrate information from various modalities, which is important for reducing the heterogeneity gap among modalities; the other is how to overcome information forgetting when modelling long sequences, which leads to significant information loss and adversely affects the fusion performance of modalities. To address these issues, this paper proposes a multimodal heterogeneity fusion network based on graph convolutional neural networks (HFNGC). A shared convolutional aggregation mechanism is used to overcome the semantic gap among modalities and reduce the noise caused by modality heterogeneity. In addition, the model applies Dynamic Routing to convert modality features into graph structures. By learning semantic information in the graph representation space, the model improves its ability to learn long-range dependencies. Furthermore, the model integrates complementary information among modalities and explores intra- and inter-modal interactions during the modality fusion stage. To validate the effectiveness of the model, we conduct experiments on two benchmark datasets. The experimental results demonstrate that our method outperforms existing methods, exhibiting strong generalisation capability and high competitiveness.
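The abstract names graph convolution as the core operation but does not spell it out. As a rough illustration only, the sketch below shows one generic graph convolution step in the widely used symmetric-normalised form, ReLU(D^-1/2 (A + I) D^-1/2 X W). It is not the authors' HFNGC implementation: the function name `gcn_layer`, the toy chain graph, and the all-ones weight matrix are assumptions made for this example, and the paper's shared convolutional aggregation and Dynamic Routing steps are not modelled here.

```python
import math

def gcn_layer(features, adjacency, weights):
    """One generic graph convolution step (illustrative, not HFNGC itself).

    features:  n x d_in list of node feature rows
    adjacency: n x n list, 1.0 where two nodes are connected
    weights:   d_in x d_out list (hypothetical, normally learned)
    """
    n = len(adjacency)
    d_in = len(features[0])
    d_out = len(weights[0])

    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]

    # Symmetric normalisation: entry (i, j) is divided by sqrt(deg_i * deg_j).
    deg = [sum(row) for row in a_hat]
    a_norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]

    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(a_norm[i][k] * features[k][f] for k in range(n))
            for f in range(d_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(d_in)))
             for o in range(d_out)] for i in range(n)]

# Toy example: three nodes in a chain (0 - 1 - 2), one-hot node features.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
features = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
weights = [[1.0, 1.0],
           [1.0, 1.0],
           [1.0, 1.0]]

out = gcn_layer(features, adjacency, weights)
```

Each output row mixes a node's own features with those of its neighbours, so stacking such layers lets information travel across several hops of the graph. This is one plausible reading of how learning in a graph representation space can help with long-range dependencies, although the paper's exact mechanism may differ.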