Low-rank tensor fusion and self-supervised multi-task multimodal sentiment analysis
Xinmeng Miao, Xuguang Zhang, Haoran Zhang
DOI: https://doi.org/10.1007/s11042-023-18032-8
IF: 2.577
2024-01-13
Multimedia Tools and Applications
Abstract: Multimodal sentiment analysis plays an important role in the field of smart education. To achieve high performance in Multimodal Sentiment Analysis (MSA) tasks, a model must effectively capture the information conveyed by the individual modal representations. The primary objective is to learn both the complementarity and the correlation of the various modalities; however, existing methods often fall short in capturing either complementary or correlated information. To address this problem, this paper proposes a multi-task multimodal sentiment analysis framework based on low-rank tensor fusion and self-supervision. In this model, low-rank tensor fusion combined with the Mish activation function is used to capture inter-modal correlation information, and a unimodal label generation module combined with the Mish activation function is introduced to capture inter-modal complementary information. The principle of multi-task learning is then applied to combine these two tasks, enhancing the model's ability to capture information. We conducted comprehensive experiments on two widely used MSA datasets, CMU-MOSI and CMU-MOSEI, to evaluate the proposed model. The experimental results demonstrate the effectiveness of the model in achieving advanced performance in MSA tasks.
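The fusion step named in the abstract can be illustrated with a minimal NumPy sketch of generic low-rank tensor fusion followed by Mish. This is not the authors' implementation: the function names, feature dimensions, rank, and output size are all assumptions chosen for illustration, and the unimodal label generation and multi-task parts are not shown.

```python
import numpy as np

def mish(x):
    # Mish activation: x * tanh(softplus(x)); logaddexp(0, x) is a stable softplus.
    return x * np.tanh(np.logaddexp(0.0, x))

def low_rank_fusion(feats, factors):
    """Fuse modalities without building the full outer-product tensor.

    feats:   list of (batch, d_m) arrays, one per modality (hypothetical shapes).
    factors: list of (rank, d_m + 1, d_out) arrays, one per modality.
    """
    batch = feats[0].shape[0]
    fused = None
    for x, w in zip(feats, factors):
        # Append a constant 1 so unimodal and lower-order terms are retained.
        z = np.concatenate([np.ones((batch, 1)), x], axis=1)
        # Project each modality with its rank-specific factors: (rank, batch, d_out).
        proj = np.einsum("bi,rio->rbo", z, w)
        # Element-wise product across modalities replaces the outer product.
        fused = proj if fused is None else fused * proj
    # Sum over the rank dimension to obtain the fused representation.
    return fused.sum(axis=0)

# Illustrative inputs: batch of 4, assumed text/audio/video sizes 8/5/6, rank 3.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
audio = rng.normal(size=(4, 5))
video = rng.normal(size=(4, 6))
factors = [rng.normal(size=(3, d + 1, 16)) * 0.1 for d in (8, 5, 6)]

hidden = mish(low_rank_fusion([text, audio, video], factors))
print(hidden.shape)  # (4, 16)
```

The element-wise product of per-modality projections is equivalent to contracting the full multimodal outer product with a weight tensor of the given rank, which keeps the cost linear in the number of modalities.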
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering