Multimodal Sentiment Analysis Based on Transformer and Low-rank Fusion

Qishang Shan, Xiangsen Wei
DOI: https://doi.org/10.1109/CAC53003.2021.9727438
2021-10-22
Abstract: Human emotion is usually expressed through multiple modalities of information, such as natural speech, facial expressions and vocal signals. Each modality carries features of human emotion, and cross-checking the features of different modalities against one another can greatly improve the accuracy of sentiment analysis. It is therefore important for sentiment analysis to fully explore the features of each modality and to analyze the correlations between them. This paper proposes MulT-LMF, a fusion model based on MulT and LMF in which the LMF model is applied to the output part of the MulT model. The MulT model helps to construct the interaction information among the modalities, while the LMF model helps to further extract the correlations among the modal features. The model is tested on public multimodal sentiment datasets, including CMU-MOSI, CMU-MOSEI and IEMOCAP. Experimental results show that the model achieves better results in most cases.
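The low-rank fusion step that LMF contributes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `low_rank_fusion`, the feature dimensions, the rank and the random weights are all assumptions chosen for illustration, and the three input vectors merely stand in for per-modality outputs such as those a MulT-style encoder would produce. Each feature vector is extended with a constant 1, projected by its own set of rank-specific factors, multiplied elementwise with the projections of the other modalities, and summed over the rank.

```python
import numpy as np


def low_rank_fusion(features, factors):
    """Fuse per-modality vectors with low-rank modality-specific factors.

    features: list of 1-D arrays, one per modality, shape (d_m,)
    factors:  list of arrays, one per modality, shape (rank, d_m + 1, d_out)
    Returns a fused vector of shape (d_out,).
    """
    fused = None
    for z, w in zip(features, factors):
        # Append a constant 1 so unimodal and lower-order terms are kept.
        z1 = np.concatenate([z, [1.0]])
        # Project the modality into (rank, d_out) using its own factors.
        proj = np.einsum("d,rdo->ro", z1, w)
        # Combine modalities by elementwise product.
        fused = proj if fused is None else fused * proj
    # Sum the rank-1 contributions.
    return fused.sum(axis=0)


# Hypothetical sizes; the abstract does not state the real configuration.
rng = np.random.default_rng(0)
dims = {"text": 4, "audio": 3, "vision": 2}
rank, d_out = 2, 5

features = [rng.normal(size=d) for d in dims.values()]
factors = [rng.normal(size=(rank, d + 1, d_out)) for d in dims.values()]

h = low_rank_fusion(features, factors)
print(h.shape)  # (5,)
```

The point of the factorization is that the full outer-product tensor of the three modalities is never materialized; the result equals contracting that tensor with a weight tensor of the given rank, at a cost that grows linearly rather than multiplicatively in the feature dimensions.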
Computer Science