Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning

Bo Yang, Lijun Wu, Jinhua Zhu, Bo Shao, Xiaola Lin, Tie-Yan Liu
DOI: https://doi.org/10.1109/taslp.2022.3178204
2022-01-01
Abstract: Multimodal Sentiment Analysis (MSA) is a challenging research area that studies sentiment expressed through multiple heterogeneous modalities. Given that pre-trained language models such as BERT have shown state-of-the-art (SOTA) performance in multiple NLP disciplines, existing models tend to integrate these modalities into BERT and treat MSA as a single prediction task. However, we find that simply fusing the multimodal features into BERT cannot fully exploit the power of a strong pre-trained model. Moreover, the classification ability of each modality is suppressed by single-task learning. In this paper, we propose a multimodal framework named Two-Phase Multi-task Sentiment Analysis (TPMSA). It applies a two-phase training strategy to make the most of the pre-trained model and a novel multi-task learning strategy to investigate the classification ability of each representation. We conducted experiments on two multimodal benchmark datasets, CMU-MOSI and CMU-MOSEI. The results show that our TPMSA model outperforms the current SOTA method on both datasets across most of the metrics, demonstrating the effectiveness of our proposed methods.
engineering, electrical & electronic,acoustics