Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition

Zirun Guo, Tao Jin, Zhou Zhao

2024-07-07

Abstract: The development of multimodal models has significantly advanced multimodal sentiment analysis and emotion recognition. However, in real-world applications, the presence of various missing modality cases often leads to a degradation in the model's performance. In this work, we propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities. Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts. These prompts enable the generation of missing modality features and facilitate the learning of intra- and inter-modality information. Through prompt learning, we achieve a substantial reduction in the number of trainable parameters. Our proposed method outperforms other methods significantly across all evaluation metrics. Extensive experiments and ablation studies are conducted to demonstrate the effectiveness and robustness of our method, showcasing its ability to effectively handle missing modalities.

Computation and Language, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper addresses missing modalities in multimodal sentiment analysis and emotion recognition. It proposes a multimodal Transformer framework that uses prompt learning to handle inputs in which one or more modalities are absent.

#### The main contributions are as follows:

1. **A new framework**: The framework uses prompt learning to address missing modalities in sentiment analysis and emotion recognition tasks. It is computationally efficient and handles missing modalities during both training and testing.
2. **Prompt count linear in the number of modalities**: The number of prompts across the three proposed types (generative prompts, missing-signal prompts, and missing-type prompts) grows linearly with the number of modalities, which significantly reduces the demand for computational resources.
3. **Three types of prompts**: Together, these prompts generate missing modality features and learn both intra-modality and inter-modality information.
4. **Performance on multiple datasets**: The proposed model significantly outperforms baseline methods on all evaluation metrics. The study also found that a 70% modality dropout rate during training gave the best model performance. The effectiveness and robustness of the method were validated through extensive experiments.
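To make the modality dropout mentioned above concrete, here is a minimal sketch of how a training sample could have modalities removed at a given rate. It is an illustration only, not the authors' implementation: the modality names, the `drop_modalities` helper, the per-modality independent sampling, and the rule of always keeping at least one modality are all assumptions made for this example.

```python
import random


def drop_modalities(sample, drop_rate=0.7, rng=random):
    """Return a copy of `sample` with some modalities replaced by None.

    Each modality is dropped independently with probability `drop_rate`.
    If every modality would be dropped, one is kept at random so the
    result always has at least one input (an assumption of this sketch,
    not something stated in the summary above).
    """
    kept = [name for name in sample if rng.random() >= drop_rate]
    if not kept:
        # sorted() gives a stable order so the choice depends only on rng
        kept = [rng.choice(sorted(sample))]
    return {name: (value if name in kept else None)
            for name, value in sample.items()}


if __name__ == "__main__":
    # Hypothetical pre-extracted features for one utterance
    features = {"text": [0.1, 0.2], "audio": [0.3], "video": [0.4, 0.5]}
    rng = random.Random(0)  # seeded for repeatable runs
    for _ in range(3):
        masked = drop_modalities(features, drop_rate=0.7, rng=rng)
        present = [name for name, value in masked.items() if value is not None]
        print("present modalities:", present)
```

In a prompt-based model, the `None` entries would be the positions that the generative prompts are meant to fill in; how that is actually done is described in the paper itself and is not reproduced here.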

Multimodal Prompting with Missing Modalities for Visual Recognition

Deep Correlated Prompting for Visual Recognition with Missing Modalities

Towards Robust Multimodal Prompting With Missing Modalities

MuAP: Multi-step Adaptive Prompt Learning for Vision-Language Model with Missing Modality

Multimodal Prompt Transformer with Hybrid Contrastive Learning for Emotion Recognition in Conversation

Prompt Link Multimodal Fusion in Multimodal Sentiment Analysis

Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Few-shot Multimodal Sentiment Analysis based on Multimodal Probabilistic Fusion Prompts

Multimodal Sentiment Analysis with Missing Modality: A Knowledge-Transfer Approach

Toward Robust Multimodal Learning using Multimodal Foundational Models

Modality-invariant and Specific Prompting for Multimodal Human Perception Understanding

Tag-assisted Multimodal Sentiment Analysis under Uncertain Missing Modalities

Visual Prompt Flexible-Modal Face Anti-Spoofing

Multimodal Sentiment Analysis based on Supervised Contrastive Learning and Cross-modal Translation under Modalities Missing

Accommodating Missing Modalities in Time-Continuous Multimodal Emotion Recognition

Syntax-aware Hybrid prompt model for Few-shot multi-modal sentiment analysis

Contrastive Learning based Modality-Invariant Feature Acquisition for Robust Multimodal Emotion Recognition with Missing Modalities

A Sentimental Prompt Framework with Visual Text Encoder for Multimodal Sentiment Analysis

Prompt Learning for Multimodal Intent Recognition with Modal Alignment Perception

ModalPrompt: Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models