Towards Robust Multimodal Sentiment Analysis with Incomplete Data

Haoyu Zhang, Wenbin Wang, Tianshu Yu

2024-11-01

Abstract: The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction seeking to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we consider it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and dominant modality based multimodal learning (DMML) module, which enhances the model's robustness across various noise scenarios by ensuring the quality of dominant modality representations. Aside from the methodical design, we perform comprehensive experiments under random data missing scenarios, utilizing diverse and meaningful settings on several popular datasets (e.g., MOSI, MOSEI, and SIMS), providing additional uniformity, transparency, and fairness compared to existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines, demonstrating superior performance across these challenging and extensive evaluation metrics.

Computation and Language, Artificial Intelligence, Multimedia

What problem does this paper attempt to address?

This paper addresses incomplete data in multimodal sentiment analysis (MSA). In practical applications, data can go missing because of problems such as sensor failures or automatic speech recognition (ASR) issues, and the paper focuses on keeping the model robust and accurate under those conditions. To this end, the authors propose the Language-dominated Noise-resistant Learning Network (LNLN), which improves robustness through the following mechanisms:

1. **Dominant Modality Correction (DMC)**:
   - It uses adversarial learning and a dynamic weighted enhancement strategy to reduce the impact of noise on the dominant modality (i.e., the language modality).
   - Specific steps include Completeness Check and Proxy Dominant Feature Generation.
2. **Dominant Modality based Multimodal Learning (DMML)**:
   - It fuses the corrected dominant modality features with the auxiliary modality (visual and audio) features to achieve effective multimodal classification.
3. **Reconstructor**:
   - It reconstructs missing information to further improve the robustness of the system.

The paper evaluates LNLN under different noise levels on several popular datasets (such as MOSI, MOSEI, and SIMS). The reported results show that LNLN handles missing data well and improves the accuracy and robustness of multimodal sentiment analysis.
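The random-missing setting and the dynamic weighting idea behind DMC can be illustrated with a small pure-Python sketch. It is not the authors' implementation, and it rests on several assumptions that the summary does not confirm: missing time steps are simulated by zeroing them out, the completeness score is computed directly from the mask (in the paper it would come from a learned completeness check), and the corrected feature is a simple convex combination of the noisy language feature and a proxy feature. The names `mask_random_steps` and `blend_with_proxy` are hypothetical.

```python
import random


def mask_random_steps(sequence, missing_rate, rng):
    """Zero out each time step independently with probability missing_rate.

    Assumption: "missing" data is simulated by replacing a step with zeros.
    Returns the masked sequence and the fraction of steps that survived.
    """
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be in [0, 1]")
    masked, kept = [], 0
    for step in sequence:
        if rng.random() < missing_rate:
            masked.append([0.0] * len(step))
        else:
            masked.append(list(step))
            kept += 1
    completeness = kept / len(sequence) if sequence else 1.0
    return masked, completeness


def blend_with_proxy(language_feats, proxy_feats, completeness):
    """Convex combination: trust the language features in proportion to
    their completeness and fall back on the proxy features otherwise.

    Assumption: a stand-in for the "dynamic weighted enhancement" step.
    """
    w = completeness
    return [
        [w * l + (1.0 - w) * p for l, p in zip(l_step, p_step)]
        for l_step, p_step in zip(language_feats, proxy_feats)
    ]


# Toy example: four time steps of 2-dimensional language features.
rng = random.Random(0)
language = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
proxy = [[0.5, 0.5] for _ in language]  # hypothetical proxy features

noisy, w = mask_random_steps(language, 0.5, rng)
corrected = blend_with_proxy(noisy, proxy, w)
print(f"completeness={w:.2f}")
print(corrected)
```

With a missing rate of 0 the language features pass through unchanged, and with a missing rate of 1 the output equals the proxy features, which is the behavior one would want from a correction step at the two extremes.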

Robust-MSA: Understanding the Impact of Modality Noise on Multimodal Sentiment Analysis

Towards Robust Multimodal Sentiment Analysis under Uncertain Signal Missing

Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning

MissModal: Increasing Robustness to Missing Modality in Multimodal Sentiment Analysis

Which is Making the Contribution: Modulating Unimodal and Cross-modal Dynamics for Multimodal Sentiment Analysis

Analyzing Modality Robustness in Multimodal Sentiment Analysis

DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis

Modality translation-based multimodal sentiment analysis under uncertain missing modalities

Hierarchical denoising representation disentanglement and dual-channel cross-modal-context interaction for multimodal sentiment analysis

Similar Modality Completion-Based Multimodal Sentiment Analysis under Uncertain Missing Modalities

Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention

Text-oriented Modality Reinforcement Network for Multimodal Sentiment Analysis from Unaligned Multimodal Sequences

Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning

Learning Discriminative Multi-Relation Representations for Multimodal Sentiment Analysis

Multimodal Mutual Attention-Based Sentiment Analysis Framework Adapted to Complicated Contexts

Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis

Missing Modality meets Meta Sampling (M3S): An Efficient Universal Approach for Multimodal Sentiment Analysis with Missing Modality

A text guided multi-task learning network for multimodal sentiment analysis

Weakening the Dominant Role of Text: CMOSI Dataset and Multimodal Semantic Enhancement Network

Evaluation of data inconsistency for multi-modal sentiment analysis