A Optimized BERT for Multimodal Sentiment Analysis

Jun Wu, Tianliang Zhu, Jiahui Zhu, Tianyi Li, Chunzhi Wang

DOI: https://doi.org/10.1145/3566126

2023-02-17

Abstract: Sentiment analysis of one modality (e.g., text or image) has been broadly studied. However, not much attention has been paid to the sentiment analysis of multi-modal data. As the research on and applications of multi-modal data analysis are becoming more and more broad, it is necessary to optimize BERT’s internal structure. This article proposes a hierarchical multi-head self-attention and gate channel BERT, which is an optimized BERT model. The model is composed of three modules: the hierarchical multi-head self-attention module realizes the hierarchical extraction process of features; the gate channel module replaces BERT’s original Feed Forward layer to realize information filtering; and the tensor fusion model based on a self-attention mechanism is utilized to implement the fusion process of different modal features. Experiments show that our method achieves promising results and improves accuracy by 5–6% when compared with traditional models on the CMU-MOSI dataset.

computer science, information systems, theory & methods, software engineering

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to address issues in multimodal sentiment analysis. Specifically, while sentiment analysis of unimodal data (such as text or images) has been extensively studied, sentiment analysis of multimodal data has received less attention. As research and application of multimodal data analysis become increasingly widespread, optimizing the BERT model to handle multimodal data becomes particularly important.

### Main Contributions

1. **Hierarchical Multi-Head Self-Attention Mechanism**: Achieves hierarchical extraction of data features by using a hierarchical multi-head self-attention mechanism.
2. **Gated Channels**: Replaces the original feedforward layer in the BERT model with gated channels to achieve information filtering.
3. **HG-BERT Model**: Proposes an optimized BERT model that combines hierarchical multi-head self-attention mechanisms and gated channels.
4. **Self-Attention-Based Feature Fusion**: Achieves information interaction between different modal features through a tensor fusion model based on the self-attention mechanism.

### Experimental Results

The experimental results show that the HG-BERT model outperforms traditional models on the CMU-MOSI dataset, with an accuracy improvement of 5–6%. This indicates that the optimized BERT model has better performance in multimodal sentiment analysis tasks.

### Conclusion

This paper proposes an optimized BERT-based model, HG-BERT, which significantly enhances the performance of multimodal sentiment analysis through hierarchical multi-head self-attention mechanisms, gated channels, and self-attention-based feature fusion methods. Future research can further optimize the design of head distribution and gating mechanisms.
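To make two of the ideas summarized above more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the summary does not give the equations of HG-BERT, so everything here is an assumption made for illustration. `gated_channel` assumes a common gating form, in which a sigmoid gate scales a candidate vector element-wise, as a stand-in for the "information filtering" role of the gate channel. `tensor_fusion` assumes the plain outer-product fusion used in Tensor Fusion Network style models and leaves out the self-attention step the paper places around its fusion. The function names, weight layout, and toy inputs are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(x, weight, bias):
    """Dense layer; `weight` is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def gated_channel(x, w_gate, b_gate, w_value, b_value):
    """Assumed gating form: out = sigmoid(W_g x + b_g) * (W_v x + b_v).

    Gate entries near 0 suppress a feature and entries near 1 pass it
    through, which is one way to realize information filtering.
    """
    gate = [sigmoid(g) for g in linear(x, w_gate, b_gate)]
    value = linear(x, w_value, b_value)
    return [g * v for g, v in zip(gate, value)]


def tensor_fusion(a, b):
    """Flattened outer product of [1; a] and [1; b].

    Appending a constant 1 keeps each unimodal feature in the result
    alongside every pairwise product between the two modalities.
    """
    a1 = [1.0] + list(a)
    b1 = [1.0] + list(b)
    return [ai * bj for ai in a1 for bj in b1]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With zero gate weights the gate is 0.5 everywhere, so the output
    # is half of the (identity-projected) input.
    print(gated_channel([2.0, -4.0], zeros, [0.0, 0.0], identity, [0.0, 0.0]))
    # Toy text feature [2.0] fused with toy audio feature [3.0].
    print(tensor_fusion([2.0], [3.0]))
```

In this sketch the fused vector has length `(len(a) + 1) * (len(b) + 1)`, so its size grows multiplicatively with the feature dimensions. That growth is one reason fusion models often add a selection or weighting step such as attention on top of the raw outer product.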

CM-BERT

Modality-invariant Temporal Representation Learning for Multimodal Sentiment Classification

MF-BERT: Multimodal Fusion in Pre-Trained BERT for Sentiment Analysis

Multimodal Sentiment Analysis Based on a Cross-Modal Multihead Attention Mechanism

Exploring Multimodal Sentiment Analysis via CBAM Attention and Double-layer BiLSTM Architecture

A cognitive brain model for multimodal sentiment analysis based on attention neural networks

Multimodal sentiment analysis based on multi-head attention mechanism

AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model

Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks

A Multimodal Sentiment Analysis Method Integrating Multi-Layer Attention Interaction and Multi-Feature Enhancement

Multimodal Sentiment Analysis Based on Information Bottleneck and Attention Mechanisms

Multi-Feature Fusion Multi-Modal Sentiment Analysis Model Based on Cross-Attention Mechanism

Multimodal sentiment analysis based on multiple attention

Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis

Multimodal Sentiment Analysis Based on BERT and ResNet

Adapting BERT for Target-Oriented Multimodal Sentiment Classification

Context-Dependent Multimodal Sentiment Analysis Based on a Complex Attention Mechanism

A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning

Video Sentiment Analysis with Bimodal Information-augmented Multi-Head Attention

Multimodal Sentiment Analysis Method Based on Hierarchical Adaptive Feature Fusion Network