Abstract: Aspect-level sentiment analysis in multimodal contexts, which aims to identify and interpret the sentiment expressed toward a target aspect across different data modalities, remains an active research area in artificial intelligence. However, most existing methods extract visual features from only one facet, such as facial expression, and ignore information from other key facets, such as the textual information present in the image, which results in information loss. To overcome this limitation, we propose a novel approach, Multi-faceted Information Extraction and Cross-mixture Fusion (MIECF), for Multimodal Aspect-based Sentiment Analysis. Our approach captures more comprehensive visual information from the image and integrates local and global key features from multiple facets. Local features, such as facial expressions and textual features, provide direct and rich emotional cues. By contrast, the global feature often reflects the overall emotional atmosphere and context. To enhance the visual representation, we design a Cross-mixture Fusion method that integrates this local and global multimodal information. In particular, the method establishes semantic relationships between local and global features to eliminate the ambiguity introduced by single-facet information and to achieve more accurate contextual understanding, providing a richer and more precise basis for sentiment analysis. Experimental results indicate that our approach achieves leading performance, with an Accuracy of 79.65% on the Twitter-2015 dataset and Macro-F1 scores of 75.90% and 73.11% on the Twitter-2015 and Twitter-2017 datasets, respectively.
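
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how Accuracy and Macro-F1 are commonly computed for a three-class sentiment task. It is not the authors' evaluation code. The function names, the label set, and the toy predictions are assumptions made for illustration only, and the numbers it yields have no relation to the reported results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy data: one gold label and one prediction per aspect.
LABELS = ["negative", "neutral", "positive"]
gold = ["positive", "positive", "neutral", "neutral", "neutral", "negative"]
pred = ["positive", "neutral", "neutral", "neutral", "negative", "negative"]

print(f"Accuracy: {accuracy(gold, pred):.4f}")
print(f"Macro-F1: {macro_f1(gold, pred, LABELS):.4f}")
```

Because Macro-F1 averages the per-class scores without weighting by class frequency, it penalizes poor performance on rare sentiment classes more than Accuracy does, which is why the two are often reported together.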

FMCF: Few-shot Multimodal Aspect-Based Sentiment Analysis Framework Based on Contrastive Finetuning

Sentiment Analysis Using Deep Robust Complementary Fusion of Multi-Features and Multi-Modalities

MSFNet: modality smoothing fusion network for multimodal aspect-based sentiment analysis

Hierarchical Fusion Network with Enhanced Knowledge and Contrastive Learning for Multimodal Aspect-Based Sentiment Analysis on Social Media

Multi-Grained Fusion Network with Self-Distillation for Aspect-Based Multimodal Sentiment Analysis

Aspects Are Anchors: Towards Multimodal Aspect-based Sentiment Analysis Via Aspect-driven Alignment and Refinement

MIECF: Multi-faceted information extraction and cross-mixture fusion for multimodal aspect-based sentiment analysis

Cross-modal fine-grained alignment and fusion network for multimodal aspect-based sentiment analysis

MFSC: A Multimodal Aspect-Level Sentiment Classification Framework with Multi-Image Gate and Fusion Networks

CLMLF: A Contrastive Learning and Multi-Layer Fusion Method for Multimodal Sentiment Detection

Self-adaptive attention fusion for multimodal aspect-based sentiment analysis

Multi-level textual-visual alignment and fusion network for multimodal aspect-based sentiment analysis

M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis

Few-shot Joint Multimodal Aspect-Sentiment Analysis Based on Generative Multimodal Prompt

A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis

AMIFN: Aspect-guided Multi-view Interactions and Fusion Network for Multimodal Aspect-based Sentiment Analysis

Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph

Targeted Aspect-Based Multimodal Sentiment Analysis: An Attention Capsule Extraction and Multi-Head Fusion Network

An Interactive Attention Mechanism Fusion Network for Aspect-Based Multimodal Sentiment Analysis

A Multimodal Sentiment Analysis Method Integrating Multi-Layer Attention Interaction and Multi-Feature Enhancement

Multimodal Sentiment Analysis Representations Learning via Contrastive Learning with Condense Attention Fusion