Abstract: Social text is user-generated content on social media that reflects the opinions and emotions of netizens in their social activities. Recognizing emotion in social text helps clarify netizens’ emotional stance on particular issues, products, or services, in support of applications such as opinion monitoring and marketing management. As information technology advances, however, modes of emotional expression have become increasingly complex and diverse. Social texts are usually short and noisy because users favor concise expression, and they are often accompanied by visual context and pictorial multi-grained semantics. Traditional approaches based on plain-text analysis are therefore of limited use for such content. To this end, this article addresses the semantic and emotional uncertainty of social text by considering both its textual content and its associated non-textual information, in order to improve the emotion semantic representations used for social text emotion recognition. Specifically, for visual context modeling, we first leverage the visual information associated with a social text as a contextual supplement that enhances the emotional semantics of the short text. We then model the semantics of the text at different granularities to fully mine its limited information. Together, visual context enhancement and multi-grained semantic mining alleviate the limited expressiveness of social text and the uncertainty of its semantics and emotions, yielding an effective emotion semantic representation for emotion recognition. Finally, extensive experiments on a real-world multi-modal dataset show that the proposed method achieves promising results with high efficiency.
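
As a rough illustration of the pipeline outlined in the abstract, the sketch below splits a short text into several granularities and uses a visual context vector as an attention query over each one before concatenating the results. It is not the authors' model. The hash-based toy embeddings, the choice of granularities (characters, words, word bigrams), the dimension `DIM`, and every function name here are assumptions made only for illustration; a real system would use learned text and image encoders.

```python
import hashlib
import math

DIM = 8  # hypothetical embedding size, chosen only for the sketch


def embed(token, dim=DIM):
    """Toy deterministic embedding: hash a token into a unit vector (stand-in for a learned encoder)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    vec = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def multi_grained_units(text):
    """Split text into character-, word-, and bigram-level units."""
    words = text.lower().split()
    chars = [c for w in words for c in w]
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return {"char": chars, "word": words, "bigram": bigrams}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention: weight each key by similarity to the query and return the weighted sum."""
    weights = softmax([sum(q * k for q, k in zip(query, key)) for key in keys])
    pooled = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(len(query))]
    return pooled, weights


def represent(text, visual_context):
    """Concatenate one visually guided summary vector per granularity."""
    fused = []
    for units in multi_grained_units(text).values():
        if not units:  # e.g. a one-word text has no bigrams
            fused.extend([0.0] * len(visual_context))
            continue
        pooled, _ = attend(visual_context, [embed(u) for u in units])
        fused.extend(pooled)
    return fused


# Stand-in for the output of an image encoder (assumption for the sketch).
visual_context = embed("image:sunset-beach")
features = represent("love this view", visual_context)
print(len(features))  # 24: three granularities, DIM values each
```

In this sketch the image acts only as a query that decides which text units matter most at each granularity; the concatenated vector would then be passed to an emotion classifier, which is omitted here.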

Fine-grained Sentiment Feature Extraction Method for Cross-modal Sentiment Analysis

Sentiment Analysis Using Deep Robust Complementary Fusion of Multi-Features and Multi-Modalities

Target-oriented Sentiment Classification with Sequential Cross-modal Semantic Graph

MFSC: A Multimodal Aspect-Level Sentiment Classification Framework with Multi-Image Gate and Fusion Networks

Attention-based multi-level image and text sentiment analysis

Visual-Textual Sentiment Analysis Enhanced by Hierarchical Cross-Modality Interaction

Multimodal Sentiment Analysis of Graphic Texts Based on Multicategorical Relative Fusion

Research on Image-text Multimodal Emotions Analysis with Fused Emoji

MIECF: Multi-faceted information extraction and cross-mixture fusion for multimodal aspect-based sentiment analysis

Feature Extraction Network with Attention Mechanism for Data Enhancement and Recombination Fusion for Multimodal Sentiment Analysis

Social Image-text Sentiment Classification With Cross-Modal Consistency and Knowledge Distillation

Social Image Sentiment Analysis by Exploiting Multimodal Content and Heterogeneous Relations

A Multimodal Sentiment Analysis Approach Based on a Joint Chained Interactive Attention Mechanism

Exploiting Visual Context and Multi-grained Semantics for Social Text Emotion Recognition

A Multimodal Sentiment Analysis Method Integrating Multi-Layer Attention Interaction and Multi-Feature Enhancement

Multimodal Sentiment Analysis Using Multi-tensor Fusion Network with Cross-modal Modeling

Cross-Modal Sentiment Analysis of Text and Video Based on Bi-GRU Cyclic Network and Correlation Enhancement

Cross-modal image sentiment analysis via deep correlation of textual semantic

Multimodal Sentiment Analysis With Image-Text Interaction Network

Various syncretic co‐attention network for multimodal sentiment analysis

A supervised contrastive learning-based model for image emotion classification