Abstract: Social media users increasingly express their feelings by sharing images online. Capturing the emotions embedded in these social images poses significant research challenges and has clear practical value. Most existing works concentrate on extracting visual features from a global view, ignoring the fact that individual visual objects are also rich in emotion. Leveraging multilevel visual features to improve sentiment analysis performance is therefore important yet challenging. In addition, existing works treat each social image as an independent sample and ignore the rich correlations among social images, which may help in detecting visual emotion. In this article, we propose a novel model called social relations-guided multiattention networks (SRGMANs), which incorporates both the multilevel (region-level and object-level) visual features of a single image and the correlations among multiple social images to conduct visual sentiment analysis. Specifically, we first construct a heterogeneous network consisting of various types of social relations and introduce a heterogeneous network embedding method to learn a network representation for each image. Then, two visual attention branches (a region attention network and an object attention network) are devised to extract emotional and discriminative visual features. For each branch, we design a self-attention module to capture the emotional dependencies among visual parts. A network-guided attention module is also designed in each branch to focus, under the guidance of the topology information, on the emotional visual parts most related to the network. Finally, the attended visual features from the two attention branches, together with the network representation features, are combined within a holistic framework to predict the sentiment of social images. Extensive experiments on three benchmark datasets demonstrate the superiority of our model.
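The two-branch attention and fusion pipeline described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the abstract does not specify the form of either attention module, so standard scaled dot-product self-attention is swapped in, and the network-guided attention is approximated by a bilinear score between each visual part and the image's network embedding. All function names, the projection matrices, the feature dimensions, the three-class output, and the random inputs (which stand in for real region/object features and a learned heterogeneous network embedding) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(parts):
    # Assumed form: scaled dot-product self-attention over visual parts.
    # parts has shape (n_parts, d); the output has the same shape.
    d = parts.shape[1]
    scores = parts @ parts.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ parts

def network_guided_attention(parts, net_emb, W):
    # Assumed form: score each part against the network embedding through a
    # hypothetical projection W of shape (d, k), then pool the parts.
    scores = (parts @ W) @ net_emb          # shape (n_parts,)
    weights = softmax(scores, axis=0)       # attention over parts
    return weights @ parts, weights         # pooled feature (d,), weights

def branch(parts, net_emb, W):
    # One attention branch: self-attention, then network-guided pooling.
    return network_guided_attention(self_attention(parts), net_emb, W)

def fuse_and_predict(region_feat, object_feat, net_emb, W_out):
    # Concatenate both attended features with the network representation
    # and map them to class probabilities with a linear layer.
    fused = np.concatenate([region_feat, object_feat, net_emb])
    return softmax(W_out @ fused)

# Toy inputs: random placeholders, not real image or network features.
rng = np.random.default_rng(0)
d, k, n_classes = 8, 4, 3                   # hypothetical sizes
regions = rng.normal(size=(5, d))           # 5 region-level parts
objects = rng.normal(size=(3, d))           # 3 object-level parts
net_emb = rng.normal(size=k)                # stand-in network embedding
W_region = rng.normal(size=(d, k))
W_object = rng.normal(size=(d, k))
W_out = rng.normal(size=(n_classes, 2 * d + k))

region_feat, region_w = branch(regions, net_emb, W_region)
object_feat, object_w = branch(objects, net_emb, W_object)
probs = fuse_and_predict(region_feat, object_feat, net_emb, W_out)
print(probs)
```

In a trained model the projection and output matrices would be learned jointly, and the network embedding would come from the heterogeneous network embedding step rather than from random draws.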

SentiNet: Mining Visual Sentiment from Scratch

Visual Exploration of Internet News Via Sentiment Score and Topic Models

MultiSentiNet: A Deep Semantic Network for Multimodal Sentiment Analysis

Beyond Object Recognition: Visual Sentiment Analysis with Deep Coupled Adjective and Noun Neural Networks

DeepSentiBank: Visual Sentiment Concept Classification with Deep Convolutional Neural Networks

Visual sentiment analysis based on image caption and adjective–noun–pair description

SentiNet: A Robust and Multilingual Sentiment Analysis System with Transfer Learning and Adversarial Training Techniques

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

Visual sentiment analysis with semantic correlation enhancement

MSNet: A Deep Architecture Using Multi-Sentiment Semantics for Sentiment-Aware Image Style Transfer

VisdaNet: Visual Distillation and Attention Network for Multimodal Sentiment Classification

SentiCap: Generating Image Descriptions with Sentiments

Sentiment Prediction in Scene Images Via Convolutional Neural Networks

SentiLR: Linguistic Knowledge Enhanced Language Representation for Sentiment Analysis

WeaveNet: End-to-End Audiovisual Sentiment Analysis

Visual Sentiment Analysis With Social Relations-Guided Multiattention Networks

SECT: Sentiment-Enriched Continual Training for Image Sentiment Analysis

Visual Sentiment Analysis Via Deep Multiple Clustered Instance Learning

Visual Sentiment Analysis Using Deep Learning Models with Social Media Data

Sentiment Analysis Based on Heterogeneous Multi-Relation Signed Network

OutdoorSent: Sentiment Analysis of Urban Outdoor Images by Using Semantic and Deep Features