Visual Sentiment Analysis by Leveraging Local Regions and Human Faces.

Ruolin Zheng, Weixin Li, Yunhong Wang
DOI: https://doi.org/10.1007/978-3-030-37731-1_25
2019-01-01
Abstract: Visual sentiment analysis (VSA) is a challenging task that has attracted wide attention from researchers because of its great application potential. Existing works on VSA mostly extract global representations of images for sentiment prediction, ignoring the different contributions of local regions. Some recent studies analyze local regions separately and improve sentiment prediction performance. However, most of them either treat regions equally during feature fusion, which ignores their distinct contributions, or use a global attention map whose performance is easily degraded by noise from non-emotional regions. In this paper, to solve these problems, we propose an end-to-end deep framework that effectively exploits the contributions of local regions to VSA. Specifically, a Sentiment Region Attention (SRA) module is proposed to estimate the contributions of local regions with respect to the global image sentiment. Features of these regions are then reweighted and fused according to their estimated contributions. Moreover, since image sentiment is usually closely related to the humans appearing in the image, we also propose to model the contribution of human faces as a special local region for sentiment prediction. Experimental results on publicly available and widely used VSA datasets demonstrate that our method outperforms state-of-the-art algorithms.
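The abstract does not spell out how the SRA module computes region contributions, so the following is only a minimal sketch of the general idea of reweighting and fusing region features. It assumes, purely for illustration, that each region's contribution is a softmax over its scaled dot-product similarity to the global image feature, and that a face feature is simply passed in as one more region. The names `softmax`, `dot`, and `fuse_regions` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_regions(global_feat, region_feats):
    """Weight each region by its similarity to the global feature, then sum.

    Assumption: contribution = softmax(scaled dot product with global_feat).
    Returns the per-region weights and the fused feature vector.
    """
    dim = len(global_feat)
    scale = math.sqrt(dim)
    scores = [dot(global_feat, region) / scale for region in region_feats]
    weights = softmax(scores)
    fused = [
        sum(w * region[i] for w, region in zip(weights, region_feats))
        for i in range(dim)
    ]
    return weights, fused


# Toy example: three regions; the last could stand in for a face region.
global_feat = [1.0, 0.0]
region_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, fused = fuse_regions(global_feat, region_feats)
print(weights, fused)
```

In this toy setup the region most aligned with the global feature receives the largest weight, which is the contrast with equal-weight fusion that the abstract draws. A learned attention network would replace the fixed dot-product score in practice.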