Robust Local-Global Feature Representation for Image Emotion Distribution Learning

Guanghua Li, Jianhua Liu, Juan Yang, Risheng Xu, Linbo Qing
DOI: https://doi.org/10.1145/3604078.3604092
2023-01-01
Abstract: With the growth of online platforms and the public's increasing social needs, the volume of multimedia data, especially images, is growing rapidly, which makes recognizing the emotions these images convey an important task. A single image may evoke multiple emotions, because different individuals may respond differently to the same content. Most current work focuses on the intrinsic connections among multiple emotion labels, in order to model the emotion distribution that traditional single-label methods ignore. However, such work does not take into account the image features that truly represent emotions. In this paper, we extract representative local and global image features and fuse them within a label distribution learning framework. Local features are extracted by a spatial- and channel-enhanced ConvNeXt network. To avoid the inductive biases of Convolutional Neural Networks (CNNs), we propose a relational-reasoning Vision Transformer (ViT) network that extracts global features and strengthens the connections between different image regions. Experiments on the Flickr_LDL and Twitter_LDL datasets achieve excellent results, demonstrating the effectiveness of the method.
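As a rough illustration of the idea described in the abstract, the sketch below fuses a local and a global feature vector and scores the predicted emotion distribution against an annotated one. The abstract does not specify the fusion operator or the loss, so concatenation and KL divergence (a common choice in label distribution learning) are assumptions here, and all function names, weights, and feature values are hypothetical placeholders rather than the authors' implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over emotions."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(local_feat, global_feat):
    """Assumed fusion: plain concatenation of local and global features."""
    return list(local_feat) + list(global_feat)


def predict_distribution(fused, weights, bias):
    """One linear layer followed by softmax (hypothetical prediction head)."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); terms with zero target mass contribute nothing."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


# Toy example: 2-dim local features, 2-dim global features, 3 emotions.
local_feat = [0.5, -1.0]    # stand-in for ConvNeXt-branch output
global_feat = [1.5, 0.25]   # stand-in for ViT-branch output
fused = fuse(local_feat, global_feat)

weights = [[0.0] * len(fused) for _ in range(3)]  # untrained head
bias = [0.0, 0.0, 0.0]
predicted = predict_distribution(fused, weights, bias)

target = [0.7, 0.2, 0.1]    # annotated emotion distribution (made up)
loss = kl_divergence(target, predicted)
```

Training such a model would minimise `loss` over the dataset, so that the predicted distribution moves toward the distribution of emotions that annotators assigned to each image.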