Abstract: Emotion recognition aims to interpret the emotional states of a person based on various inputs including audio, visual, and textual cues. This paper focuses on emotion recognition using visual features. To leverage the correlation between facial expression and the emotional state of a person, pioneering methods rely primarily on facial features. However, facial features are often unreliable in natural unconstrained scenarios, such as crowded scenes, where the face lacks sufficient pixel resolution and contains artifacts due to occlusion and blur. To address this, in-the-wild emotion recognition exploits full-body person crops as well as the surrounding scene context. In a bid to use body pose for emotion recognition, however, such methods fail to realize the potential that facial expressions, when available, offer. Thus, the aim of this paper is two-fold. First, we present our method, PERI, which leverages both body pose and facial landmarks. We create part-aware spatial (PAS) images by extracting key regions from the input image using a mask generated from both body pose and facial landmarks. This allows us to exploit body pose in addition to facial context whenever available. Second, to reason from the PAS images, we introduce context infusion (Cont-In) blocks. These blocks attend to part-specific information and pass it on to the intermediate features of an emotion recognition network. Our approach is conceptually simple and can be applied to any existing emotion recognition method. We report results on the publicly available in-the-wild EMOTIC dataset. Compared to existing methods, PERI achieves superior performance and leads to significant improvements in the mAP of emotion categories, while decreasing Valence, Arousal, and Dominance errors. Importantly, we observe that our method improves performance both on images with fully visible faces and on images with occluded or blurred faces.
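As a rough illustration of the PAS idea described in the abstract, the sketch below builds a soft mask from a set of landmark coordinates and applies it to an image so that only regions near body or face keypoints are retained. The Gaussian shape of the mask, the `sigma` value, and the function names `part_aware_mask` and `pas_image` are assumptions made for this example; the abstract only states that a mask is generated from body pose and facial landmarks, so this should not be read as the authors' implementation.

```python
import numpy as np

def part_aware_mask(height, width, landmarks, sigma=6.0):
    """Soft mask in [0, 1] that peaks at each (x, y) landmark.

    The Gaussian falloff and sigma are illustrative assumptions.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    mask = np.zeros((height, width), dtype=np.float32)
    for x, y in landmarks:
        sq_dist = (xs - x) ** 2 + (ys - y) ** 2
        blob = np.exp(-sq_dist / (2.0 * sigma ** 2)).astype(np.float32)
        # Keep the strongest response so overlapping parts do not exceed 1.
        mask = np.maximum(mask, blob)
    return mask

def pas_image(image, landmarks, sigma=6.0):
    """Apply the landmark mask to an H x W x C image (hypothetical helper)."""
    mask = part_aware_mask(image.shape[0], image.shape[1], landmarks, sigma)
    return image * mask[..., None]

# Toy usage: a uniform image with two made-up landmarks (x, y).
image = np.ones((64, 48, 3), dtype=np.float32)
landmarks = [(24, 10), (24, 40)]
pas = pas_image(image, landmarks)
```

In this toy setup, pixels at a landmark keep their original value while pixels far from every landmark are suppressed toward zero. When no facial landmarks are detected, the list would simply contain body keypoints only, which matches the abstract's point that facial context is used whenever it is available.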

FindingEmo: An Image Dataset for Emotion Recognition in the Wild

EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes

Establishing a Large Scale Dataset for Image Emotion Analysis Using Chinese Emotion Ontology

Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark

MovieEmotion-IMG: An Emotion Distribution Dataset of Movie Scene Images

Context Based Emotion Recognition Using EMOTIC Dataset

The PMEmo Dataset for Music Emotion Recognition

TamilEmo: Finegrained Emotion Detection Dataset for Tamil

EMO-KNOW: A Large Scale Dataset on Emotion and Emotion-cause

MIGMA: The Facial Emotion Image Dataset for Human Expression Recognition

Building and validation of a set of facial expression images to detect emotions: a transcultural study

Emo-FilM: A multimodal dataset for affective neuroscience using naturalistic stimuli

Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description

Emognition dataset: emotion recognition with self-reports, facial expressions, and physiology using wearables

SEED-VII: A Multimodal Dataset of Six Basic Emotions with Continuous Labels for Emotion Recognition

ArPanEmo: An Open-Source Dataset for Fine-Grained Emotion Recognition in Arabic Online Content during COVID-19 Pandemic

It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection

ArmanEmo: A Persian Dataset for Text-based Emotion Detection

Design and Realization of Learning Emotion Database

PERI: Part Aware Emotion Recognition In The Wild

EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues