Abstract: As an emerging research topic for proximity service (ProSe), automatic emotion recognition enables machines to understand human emotional changes, which not only facilitates natural, effective, seamless, and advanced human–robot interaction or human–computer interfaces but also promotes emotional health. Facial expression recognition (FER) is a vital task for emotion recognition. However, a significant gap between humans and machines remains in the FER task. In this paper, we present a conditional generative adversarial network-based approach that alleviates intra-class variations by individually controlling facial expressions and learning generative and discriminative representations simultaneously. The proposed framework consists of a generator $G$ and three discriminators ($D_{i}$, $D_{a}$, and $D_{exp}$). The generator $G$ transforms any query face image into another prototypic facial expression image while preserving other factors. Conditioned on action units, the generator $G$ pays more attention to information relevant to facial expression. Three loss functions ($L_{I}$, $L_{a}$, and $L_{exp}$), corresponding to the three discriminators ($D_{i}$, $D_{a}$, and $D_{exp}$), are designed to learn generative and discriminative representations. Moreover, after the generated expression is rendered back to its original facial expression, a cycle consistency loss is applied to preserve identity and produce more constrained visual representations. Optimized by combining both synthesis and classification loss functions, the learnt representation is explicitly disentangled from other variations such as identity, head pose, and illumination. Qualitative and quantitative experimental results demonstrate that the proposed FER system is effective for expression recognition.
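
The way the abstract combines three discriminator losses with a cycle consistency term can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' formulation: the abstract does not state the distance used for cycle consistency (a mean absolute difference is assumed), does not give any loss weights (the `weights` values are hypothetical), and the function names `cycle_consistency_loss` and `combined_generator_loss` are invented for this example.

```python
def cycle_consistency_loss(original, reconstructed):
    """Mean absolute difference between an image and its round-trip version.

    `original` is the query image and `reconstructed` is the result of
    generating a new expression and rendering it back to the original one.
    Both are flat sequences of pixel values. The L1 form is an assumption;
    the abstract does not specify the distance.
    """
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    if not original:
        raise ValueError("images must not be empty")
    total = sum(abs(a - b) for a, b in zip(original, reconstructed))
    return total / len(original)


def combined_generator_loss(loss_i, loss_a, loss_exp, loss_cyc,
                            weights=(1.0, 1.0, 1.0, 10.0)):
    """Weighted sum of the three discriminator losses and the cycle term.

    loss_i, loss_a, loss_exp stand for L_I, L_a, and L_exp from the abstract.
    The weights are hypothetical placeholders, not values from the paper.
    """
    w_i, w_a, w_exp, w_cyc = weights
    return w_i * loss_i + w_a * loss_a + w_exp * loss_exp + w_cyc * loss_cyc


if __name__ == "__main__":
    query = [0.0, 1.0, 0.5, 0.25]
    round_trip = [0.0, 0.5, 0.5, 0.25]
    l_cyc = cycle_consistency_loss(query, round_trip)
    print(combined_generator_loss(1.0, 2.0, 3.0, l_cyc))
```

A larger cycle weight, as assumed here, would push the generator to keep identity-related content intact across the round trip, which matches the role the abstract assigns to the cycle consistency loss.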

CFAN-SDA: Coarse-fine Aware Network with Static-Dynamic Adaptation for Facial Expression Recognition in Videos

SDNET: Lightweight Facial Expression Recognition For Sample Disequilibrium

DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition

cGAN Based Facial Expression Recognition for Human-Robot Interaction

SAANet: Siamese Action-Units Attention Network for Improving Dynamic Facial Expression Recognition

Automatic 4D Facial Expression Recognition via Collaborative Cross-domain Dynamic Image Network

From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos

CF-DAN: Facial-expression recognition based on cross-fusion dual-attention network

Multi-Attention Module for Dynamic Facial Emotion Recognition

NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition

Intensity-Aware Loss for Dynamic Facial Expression Recognition in the Wild

Video-based Facial Expression Recognition Using Graph Convolutional Networks

A Dual-Direction Attention Mixed Feature Network for Facial Expression Recognition

Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

Clip-aware expressive feature learning for video-based facial expression recognition

FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs

Fine-Grained Temporal-Enhanced Transformer for Dynamic Facial Expression Recognition

Facial Expression Recognition With Visual Transformers and Attentional Selective Fusion

Freq-HD: an Interpretable Frequency-based High-Dynamics Affective Clip Selection Method for In-the-wild Facial Expression Recognition in Videos

Visual Scene-Aware Hybrid and Multi-Modal Feature Aggregation for Facial Expression Recognition

Spatio-Temporal Facial Expression Recognition Using Convolutional Neural Networks and Conditional Random Fields