Abstract: Facial expression recognition in the wild is challenging due to various unconstrained conditions. Although existing facial expression classifiers are almost perfect at analyzing constrained frontal faces, they fail to perform well on partially occluded faces, which are common in the wild. In this paper, we propose a Convolutional Neural Network with attention mechanism (ACNN) that can perceive the occluded regions of the face and focus on the most discriminative unoccluded regions. ACNN is an end-to-end learning framework. It combines multiple representations from facial regions of interest (ROIs). Each representation is weighted via a proposed Gate Unit that computes an adaptive weight from the region itself according to its unobstructedness and importance. Considering different ROIs, we introduce two versions of ACNN: patch-based ACNN (pACNN) and global-local-based ACNN (gACNN). pACNN only pays attention to local facial patches. gACNN integrates local representations at the patch level with a global representation at the image level. The proposed ACNNs are evaluated on both real and synthetic occlusions, including a self-collected facial expression dataset with real-world occlusions (FED-RO), the two largest in-the-wild facial expression datasets (RAF-DB and AffectNet), and their modifications with synthesized facial occlusions. Experimental results show that ACNNs improve the recognition accuracy on both non-occluded and occluded faces. Visualization results demonstrate that, compared with the CNN without Gate Unit, ACNNs are capable of shifting the attention from the occluded patches to other related but unobstructed ones. ACNNs also outperform other state-of-the-art methods on several widely used in-the-lab facial expression datasets under the cross-dataset evaluation protocol.
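
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that a Gate Unit computes an adaptive weight from the region itself, so the sigmoid-of-a-linear-score form used here, the helper names (`gate_weight`, `fuse_patches`), and the random stand-in features and parameters are all assumptions made for illustration. In a real model the patch features would come from a CNN backbone and the gate parameters would be learned end to end.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_weight(feature, w, b):
    """Hypothetical gate: a scalar weight in (0, 1) computed from the
    patch feature itself (assumed form: sigmoid of a linear score)."""
    score = sum(f * wi for f, wi in zip(feature, w)) + b
    return sigmoid(score)


def fuse_patches(features, gate_params):
    """Scale each patch feature by its own gate weight, then concatenate.

    A patch whose gate weight is near 0 (for example, an occluded one)
    contributes little to the fused representation.
    """
    weights = [gate_weight(f, w, b) for f, (w, b) in zip(features, gate_params)]
    fused = [wt * v for f, wt in zip(features, weights) for v in f]
    return fused, weights


# Stand-in data: 4 patches with 3-dimensional features (assumed sizes).
random.seed(0)
num_patches, dim = 4, 3
features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
gate_params = [([random.gauss(0, 1) for _ in range(dim)], 0.0)
               for _ in range(num_patches)]

fused, weights = fuse_patches(features, gate_params)
```

Under the same assumptions, a gACNN-style variant would append a gated image-level feature vector to `fused` before classification, while a pACNN-style variant would use the patch features alone.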

Interrelated Fusion CNN with Statistical Grouping among Multipatches for Occluded Facial Expression Recognition

Occlusion Aware Facial Expression Recognition Using CNN With Attention Mechanism

An Improved SimAM Based CNN for Facial Expression Recognition

A convolution-transformer dual branch network for head-pose and occlusion facial expression recognition

Patch Attention Layer of Embedding Handcrafted Features in CNN for Facial Expression Recognition

Enhanced Hybrid Vision Transformer with Multi-Scale Feature Integration and Patch Dropping for Facial Expression Recognition

The Facial Expression Recognition Method Based on Image Fusion and CNN

Facial expression recognition in facial occlusion scenarios: A path selection multi-network

Facial Expression Recognition under Partial Occlusion Based on Fusion of Global and Local Features

Accurate and robust facial expressions recognition by fusing multiple sparse representation based classifiers

Hybrid heuristic mechanism for occlusion aware facial expression recognition scheme using patch based adaptive CNN with attention mechanism

TriCAFFNet: A Tri-Cross-Attention Transformer with a Multi-Feature Fusion Network for Facial Expression Recognition

Facial Expression Recognition Methods in the Wild Based on Fusion Feature of Attention Mechanism and LBP

Facial Expression Recognition with Visual Transformers and Attentional Selective Fusion

Research on facial expression recognition based on Multimodal data fusion and neural network

Collaborative Attention Transformer on Facial Expression Recognition under Partial Occlusion

FERMixNet: an Occlusion Robust Facial Expression Recognition Model with Facial Mixing Augmentation and Mid-Level Representation Learning

Uncertain and Biased Facial Expression Recognition Based on Depthwise Separable Convolutional Neural Network with Embedded Attention Mechanism

Patch Attention Network for Video Facial Expression Recognition

Multimodal 2D+3D Facial Expression Recognition with Deep Fusion Convolutional Neural Network