Abstract: Facial action unit (AU) detection is challenging because correlated information is difficult to capture from subtle and dynamic AUs. Existing methods often localize the correlated regions of AUs, but predefining local AU attention from correlated facial landmarks often discards essential parts, while learned global attention maps often contain irrelevant areas. Furthermore, existing relational reasoning methods often apply common patterns to all AUs while ignoring the specific pattern of each AU. To tackle these limitations, we propose a novel adaptive attention and relation (AAR) framework for facial AU detection. Specifically, we propose an adaptive attention regression network that regresses the global attention map of each AU under the constraint of attention predefinition and the guidance of AU detection, which is beneficial for capturing both landmark-specified dependencies in strongly correlated regions and globally distributed facial dependencies in weakly correlated regions. Moreover, considering the diversity and dynamics of AUs, we propose an adaptive spatio-temporal graph convolutional network that simultaneously reasons about the independent pattern of each AU, the inter-dependencies among AUs, and the temporal dependencies. Extensive experiments show that our approach (i) achieves competitive performance on challenging benchmarks, including BP4D, DISFA, and GFT in constrained scenarios and Aff-Wild2 in unconstrained scenarios, and (ii) can precisely learn the regional correlation distribution of each AU.
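
To make the idea of "attention predefinition" concrete, the following is a minimal illustrative sketch, not the method described in the abstract. It assumes that a predefined attention map is built by placing a Gaussian falloff around landmark-derived AU centers, and that the constraint on the regressed map is a mean squared error. The Gaussian form, the MSE loss, the map size, and the names `predefined_attention` and `attention_constraint_loss` are all hypothetical choices made for illustration.

```python
import numpy as np


def predefined_attention(centers, size=8, sigma=1.5):
    """Build one attention map from landmark-derived centers given as (row, col).

    Assumption: Gaussian falloff around each center, merged by element-wise
    maximum. The abstract does not specify this form.
    """
    rows, cols = np.mgrid[0:size, 0:size]
    attention = np.zeros((size, size), dtype=np.float64)
    for r, c in centers:
        dist_sq = (rows - r) ** 2 + (cols - c) ** 2
        attention = np.maximum(attention, np.exp(-dist_sq / (2.0 * sigma ** 2)))
    return attention


def attention_constraint_loss(regressed, predefined):
    """Mean squared error between a regressed map and its predefined prior.

    Assumption: MSE stands in for whatever constraint the framework uses.
    """
    return float(np.mean((regressed - predefined) ** 2))


# Two hypothetical AU centers on an 8x8 feature grid.
prior = predefined_attention([(2, 2), (2, 5)])
```

In this sketch the prior peaks at the assumed centers and decays with distance, so a regressed map constrained toward it can still place nonzero weight on regions the landmarks do not cover.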

Action Unit Detection with Joint Adaptive Attention and Graph Relation

Facial Action Unit Detection Via Adaptive Attention and Relation

Facial Action Unit Detection Using Attention and Relation Learning

Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection

Spatio-Temporal AU Relational Graph Representation Learning For Facial Action Units Detection

Weakly-Supervised Attention and Relation Learning for Facial Action Unit Detection

An Attention-based Method for Action Unit Detection at the 3rd ABAW Competition

Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment

Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection

Facial Action Units Detection Aided by Global-Local Expression Embedding

Facial Action Unit Detection by Exploring the Weak Relationships Between AU Labels

Learning facial expression-aware global-to-local representation for robust action unit detection

Facial Action Unit Recognition Based on Self-Attention Spatiotemporal Fusion

Region And Temporal Dependency Fusion For Multi-Label Action Unit Detection

JÂA-Net: Joint Facial Action Unit Detection and Face Alignment Via Adaptive Attention

Facial Action Unit Recognition by Prior and Adaptive Attention

Action Unit Detection by Exploiting Spatial-Temporal and Label-Wise Attention with Transformer

Data-aware relation learning-based graph convolution neural network for facial action unit recognition

Attention Based Relation Network for Facial Action Units Recognition

MMA-Net: Multi-view Mixed Attention Mechanism for Facial Action Unit Detection

View-Independent Facial Action Unit Detection