Abstract: Across billions of faces shaped by thousands of cultures and ethnicities, one thing remains universal: the way emotions are expressed. To take the next step in human-machine interaction, a machine (e.g., a humanoid robot) must be able to interpret facial emotions. Micro-expressions are involuntary, transient facial expressions that can reveal genuine emotions. A system that recognizes micro-expressions gains deeper insight into a person's true feelings and can take human emotion into account when making decisions. For instance, such machines will be able to detect dangerous situations, alert caregivers to challenges, and provide appropriate responses. We propose a new hybrid neural network (NN) model capable of micro-expression recognition in real-time applications. Several NN models are first compared in this study. A hybrid NN model is then created by combining a convolutional neural network (CNN), a recurrent neural network (RNN, e.g., long short-term memory (LSTM)), and a vision transformer. The CNN extracts spatial features (within a local neighborhood of an image), whereas the LSTM summarizes temporal features. In addition, a transformer with an attention mechanism can capture sparse spatial relations within an image or between frames of a video clip. The inputs of the model are short facial videos, and the outputs are the micro-expressions recognized from those videos. The NN models are trained and tested on publicly available facial micro-expression datasets to recognize different micro-expressions (e.g., happiness, fear, anger, surprise, disgust, sadness). Score fusion and improvement metrics are also presented in our experiments. The results of our proposed models are compared with those of methods reported in the literature and tested on the same datasets. The proposed hybrid model performs best, and score fusion can dramatically increase recognition performance.
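The abstract does not state which fusion rule is used, so the following is only a minimal sketch of one common form of score fusion: a weighted sum of per-class probabilities from two classifiers. The function names, the two model branches ("CNN-LSTM" and "vision transformer"), and the logit values are hypothetical and chosen purely for illustration; only the six emotion labels come from the abstract.

```python
import math

# Emotion classes named in the abstract.
LABELS = ["happiness", "fear", "anger", "surprise", "disgust", "sadness"]


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(per_model_probs, weights=None):
    """Weighted-sum score fusion (an assumed rule, not necessarily the paper's).

    per_model_probs: list of probability vectors, one per model.
    weights: optional per-model weights; defaults to a plain average.
    """
    n_models = len(per_model_probs)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(per_model_probs[0])
    fused = [0.0] * n_classes
    for w, probs in zip(weights, per_model_probs):
        for k in range(n_classes):
            fused[k] += w * probs[k]
    return fused


# Hypothetical logits from two branches for one short facial video clip.
cnn_lstm_probs = softmax([1.2, 0.1, 0.3, 1.0, 0.2, 0.0])  # favors "happiness"
vit_probs = softmax([0.4, 0.2, 0.1, 1.6, 0.3, 0.1])       # favors "surprise"

fused = fuse_scores([cnn_lstm_probs, vit_probs])
predicted = LABELS[max(range(len(fused)), key=fused.__getitem__)]
print(predicted)
```

In this toy case the two branches disagree, and the fused score follows the more confident one. Unequal weights (for example, tuned on a validation split) are a natural variation, though whether the authors do this is not stated in the abstract.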

Facial Micro-Motion-Aware Mixup for Micro-Expression Recognition

Facial micro-expression recognition based on motion magnification network and graph attention mechanism

Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

A Boost in Revealing Subtle Facial Expressions: A Consolidated Eulerian Framework

MMNet: Muscle motion-guided network for micro-expression recognition

Facial Prior Based First Order Motion Model for Micro-expression Generation

Boosting Micro-Expression Recognition Via Self-Expression Reconstruction and Memory Contrastive Learning

From Macro to Micro: Boosting micro-expression recognition via pre-training on macro-expression videos

MixAugment & Mixup: Augmentation Methods for Facial Expression Recognition

Facial Prior Guided Micro-Expression Generation

FERMixNet: an Occlusion Robust Facial Expression Recognition Model with Facial Mixing Augmentation and Mid-Level Representation Learning

AU-assisted Graph Attention Convolutional Network for Micro-Expression Recognition

AST+SVMNet: A Novel Decomposition Method for Micro-Expression Recognition Based on Fusion Attention and Improved Spatio-Temporal Convolution by Feature Transfer

ESTME: Event-driven Spatio-temporal Motion Enhancement for Micro-Expression Recognition

Micro-expression Video Clip Synthesis Method Based on Spatial-temporal Statistical Model and Motion Intensity Evaluation Function

Facial Micro-Expression Recognition Enhanced by Score Fusion and a Hybrid Model from Convolutional LSTM and Vision Transformer

Aggregation of Motion Features of Multiple Paths for Micro-Expression Recognition

MixCut: A Data Augmentation Method for Facial Expression Recognition

Micro-Expression Recognition Based on Multi-task Learning and Resnet18

Geometric Graph Representation with Learnable Graph Structure and Adaptive AU Constraint for Micro-Expression Recognition

Objective Class-based Micro-Expression Recognition through Simultaneous Action Unit Detection and Feature Aggregation