Abstract: In everyday life, dynamic facial expressions are simply continuous human responses to external events. In human-computer interaction, however, rapidly recognizing changes in facial expressions from video streams is a relatively complex process. This complexity makes Dynamic Facial Expression Recognition (DFER) a critical research task in computer vision and image processing. This paper analyses the correlations and contrasts between static and dynamic facial expression research, highlighting key issues in the study of dynamic facial expressions, such as dynamic feature extraction and frame extraction. It then enumerates significant algorithms among both traditional models and deep learning models, analysing the advantages and disadvantages of these two major approaches, and investigates the reasons behind the transition of DFER research from traditional methods to deep learning approaches. The paper focuses on two notable models from each approach: for traditional models, Histogram of Oriented Gradients (HOG) for processing raw images and Support Vector Machine (SVM) for data classification; for deep learning models, Convolutional Neural Network (CNN) for spatial feature extraction and Long Short-Term Memory (LSTM) for temporal feature extraction. These models are discussed in detail with respect to their strengths and weaknesses, operational processes, and performance outcomes. The concluding section summarizes the main factors influencing research in this field and the challenges it currently faces. Finally, the paper reviews recent methodologies and suggests directions for future research.

Learning Dynamics for Video Facial Expression Recognition

Joint Structured Sparsity Regularized Multiview Dimension Reduction for Video-Based Facial Expression Recognition

Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories

Video-Based Facial Expression Recognition Using Histogram Sequence of Local Gabor Binary Patterns from Three Orthogonal Planes

Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition

The investigation of traditional models and machine learning models in dynamic facial expression recognition

SAANet: Siamese Action-Units Attention Network for Improving Dynamic Facial Expression Recognition

Combining 2D Gabor and Local Binary Pattern for Facial Expression Recognition Using Extreme Learning Machine

Observe Finer to Select Better: Learning Key Frame Extraction Via Semantic Coherence for Dynamic Facial Expression Recognition in the Wild

Learning facial expression and body gesture visual information for video emotion recognition

Video-driven state-aware facial animation

Video-Based Facial Animation with Detailed Appearance Texture

Learning Expression Features via Deep Residual Attention Networks for Facial Expression Recognition From Video Sequences

Displaced Dynamic Expression Regression for Real-Time Facial Tracking and Animation

DPCNet: Dual Path Multi-Excitation Collaborative Network for Facial Expression Representation Learning in Videos

Nonlinear dynamical system iteration applied in video face feature extraction and recognition

Video-based Facial Expression Recognition Using Graph Convolutional Networks

Dual-STI: Dual-path spatial-temporal interaction learning for dynamic facial expression recognition

Automatic 4D Facial Expression Recognition Using Dynamic Geometrical Image Network

Learning Face Expression Features from Video Using Spatio-Temporal Feature Extractor and CNN-LSTM