A survey on Graph Deep Representation Learning for Facial Expression Recognition

Théo Gueuret, Akrem Sellami, Chaabane Djeraba
2024-11-13
Abstract: This comprehensive review delves deeply into the various methodologies applied to facial expression recognition (FER) through the lens of graph representation learning (GRL). Initially, we introduce the task of FER and the concepts of graph representation and GRL. Afterward, we discuss some of the most prevalent and valuable databases for this task. We explore promising approaches for graph representation in FER, including graph diffusion, spatio-temporal graphs, and multi-stream architectures. Finally, we identify future research opportunities and provide concluding remarks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses open challenges in Facial Expression Recognition (FER), in particular how Graph Representation Learning (GRL) can improve the accuracy and robustness of FER. It focuses on the following key issues:

1. **Facial Expression Recognition in Complex Environments**:
   - FER is difficult under illumination changes, occlusions, and head pose changes. Traditional deep learning methods such as Convolutional Neural Networks (CNNs) have made progress, but they still have limitations in these conditions.
2. **Efficient Encoding of Facial Features**:
   - How to encode and represent facial features effectively is an important issue. Traditional grid-based methods are insufficient for capturing facial structure and its dynamic changes. Graph representation learning offers an alternative that can better model the relationships between facial features.
3. **Choice between Discrete and Continuous Emotion Representations**:
   - Researchers need to choose between discrete emotion categories (such as happy or sad) and continuous emotion models. Discrete categories provide accurate classification but may overlook subtle emotional changes; continuous models can capture more detail, but classification accuracy may decrease.
4. **Cross-Domain Data Differences**:
   - Data distributions differ widely among FER datasets, so models generalize poorly from one dataset to another. Constructing a model that performs well across datasets remains an urgent problem.
5. **Modeling of Spatio-Temporal Information**:
   - Dynamic facial expression recognition must consider the spatial and temporal dimensions simultaneously. How to model this spatio-temporal information effectively, so as to improve the accuracy and robustness of recognition, is also an important research direction.

### Specific Problem Summary

- **How to deal with the challenges of facial expression recognition in complex environments?**
- **How to encode facial features more efficiently through graph representation learning?**
- **How to find a balance between discrete and continuous emotion representations?**
- **How to solve the problem of cross-domain data differences?**
- **How to effectively model spatio-temporal information to improve the performance of dynamic facial expression recognition?**

These problems reflect the current difficulties in facial expression recognition and point out directions for future research. By introducing graph representation learning, the paper aims to explore new methods that overcome these challenges and thus advance FER technology.
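
To make the graph-based encoding of facial features and the spatio-temporal modeling mentioned above more concrete, here is a minimal Python sketch. It is not a method taken from the survey. It assumes that each video frame is already reduced to a set of 2D facial landmarks (random coordinates stand in for a real landmark detector), that spatial edges come from a k-nearest-neighbour rule, and that temporal edges link each landmark to itself in the next frame. The function names `knn_adjacency`, `spatio_temporal_adjacency`, and `gcn_layer` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def knn_adjacency(points, k=3):
    """Symmetric k-nearest-neighbour adjacency over 2D landmark coordinates."""
    n = len(points)
    # Pairwise Euclidean distances between landmarks.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # exclude self from neighbour search
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)           # make the graph undirected

def spatio_temporal_adjacency(frames, k=3):
    """Spatial k-NN edges inside each frame, plus temporal edges that link
    every landmark to the same landmark in the following frame."""
    t, n, _ = frames.shape
    adj = np.zeros((t * n, t * n))
    idx = np.arange(n)
    for i in range(t):
        s = i * n
        adj[s:s + n, s:s + n] = knn_adjacency(frames[i], k)
        if i + 1 < t:
            adj[s + idx, s + n + idx] = 1.0
            adj[s + n + idx, s + idx] = 1.0
    return adj

def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(len(adj))          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)

# Assumed toy input: 4 frames, 10 landmarks per frame, (x, y) coordinates.
rng = np.random.default_rng(0)
frames = rng.random((4, 10, 2))
adj = spatio_temporal_adjacency(frames, k=3)
x = frames.reshape(-1, 2)                   # node features: raw coordinates
w = rng.standard_normal((2, 8))             # untrained random projection
h = gcn_layer(adj, x, w)                    # one embedding row per node
```

Each row of `h` mixes a landmark's own coordinates with those of its spatial neighbours and of its counterparts in adjacent frames, which is the kind of relational information that a grid-based representation does not expose directly. A practical system would learn `w`, stack several layers, and pool the node embeddings before classifying the expression.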