Learning Effective Global Receptive Field for Facial Expression Recognition

Jiayi Han, Ang Li, Donghong Han, Jianfeng Feng
DOI: https://doi.org/10.1109/FG57933.2023.10042628
2023-01-01
Abstract: Facial expression recognition (FER) remains a challenging task despite years of effort because of variations in view angle and human pose and the occlusion of expression-relevant facial parts. In this work, we propose GCNet, which learns an effective Global receptive field and Class-sensitive metrics for FER. GCNet contains a class-sensitive metric learning module (CSMLM) and mobile dilation modules (MDMs). CSMLM takes full advantage of the variation in human faces to extract class-sensitive and spatially consistent features, improving the effectiveness of FER. Each MDM uses cascaded dilated convolution layers to achieve a global receptive field. However, directly adding a dilated convolution layer to a given sequence of convolution layers may cause the gridding problem, which leads to sparse feature maps. We therefore find the upper bound on the dilation rate of the additional convolution layer that avoids the gridding problem. Experiments show that the proposed approach reaches state-of-the-art (SOTA) performance on the RAF-DB, FER-Plus, and SFEW2.0 datasets.
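The gridding problem mentioned in the abstract can be illustrated with a minimal 1-D sketch. This is not the paper's derivation or its bound: it assumes stride-1 convolutions with an odd kernel size, and the helper names (`receptive_field`, `covered_offsets`, `has_gridding`) are hypothetical. It enumerates which input positions can influence one output position and reports gridding when some positions inside the receptive field are never reached.

```python
def receptive_field(kernel_size, dilations):
    """Width of the 1-D receptive field of stacked stride-1 convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def covered_offsets(kernel_size, dilations):
    """Input offsets, relative to one output position, that can reach that output.

    Assumes an odd kernel size so the taps are symmetric around the center.
    """
    half = kernel_size // 2
    taps = range(-half, half + 1)
    offsets = {0}
    for d in dilations:
        # Each layer shifts every reachable offset by (dilation * tap index).
        offsets = {o + d * t for o in offsets for t in taps}
    return offsets


def has_gridding(kernel_size, dilations):
    """True if some positions inside the receptive field are never sampled."""
    reached = len(covered_offsets(kernel_size, dilations))
    return reached < receptive_field(kernel_size, dilations)


if __name__ == "__main__":
    for dilations in ([1, 3], [1, 4], [1, 2, 4], [2, 2, 2]):
        print(dilations, receptive_field(3, dilations), has_gridding(3, dilations))
```

In this toy setting, a 3-tap layer with dilation 1 followed by a layer with dilation 3 still covers every position, while raising the second dilation to 4 leaves holes. That is the kind of threshold an upper bound on the added layer's dilation rate would capture.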