VAD-Net: Multidimensional Emotion Recognition from Facial Expression Images

Yi Huo, Yun Ge
DOI: https://doi.org/10.1109/IJCNN60899.2024.10651071
2024-06-30
Abstract: Current FER (Facial Expression Recognition) datasets are mostly labeled with emotion categories such as happy, angry, sad, fear, disgust, surprise, and neutral, which are limited in expressiveness. Future affective computing, however, requires more comprehensive and precise emotion metrics, which can be measured with multidimensional VAD (Valence-Arousal-Dominance) parameters. AffectNet has added VA (Valence and Arousal) information but still lacks D (Dominance). This research therefore introduces VAD annotation on the FER2013 dataset and takes the initiative to label the D (Dominance) dimension. To further improve VAD prediction accuracy, it applies orthogonalized convolution to the regression network to extract more diverse and expressive features. Experimental results show that the D dimension can be measured but is harder to obtain than the V and A dimensions, in both manual annotation and regression model prediction. An ablation test on the orthogonal convolution verifies that better VAD prediction is achieved with orthogonalized convolution. The research thus provides initial annotation work for the D (Dominance) dimension on an FER dataset and proposes a better regression network for VAD prediction through orthogonalization. The newly built VAD-annotated FER2013 dataset could act as a benchmark for measuring multidimensional VAD emotions, while the orthogonalized regression network could act as a baseline for VAD facial expression recognition. The newly labeled VAD dataset and the prediction baseline code are publicly available on GitHub: https://github.com/YeeHoran/VAD-Net.
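The abstract does not say how the convolution is orthogonalized. As a rough illustration only, the sketch below assumes one common choice: a soft penalty equal to the squared Frobenius norm of W W^T - I, where each row of W is one flattened filter. The function names and the toy filters are hypothetical and are not taken from the VAD-Net code; the paper's actual formulation may differ.

```python
# Illustrative sketch only: a soft orthogonality penalty on convolution filters.
# Assumption: the penalty is ||W W^T - I||_F^2, where each row of W is one
# flattened filter. This is not necessarily the formulation used in VAD-Net.


def flatten_filters(kernels):
    """Flatten each filter (nested lists: channels x height x width) into one row."""
    return [[w for channel in k for row in channel for w in row] for k in kernels]


def orthogonality_penalty(kernels):
    """Return the squared Frobenius norm of (W W^T - I) for the flattened filters."""
    rows = flatten_filters(kernels)
    penalty = 0.0
    for i, r_i in enumerate(rows):
        for j, r_j in enumerate(rows):
            # Entry (i, j) of the Gram matrix W W^T.
            gram = sum(a * b for a, b in zip(r_i, r_j))
            # Orthonormal filters give the identity matrix.
            target = 1.0 if i == j else 0.0
            penalty += (gram - target) ** 2
    return penalty


# Hypothetical toy filters: two single-channel 1x2 filters per set.
orthonormal = [[[[1.0, 0.0]]], [[[0.0, 1.0]]]]  # mutually orthogonal, unit norm
redundant = [[[[1.0, 0.0]]], [[[1.0, 0.0]]]]    # two copies of the same filter
```

Under this assumption the penalty is zero for the orthonormal pair and positive for the duplicated pair, so adding it to a regression loss would discourage redundant filters, which matches the abstract's stated goal of more diverse features.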
Computer Science