Improving Multi-Label Facial Expression Recognition with Consistent and Distinct Attentions

Jing Jiang, Weihong Deng
DOI: https://doi.org/10.1109/taffc.2023.3333874
IF: 13.99
2024-01-01
IEEE Transactions on Affective Computing
Abstract: Facial expression recognition (FER) attracts much attention in computer vision. Previous works mostly study the single-label FER problem, while the more complex multi-label FER task remains underexplored. Multi-label FER is more challenging than the single-label task for two primary reasons. On the one hand, fewer multi-label facial expression data are available for analysis. On the other hand, the entanglement of expressions makes recognition more difficult. In this work, we leverage class activation maps (CAMs) to improve the performance of multi-label FER. First, considering the shortage of training data, we propose an attention flipping consistency (AFC) loss equipped with random rotation augmentation. It constrains the network to produce consistent CAMs under horizontal flipping and thus improves the stability of the network without any extra data. Second, based on the prior knowledge that different emotions predominantly activate different facial regions, we propose a label-guided spatial attention dispersing (SAD) loss that enables the model to learn from distinct expression-related regions. By combining the widely used multi-label classification loss (i.e., binary cross-entropy loss) with the proposed AFC and SAD losses, our method achieves state-of-the-art performance on multi-label FER databases and improves the model's interpretability.
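The abstract does not give the loss formulas, so the following is only a minimal NumPy sketch of how the three terms might be combined, under stated assumptions. The consistency term is assumed to be a mean squared difference between the flipped CAMs of the original image and the CAMs of the flipped image; the dispersing term is assumed to be the mean pairwise cosine similarity between the CAMs of the labels present in the image. The function names, the weights `w_afc` and `w_sad`, and the `(classes, height, width)` CAM layout are all hypothetical, and the random rotation augmentation is omitted.

```python
import numpy as np


def bce_loss(probs, labels, eps=1e-7):
    """Binary cross-entropy averaged over classes (multi-label setting)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_class = labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs)
    return float(-np.mean(per_class))


def flip_consistency_loss(cams, cams_of_flipped):
    """Assumed AFC-style term: CAMs of the original image, flipped along the
    width axis, should match the CAMs computed on the flipped image."""
    flipped_back = cams[:, :, ::-1]
    return float(np.mean((flipped_back - cams_of_flipped) ** 2))


def dispersing_loss(cams, labels, eps=1e-7):
    """Assumed SAD-style term: penalize spatial overlap (cosine similarity)
    between the CAMs of the labels that are present in the image."""
    mask = labels.astype(bool)
    n = int(mask.sum())
    if n < 2:
        return 0.0  # nothing to disperse with fewer than two active labels
    active = cams[mask].reshape(n, -1)
    unit = active / (np.linalg.norm(active, axis=1, keepdims=True) + eps)
    similarity = unit @ unit.T
    off_diagonal = similarity[~np.eye(n, dtype=bool)]
    return float(np.mean(off_diagonal))


def total_loss(probs, labels, cams, cams_of_flipped, w_afc=1.0, w_sad=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (
        bce_loss(probs, labels)
        + w_afc * flip_consistency_loss(cams, cams_of_flipped)
        + w_sad * dispersing_loss(cams, labels)
    )
```

In this sketch, a network whose CAMs are perfectly flip-equivariant incurs zero consistency penalty, and two active labels whose CAMs cover disjoint regions incur zero dispersing penalty, which matches the intuition described in the abstract even though the exact formulation in the paper may differ.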
What problem does this paper attempt to address?