Pursuing Knowledge Consistency: Supervised Hierarchical Contrastive Learning for Facial Action Unit Recognition

Yingjie Chen, Chong Chen, Xiao Luo, Jianqiang Huang, Xian-Sheng Hua, Tao Wang, Yun Liang
DOI: https://doi.org/10.1145/3503161.3548116
2022-01-01
Abstract: With the increasing need for emotion analysis, facial action unit (AU) recognition has attracted growing attention as a fundamental task in affective computing. Although deep learning has boosted the performance of AU recognition to a new level in recent years, it remains challenging to extract subject-consistent representations, since the appearance changes caused by AUs are subtle and ambiguous across subjects. We observe that there are three kinds of inherent relations among AUs, which can be treated as strong prior knowledge, and that pursuing the consistency of such knowledge is the key to learning subject-consistent representations. To this end, we propose a supervised hierarchical contrastive learning method (SupHCL) for AU recognition that pursues knowledge consistency among different facial images and different AUs, and that is orthogonal to methods focusing on network architecture design. Specifically, SupHCL contains three relation consistency modules, i.e., unary, binary, and multivariate relation consistency modules, each of which takes the corresponding kind of inherent relation as extra supervision to encourage knowledge-consistent distributions of both AU-level and image-level representations. Experiments conducted on two commonly used AU benchmark datasets, BP4D and DISFA, demonstrate the effectiveness of each relation consistency module and the superiority of SupHCL.
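
The abstract does not spell out SupHCL's loss functions, so the following is only a minimal sketch of a generic supervised contrastive loss, the family of objectives the method's name refers to. It is not the paper's unary, binary, or multivariate module. The function name `supervised_contrastive_loss`, the `temperature` parameter, and the use of a single integer label per sample (for example, whether one AU is active) are assumptions made for illustration.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative only).

    features: (N, D) array of non-zero embeddings.
    labels:   (N,) array; samples sharing a label are treated as positives.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = features.shape[0]

    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # An anchor is never compared with itself.
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax over all other samples, shifted for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Average the log-probability of positives, per anchor that has any.
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # assumption: a batch without positive pairs contributes nothing
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos]
    return float(-(pos_log_prob / pos_count[has_pos]).mean())


# Hypothetical toy batch: two tight clusters of 2-D embeddings.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_aligned = supervised_contrastive_loss(feats, np.array([0, 0, 1, 1]))
loss_mismatched = supervised_contrastive_loss(feats, np.array([0, 1, 0, 1]))
```

In this toy batch, labels that agree with the embedding clusters yield a lower loss than labels that cut across them, which is the behaviour a contrastive objective relies on: same-label representations are pulled together and different-label representations are pushed apart.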