Collaborative Attention Transformer on Facial Expression Recognition under Partial Occlusion

Yan Luo, Jie Shao, Runxia Yang
DOI: https://doi.org/10.1117/1.jei.31.2.023037
IF: 0.829
2022-01-01
Journal of Electronic Imaging
Abstract: Recent research on facial expression recognition (FER) in the wild shows that challenges remain. Unlike the laboratory-controlled expressions studied in the past, images in the wild contain more uncertainties, such as various forms of facial occlusion, ambiguous facial images, and noisy labels. Among these, real-world facial occlusion is the most general and crucial challenge for FER. In addition, because of the COVID-19 epidemic, people have to wear masks in public, which brings new challenges to FER tasks. Motivated by the recent success of the Transformer on numerous computer vision tasks, we propose a Collaborative Attention Transformer (CAT) network that first uses the Cross-Shaped Window Transformer as the backbone for the FER task. Meanwhile, two attention modules work in collaboration: a Channel-Spatial Attention Module is designed to increase the network's attention to global features, and a Window Attention Gate is used to enhance the model's ability to focus on local details. The proposed method is evaluated on two public in-the-wild facial expression datasets, RAF-DB and FERPlus, and the results demonstrate that CAT outperforms state-of-the-art methods. © 2022 SPIE and IS&T
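The abstract names a Channel-Spatial Attention Module but does not describe its internals. As a rough illustration of the general idea of gating a feature map first per channel and then per spatial location, here is a minimal sketch. It is not the paper's module: the function name `channel_spatial_attention`, the parameter-free pooling-plus-sigmoid gates, and the (C, H, W) nested-list layout are all assumptions made for illustration, and a real module would normally use learned weights.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_spatial_attention(feat):
    """Reweight a feature map given as a nested list of shape (C, H, W).

    Hypothetical, parameter-free illustration (not the paper's design):
    1. Channel gate: one weight per channel from its global average.
    2. Spatial gate: one weight per location from the mean over channels.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])

    # Channel gate: global average pool per channel, squashed by a sigmoid.
    ch_w = [sigmoid(sum(sum(row) for row in ch) / (H * W)) for ch in feat]
    scaled = [[[v * ch_w[c] for v in row] for row in feat[c]] for c in range(C)]

    # Spatial gate: average over channels at each (i, j), squashed by a sigmoid.
    sp_w = [
        [sigmoid(sum(scaled[c][i][j] for c in range(C)) / C) for j in range(W)]
        for i in range(H)
    ]

    # Apply the spatial weights to every channel.
    return [
        [[scaled[c][i][j] * sp_w[i][j] for j in range(W)] for i in range(H)]
        for c in range(C)
    ]


if __name__ == "__main__":
    # Two channels, 2x2 spatial grid.
    feat = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[0.5, 0.5], [0.5, 0.5]],
    ]
    out = channel_spatial_attention(feat)
    print(len(out), len(out[0]), len(out[0][0]))
```

Because both gates lie strictly between 0 and 1, the output keeps the input's shape and never exceeds it in magnitude; only the relative emphasis across channels and locations changes.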