Graph-Based Multi-Modal Multi-View Fusion for Facial Action Unit Recognition

Jianrong Chen, Sujit Dey
DOI: https://doi.org/10.1109/access.2024.3401168
IF: 3.9
2024-05-24
IEEE Access
Abstract: Facial action unit (AU) detection is a key step in affective computing and plays an important role in applications such as human-computer interaction, psychology, and social robotics. Despite recent advances, facial AU detection remains challenging, particularly in real-world scenarios with diverse lighting conditions and head poses. This paper first presents a new, realistically challenging multi-modal, multi-view AU dataset captured in a real-world vehicle environment. It then introduces a novel graph-based multi-modal multi-view fusion framework, tailored for challenging environments such as those encountered in Advanced Driver-Assistance Systems (ADAS), which significantly enhances AU detection performance under these difficult conditions. The fusion model outperforms current single-modality methods, with a marked improvement in F1 scores across most AUs. Specifically, the fusion approach achieves a 9.0% improvement in overall average F1 score over the best-performing single-modality model. The results validate that integrating multiple modalities and viewpoints substantially boosts the model's robustness and accuracy under diverse conditions, offering a meaningful advance over the state of the art.
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic