An Effective Ensemble Learning Framework for Affective Behaviour Analysis
Wei Zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tianchen Guo, Xin Yu
DOI: https://doi.org/10.1109/cvprw63382.2024.00479
2024-01-01
Computer Vision and Pattern Recognition
Abstract: Affective Behavior Analysis aims to make technology emotionally intelligent, creating a world where devices can understand and react to our emotions as humans do. To comprehensively evaluate the authenticity and applicability of emotional behavior analysis techniques in natural environments, the 6th competition on Affective Behavior Analysis in-the-wild (ABAW) utilizes the Aff-Wild2, Hume-Vidmimic2, and C-EXPR-DB datasets to set up five competitive tracks, i.e., Valence-Arousal (VA) Estimation, Expression (EXPR) Recognition, Action Unit (AU) Detection, Compound Expression (CE) Recognition, and Emotional Mimicry Intensity (EMI) Estimation. In this paper, we present our method designs for the VA estimation, expression recognition, and AU detection tracks. Specifically, our framework comprises three components: 1) To obtain high-quality facial feature representations, we employ a Masked Autoencoder (MAE) as the visual feature extraction model and fine-tune it on our facial dataset. 2) We use a transformer-based feature fusion module to fully integrate the emotional information provided by audio signals, visual images, and transcripts, offering high-quality expression features for the downstream tasks. 3) Considering the complexity of the video collection scenes, we divide the dataset more finely based on scene characteristics and train a separate classifier for each scene. Extensive experiments demonstrate the superiority of our designs. Our work won the championship in the AU, EXPR, and VA tracks at the ABAW6 competition.
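The third component of the framework (dividing the data by scene and training one classifier per scene) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how scenes are defined or which classifier is used, so the scene names ("indoor", "outdoor"), the two-dimensional feature vectors, the helper names `train_per_scene` and `predict`, and the toy nearest-centroid classifier standing in for the real model are all assumptions made only for illustration.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Toy stand-in classifier (an assumption, not the paper's model):
    predicts the label whose mean feature vector is closest."""

    def fit(self, features, labels):
        grouped = defaultdict(list)
        for x, y in zip(features, labels):
            grouped[y].append(x)
        # Mean feature vector per class label.
        self.centroids = {
            y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in grouped.items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda y: dist(x, self.centroids[y]))


def train_per_scene(samples):
    """Split (scene, features, label) samples by scene and fit one model each."""
    by_scene = defaultdict(lambda: ([], []))
    for scene, x, y in samples:
        by_scene[scene][0].append(x)
        by_scene[scene][1].append(y)
    return {s: NearestCentroid().fit(xs, ys) for s, (xs, ys) in by_scene.items()}


def predict(models, scene, x):
    """Route a sample to the classifier trained for its scene."""
    return models[scene].predict(x)


# Hypothetical data: the same features map to different labels per scene.
samples = [
    ("indoor", [0.0, 0.0], "neutral"),
    ("indoor", [1.0, 1.0], "happy"),
    ("outdoor", [0.0, 0.0], "happy"),
    ("outdoor", [1.0, 1.0], "neutral"),
]
models = train_per_scene(samples)
print(predict(models, "indoor", [0.9, 0.8]))
print(predict(models, "outdoor", [0.9, 0.8]))
```

In this toy data the same feature vector is labeled differently depending on the scene, which is the situation where a single shared classifier would struggle and per-scene routing could plausibly help.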