Facial Expression Recognition Based on Multi-modal Features for Videos in the Wild

Chuanhe Liu, Yuanyuan Deng, Liyu Meng, Wenqiang Jiang, Yuchen Liu, Xinjie Zhang, Tenggan Zhang, Xiaolong Liu
DOI: https://doi.org/10.1109/CVPRW59228.2023.00624
2023-06-01
Abstract: This paper presents our submission to the Expression Classification Challenge of the 5th Affective Behavior Analysis in-the-wild (ABAW) Competition. In our method, multi-modal features are extracted by several different pretrained models and combined in different ways to capture more effective emotion information. Specifically, we extract efficient facial expression features using an MAE encoder pre-trained on a large-scale face dataset. For these combinations of visual and audio features, we use two kinds of temporal encoders to explore the temporal contextual information in the data. In addition, we employ several ensemble strategies across different experimental settings to obtain the most accurate expression recognition results. Our system achieves an average F1 score of 0.4072 on the test set of Aff-Wild2, ranking 2nd, which demonstrates the effectiveness of our method.
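The abstract does not say which ensemble strategies were used or how the F1 score is averaged. As a minimal sketch only, the example below assumes soft voting (a weighted average of per-frame class probabilities from several models) and assumes that "average F1 score" means the unweighted mean of per-class F1. The function names and the toy inputs are hypothetical and are not taken from the paper.

```python
def average_probabilities(model_probs, weights=None):
    """Soft voting: weighted mean of per-frame class probabilities.

    model_probs is a list with one entry per model; each entry is a list of
    frames, and each frame is a list of class probabilities.
    """
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0] * n_models  # assumption: equal weight per model
    total = sum(weights)
    n_frames = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    fused = []
    for t in range(n_frames):
        row = [
            sum(weights[m] * model_probs[m][t][c] for m in range(n_models)) / total
            for c in range(n_classes)
        ]
        fused.append(row)
    return fused


def predict_labels(fused):
    """Pick the class with the highest fused probability for each frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if p != c and t == c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return sum(scores) / num_classes


if __name__ == "__main__":
    # Two hypothetical models, two frames, two classes.
    model_a = [[0.9, 0.1], [0.4, 0.6]]
    model_b = [[0.5, 0.5], [0.8, 0.2]]
    fused = average_probabilities([model_a, model_b])
    print(predict_labels(fused))
    print(macro_f1([0, 1, 1, 0], [0, 1, 0, 0], num_classes=2))
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one on a given frame, which is one common reason to prefer soft voting when the combined models use different feature sets.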
Computer Science