Learning Dynamics for Video Facial Expression Recognition

Yu Wang, Xinmin Xu, Yao Zhuang
DOI: https://doi.org/10.1145/3508546.3508581
2021-01-01
Abstract: Video-based facial expression recognition has been a focus of attention in the computer vision community for decades. It aims to automatically identify one of several emotions expressed in a video from the input audio or visual information. Capturing the dynamics, namely the motion pattern, plays an important role in video-based facial expression recognition. In this paper, we explore an effective and efficient motion pattern for modeling temporal relationships, called the Diff-based Canny Operator (DCO), which guides inter-frame aggregation and generates a novel feature modality. The proposed DCO adds little computational cost and can be easily inserted into any framework, so we incorporate it into existing networks to form a unified structure for the video-based facial expression recognition task, which enables the network to extract temporal information effectively. In extensive experiments on the CK+ and AFEW datasets, our proposed method achieves better or comparable performance relative to state-of-the-art approaches at low FLOPs.
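The abstract does not spell out how DCO is computed. As a rough illustration of the general idea its name suggests (differencing consecutive frames, then taking an edge response of the result), the following minimal sketch may help. Everything in it is an assumption rather than the authors' method: the function names `frame_diff`, `sobel_magnitude`, and `motion_edge_map` are hypothetical, frames are plain grayscale grids, and only the gradient-and-threshold stage of a Canny-style detector is shown (no smoothing, non-maximum suppression, or hysteresis).

```python
from typing import List

Frame = List[List[float]]  # grayscale frame as rows of pixel intensities


def frame_diff(prev: Frame, curr: Frame) -> Frame:
    """Absolute per-pixel difference between two consecutive frames."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def sobel_magnitude(img: Frame) -> Frame:
    """Gradient magnitude from 3x3 Sobel kernels; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def motion_edge_map(prev: Frame, curr: Frame, threshold: float) -> List[List[int]]:
    """Binary map marking pixels where the frame difference has a strong edge.

    Assumption: a single fixed threshold stands in for the full Canny pipeline.
    """
    mag = sobel_magnitude(frame_diff(prev, curr))
    return [[1 if v >= threshold else 0 for v in row] for row in mag]
```

A map like this costs only a few arithmetic operations per pixel, which is consistent with the abstract's claim of low overhead, though how the paper actually uses it to guide inter-frame aggregation is not described here.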