3D Feature Extraction Network Based on Self-supervision for Micro-expression Spotting

Yuhan Wang, Xupeng Guo, Zhaoqiang Xia
DOI: https://doi.org/10.1109/icipmc62364.2024.10586609
2024-01-01
Abstract: Micro-expressions are an important cue for analyzing genuine human emotions. As the upstream task of micro-expression analysis, spotting must accurately locate the video frames in which a micro-expression occurs. At present, spotting relies mainly on manual annotation by experts, which does not scale to the massive volume of video in real-world scenarios. Because micro-expressions are brief and low in intensity, traditional hand-crafted feature extraction methods struggle to capture their subtle changes, while deep-learning-based methods are not robust enough. This paper therefore proposes a self-supervised facial feature extraction network that builds a more robust facial feature extractor for capturing the subtle changes in micro-expressions. Concretely, we split raw long videos into clips for model training and introduce a pixel-level mask operation to improve the model's reconstruction. We then reconstruct the optical flow sequence and the original face sequence with two 3D feature extraction networks that share an identical structure but have different parameters, and optimize those parameters by self-supervision. The results show that the proposed model captures robust features of subtle facial changes and improves the accuracy of micro-expression spotting on two datasets, CAS(ME)² and SAMM-LV.
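The clip splitting and pixel-level masking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_into_clips` and `mask_pixels`, the sliding-window scheme, the clip length, the stride, the mask ratio, and the fill value are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import random


def split_into_clips(num_frames, clip_len, stride):
    """Return (start, end) frame index pairs, end exclusive, covering a long video.

    Hypothetical helper: the abstract only says long videos are split into clips.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    if num_frames <= clip_len:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - clip_len + 1, stride))
    # Add one final clip so that trailing frames are not dropped.
    if starts[-1] + clip_len < num_frames:
        starts.append(num_frames - clip_len)
    return [(s, s + clip_len) for s in starts]


def mask_pixels(frame, mask_ratio, rng, fill=0):
    """Return a masked copy of a 2D frame and the set of masked flat positions.

    Hypothetical helper: individual pixels (not patches) are chosen uniformly
    at random and replaced by `fill`; the original frame is left untouched.
    """
    height, width = len(frame), len(frame[0])
    total = height * width
    num_masked = int(round(mask_ratio * total))
    masked_positions = set(rng.sample(range(total), num_masked))
    masked = [
        [fill if (r * width + c) in masked_positions else frame[r][c]
         for c in range(width)]
        for r in range(height)
    ]
    return masked, masked_positions


if __name__ == "__main__":
    clips = split_into_clips(num_frames=11, clip_len=4, stride=3)
    print(clips)
    frame = [[1] * 4 for _ in range(4)]
    masked, positions = mask_pixels(frame, mask_ratio=0.25, rng=random.Random(0))
    print(len(positions), "pixels masked")
```

In a self-supervised setup of this kind, the masked clip would be the network input and the unmasked clip the reconstruction target, so no manual spotting labels are needed for this training stage.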