Micro-expression recognition based on contextual transformer networks

Jun Yang, Zilu Wu, Renbiao Wu
DOI: https://doi.org/10.1007/s00371-024-03443-x
IF: 2.835
2024-06-14
The Visual Computer
Abstract: Micro-expressions are unintentional facial expressions that can reveal genuine emotion. As a contactless approach to affective computing, micro-expression analysis has received attention from researchers in psychology and computer vision. Because micro-expressions are short in duration, low in intensity, and involve sparse facial action units, learning effective micro-expression features from images or videos is challenging. To address these problems, we propose a contextual transformer-based dual-path network. First, because the apex frame conveys the most emotional information in a micro-expression, it is taken as the input for feature extraction. Second, a dual-path network serves as the basic feature extraction network, improving both fine-grained feature extraction and the exploration of new features. Third, a contextual transformer module is embedded to extract and fuse the global and neighboring local information of micro-expressions. Finally, to address sample imbalance, the focal loss function is adopted, which improves the classification of hard-to-classify samples by selectively weighting the loss values. The proposed micro-expression recognition model is validated with leave-one-subject-out cross-validation. Extensive experiments were conducted on SMIC, CASME II, SAMM, and the fused dataset. Compared with the baseline method, the unweighted F1 score and unweighted average recall on the fused dataset improve by 0.1542 and 0.1863, respectively. The experimental results show that the proposed model outperforms other mainstream models.
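The abstract mentions focal loss and two class-balanced metrics without giving formulas, so a minimal sketch of the standard definitions may help. It is not the authors' implementation. The function names, the default `gamma=2.0`, and the optional `alpha` weights are assumptions made for illustration; the abstract does not state the hyperparameters used. Focal loss scales the cross-entropy term by `(1 - p_t) ** gamma`, so confidently correct samples contribute little. Unweighted F1 (UF1) and unweighted average recall (UAR) are taken here as plain per-class averages, which keeps large classes from dominating the score.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for a single sample.

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : optional per-class weights (hypothetical, not from the paper)
    """
    # Clamp to avoid log(0) on a completely wrong prediction.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = (1.0 - p_t) ** gamma
    if alpha is not None:
        weight *= alpha[target]
    return -weight * math.log(p_t)


def uf1_uar(y_true, y_pred, num_classes):
    """Return (UF1, UAR): per-class F1 and recall averaged with equal weight."""
    f1s, recalls = [], []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        f1_denom = 2 * tp + fp + fn
        f1s.append(2 * tp / f1_denom if f1_denom else 0.0)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return sum(f1s) / num_classes, sum(recalls) / num_classes


if __name__ == "__main__":
    easy = focal_loss([0.9, 0.05, 0.05], target=0)  # confident, correct
    hard = focal_loss([0.2, 0.5, 0.3], target=0)    # true class underrated
    print(f"easy={easy:.5f} hard={hard:.5f}")

    uf1, uar = uf1_uar([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
    print(f"UF1={uf1:.4f} UAR={uar:.4f}")
```

Under leave-one-subject-out validation, the predictions from all held-out subjects would typically be pooled before calling `uf1_uar`, so that the true-positive, false-positive, and false-negative counts are accumulated across folds.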
computer science, software engineering