High-level LoRA and Hierarchical Fusion for Enhanced Micro-Expression Recognition

Zhiwen Shao, Yifan Cheng, Yong Zhou, Xiang, Jian Li, Bing Liu, Dit-Yan Yeung
DOI: https://doi.org/10.1007/s00371-024-03676-w
IF: 2.835
2024-01-01
The Visual Computer
Abstract: Micro-expression recognition (MER) remains challenging because micro-expressions are subtle and fleeting. Existing methods often suffer from insufficient training data or rely on handcrafted features. Inspired by recent advances in large language model fine-tuning and visual foundation models (VFMs), we propose HLoRA-MER, a novel framework that combines high-level low-rank adaptation (HLoRA) and a hierarchical fusion module (HFM). HLoRA fine-tunes the high-level layers of a VFM to capture facial muscle movement information, while HFM aggregates inter-frame and spatio-temporal features. Experiments on benchmark datasets demonstrate that HLoRA-MER outperforms state-of-the-art methods, achieving F1-scores of 84.24% and 83.07% on CASME II and SAMM, respectively, with only 197k trainable parameters. Our approach offers a promising solution for MER in both constrained and unconstrained scenarios. The code is available at https://github.com/CYF-cuber/HLoRA_MER_dinov2.
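The low-rank adaptation idea mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the class name `LoRALinear`, the helper `count_trainable`, the layer sizes, the rank, and the choice of how many top layers to adapt are all hypothetical, chosen only to show why adapting a few high-level layers keeps the trainable parameter count small. It assumes the standard LoRA formulation, in which a frozen weight W is augmented by a trainable product B A, with B initialized to zero.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration of generic LoRA, not the paper's code.
    """

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Pretrained weight: kept frozen during fine-tuning.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable down-projection A (rank x d_in).
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        # Trainable up-projection B (d_out x rank), zero-initialized so the
        # adapted layer starts out identical to the frozen layer.
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A and B are trained: rank * (d_in + d_out) values.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def count_trainable(num_adapted, d, rank):
    """Trainable parameters when only `num_adapted` layers carry adapters."""
    return num_adapted * LoRALinear(d, d, rank).trainable_params()
```

With these toy sizes, one adapted 8x8 layer at rank 2 trains 32 values instead of 64, and attaching adapters to only the last 4 layers of a deeper stack trains 128 values in total. The numbers are illustrative and are unrelated to the 197k figure reported in the abstract.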