Abstract: Unlike conventional facial expressions, micro-expressions are involuntary and transient facial expressions capable of revealing the genuine emotions that people attempt to hide. Therefore, they can provide important information in a broad range of applications such as lie detection and criminal detection. However, since micro-expressions are transient and of low intensity, their detection and recognition are difficult and rely heavily on expert experience. Due to its intrinsic particularity and complexity, video-based micro-expression analysis is attractive but challenging, and has recently become an active area of research. Although there have been numerous developments in this area, thus far there has been no comprehensive survey that provides researchers with a systematic overview of these developments with a unified evaluation. Accordingly, in this survey paper, we first highlight the key differences between macro- and micro-expressions, then use these differences to guide our research survey of video-based micro-expression analysis in a cascaded structure, encompassing the neuropsychological basis, datasets, features, spotting algorithms, recognition algorithms, applications and evaluation of state-of-the-art approaches. For each aspect, the basic techniques, advanced developments and major challenges are addressed and discussed. Furthermore, after considering the limitations of existing micro-expression datasets, we present and release a new dataset, called the micro-and-macro expression warehouse (MMEW), containing more video samples and more labeled emotion types.
We then perform a unified comparison of representative methods on CAS(ME)² for spotting, and on MMEW and SAMM for recognition, respectively. Finally, some potential future research directions are explored and outlined.

Micro-expression Spotting with Multi-scale Local Transformer in Long Videos

SpotFormer: Multi-Scale Spatio-Temporal Transformer for Facial Expression Spotting

Synergistic Spotting and Recognition of Micro-Expression via Temporal State Transition

MESNet: A Convolutional Neural Network for Spotting Multi-Scale Micro-Expression Intervals in Long Videos

Facial Micro-Expression Recognition Based on Multi-Scale Temporal and Spatial Features

Spotting Micro-Expressions on Long Videos Sequences

Integrating VideoMAE based model and Optical Flow for Micro- and Macro-expression Spotting

Micro-expression recognition based on contextual transformer networks

Micro-expression spotting with a novel wavelet convolution magnification network in long videos

3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame

Transfer Spatio-Temporal Knowledge from Emotion-Related Tasks for Facial Expression Spotting

Micro-expression Recognition with Small Sample Size by Transferring Long-Term Convolutional Neural Network

Two-Level Spatio-Temporal Feature Fused Two-Stream Network for Micro-Expression Recognition

MCCA-VNet: A Vit-Based Deep Learning Approach for Micro-Expression Recognition Based on Facial Coding

Micro-expression Recognition Using Dynamic Textures on Tensor Independent Color Space

Micro-expression recognition based on multi-scale 3D residual convolutional neural network

Micro-Expression Spotting Based on a Short-Duration Prior and Multi-Stage Feature Extraction

Micro-Expression Recognition by Aggregating Local Spatio-Temporal Patterns

PESFormer: Boosting Macro- and Micro-expression Spotting with Direct Timestamp Encoding

Dual-Branch Cross-Attention Network for Micro-Expression Recognition with Transformer Variants

Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms