Channel-Wise Interactive Learning for Remote Heart Rate Estimation From Facial Video

Qi Li, Dan Guo, Wei Qian, Xilan Tian, Xiao Sun, Haifeng Zhao, Meng Wang
DOI: https://doi.org/10.1109/TCSVT.2023.3332408
2024-06-01
Abstract: Remote photoplethysmography measurement (also called rPPG prediction) is a vision-based technique that allows for the non-contact monitoring of human physiological activity using facial video. However, precisely detecting subtle color changes on facial skin, especially in less-constrained real-life scenarios, remains a formidable challenge for rPPG prediction. In this work, we address an rPPG-based heart rate estimation task by proposing an end-to-end Channel-wise Interaction Network (CIN-rPPG), built around two specialized units: channel-temporal interactive learning (CIT) and channel-spatial interactive learning (CIS). The CIT unit captures the periodicity of the rPPG signal by using temporal-wise shifting and channel-wise scaling to model the interaction between the channel and temporal dimensions. The CIS unit applies spatial-wise scaling and channel-wise scaling simultaneously to perform channel-spatial interaction, which is intended to reveal where rPPG-related visual responses appear on the human face. We recover the rPPG signal by alternating CIT and CIS units. CIN-rPPG is implemented entirely with convolutional operations on the sequential 2D feature maps of facial video, in an end-to-end manner. Extensive experiments on three heart rate estimation datasets (UBFC-rPPG, PURE, and MMSE-HR) demonstrate that CIN-rPPG achieves state-of-the-art performance in both intra-dataset and cross-dataset testing.
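The abstract frames the task as recovering an rPPG signal and then estimating heart rate from it, but it does not describe that last step. A common approach, and not necessarily the one used in CIN-rPPG, is to take the dominant frequency of the recovered signal within a plausible heart-rate band. The sketch below illustrates that idea under stated assumptions: the function name `estimate_heart_rate_bpm`, the 0.7 to 4.0 Hz band (42 to 240 bpm), the 30 fps frame rate, and the synthetic sine-wave input are all hypothetical choices made for illustration, not details from the paper.

```python
import math


def estimate_heart_rate_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Return the dominant frequency of `signal` inside [low_hz, high_hz], in bpm.

    Uses a naive discrete Fourier transform so the sketch needs only the
    standard library. The band limits are an assumption, not from the paper.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset

    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n  # frequency of DFT bin k, in Hz
        if freq < low_hz or freq > high_hz:
            continue  # skip bins outside the assumed heart-rate band
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0  # Hz -> beats per minute


# Synthetic stand-in for a recovered rPPG signal: a 1.2 Hz sine sampled at 30 fps.
fps = 30
duration_s = 10
true_hz = 1.2
synthetic = [math.sin(2 * math.pi * true_hz * t / fps) for t in range(fps * duration_s)]
bpm = estimate_heart_rate_bpm(synthetic, fps)
```

With a 10-second window the frequency resolution is 0.1 Hz (6 bpm), so real pipelines typically use longer windows, interpolation, or peak detection in the time domain to refine the estimate.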
Computer Science