Abstract: Remote Photoplethysmography (rPPG) aims to measure physiological signals and Heart Rate (HR) from facial videos. Recent unsupervised rPPG estimation methods have shown promising potential in estimating rPPG signals from facial regions without relying on ground truth rPPG signals. However, these methods seem oblivious to interference existing in rPPG signals and still result in unsatisfactory performance. In this paper, we propose a novel De-interfered and Descriptive rPPG Estimation Network (DD-rPPGNet) to eliminate the interference within rPPG features for learning genuine rPPG signals. First, we investigate the characteristics of local spatial-temporal similarities of interference and design a novel unsupervised model to estimate the interference. Next, we propose an unsupervised de-interfered method to learn genuine rPPG signals with two stages. In the first stage, we estimate the initial rPPG signals by contrastive learning from both the training data and their augmented counterparts. In the second stage, we use the estimated interference features to derive de-interfered rPPG features and encourage the rPPG signals to be distinct from the interference. In addition, we propose an effective descriptive rPPG feature learning by developing a strong 3D Learnable Descriptive Convolution (3DLDC) to capture the subtle chrominance changes for enhancing rPPG estimation. Extensive experiments conducted on five rPPG benchmark datasets demonstrate that the proposed DD-rPPGNet outperforms previous unsupervised rPPG estimation methods and achieves competitive performances with state-of-the-art supervised rPPG methods.
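
As a rough illustration of how an HR value is commonly read off an rPPG signal, the sketch below picks the strongest frequency within a plausible pulse band and converts it to beats per minute. This is a generic spectral-peak estimate, not the procedure of DD-rPPGNet; the function name `estimate_heart_rate`, the 0.7 to 3.0 Hz band, and the synthetic trace are all assumptions made for the example.

```python
import numpy as np

def estimate_heart_rate(signal, fps, low_hz=0.7, high_hz=3.0):
    """Return the dominant in-band frequency of `signal` in beats per minute.

    The band limits are an assumption (roughly 42 to 180 bpm), not values
    taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                 # remove the DC offset
    window = np.hanning(signal.size)                # reduce spectral leakage
    power = np.abs(np.fft.rfft(signal * window)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)   # keep only the pulse band
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Synthetic 10 s trace at 30 fps: a 1.2 Hz pulse (72 bpm) plus Gaussian noise.
fps = 30
t = np.arange(300) / fps
rng = np.random.default_rng(0)
trace = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)

bpm = estimate_heart_rate(trace, fps)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

With a 10 s window the frequency resolution is 0.1 Hz, so estimates from this sketch are quantized to steps of 6 bpm; longer windows or spectral interpolation would be needed for finer resolution.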

What problem does this paper attempt to address?

The paper addresses interference in remote photoplethysmography (rPPG) signal estimation. It proposes an unsupervised method that extracts physiological signals from facial videos without using ground-truth rPPG signals for training. Existing unsupervised rPPG methods can estimate rPPG signals from facial regions to some extent, but they often overlook the interference present in those signals and perform poorly on datasets with challenging interference. To address this, the authors design the De-interfered and Descriptive rPPG Estimation Network (DD-rPPGNet), which consists of two main parts:

1. **Interference Estimation Branch**: Uses the local spatial-temporal similarity of interference to model and estimate it. This branch estimates interference features by analyzing signals from non-facial background regions.
2. **De-interfered rPPG Estimation Branch**: First estimates initial rPPG signals from the original videos and their augmented counterparts through contrastive learning. It then uses the interference features from the Interference Estimation Branch to remove interference components from the initial estimates, yielding de-interfered rPPG signals.

The paper also proposes a 3D Learnable Descriptive Convolution (3DLDC) that captures subtle chrominance changes on the skin to improve rPPG estimation. Experiments on five rPPG benchmark datasets show that DD-rPPGNet outperforms previous unsupervised rPPG estimation methods and performs competitively with state-of-the-art supervised rPPG methods.

In summary, the paper targets the weaknesses of current unsupervised rPPG estimation methods on facial videos with challenging interference, and proposes a new framework to improve the accuracy and robustness of rPPG signal estimation.
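
The two-stage idea, agreeing with an augmented view while staying distinct from the interference, can be sketched as a toy objective on power spectra. The summary does not give the paper's actual loss functions, so everything here is an assumption for illustration: the names `normalized_psd` and `de_interference_objective`, the squared-distance terms, the pulse band, and the phase-shifted sine used as a stand-in for an augmentation. NumPy is used only to show the quantities; real training would need an autodiff framework.

```python
import numpy as np

def normalized_psd(signal, fps, low_hz=0.7, high_hz=3.0):
    """Power spectrum of `signal` restricted to an assumed pulse band, summing to 1."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    power = power[band]
    return power / (power.sum() + 1e-12)

def de_interference_objective(rppg, rppg_aug, interference, fps):
    """Hypothetical objective: lower is better.

    pull: the rPPG spectrum should match the spectrum from the augmented view.
    push: the rPPG spectrum should differ from the interference spectrum.
    """
    p = normalized_psd(rppg, fps)
    p_aug = normalized_psd(rppg_aug, fps)
    q = normalized_psd(interference, fps)
    pull = np.mean((p - p_aug) ** 2)
    push = np.mean((p - q) ** 2)
    return pull - push

# Toy signals: a 1.2 Hz pulse, a phase-shifted copy as the "augmented" view,
# and a 2.0 Hz periodic disturbance standing in for interference.
fps = 30
t = np.arange(300) / fps
pulse = np.sin(2 * np.pi * 1.2 * t)
pulse_aug = np.sin(2 * np.pi * 1.2 * t + 0.5)
interference = np.sin(2 * np.pi * 2.0 * t)

clean_loss = de_interference_objective(pulse, pulse_aug, interference, fps)
contaminated_loss = de_interference_objective(
    pulse + interference, pulse_aug, interference, fps
)
print(f"clean: {clean_loss:.4f}, contaminated: {contaminated_loss:.4f}")
```

In this toy setup the estimate contaminated by the 2.0 Hz component scores worse than the clean one, which is the behavior the de-interfering stage is described as encouraging.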

DD-rPPGNet: De-interfering and Descriptive Feature Learning for Unsupervised rPPG Estimation
