Heart Rate Estimation From Facial Videos Using a Spatiotemporal Representation With Convolutional Neural Networks

Rencheng Song, Senle Zhang, Chang Li, Yunfei Zhang, Juan Cheng, Xun Chen
DOI: https://doi.org/10.1109/TIM.2020.2984168
IF: 5.6
2020-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Remote photoplethysmography (rPPG) is a noncontact technique for measuring heart rate (HR) from facial videos. As the demand for long-term health monitoring grows, rPPG has attracted much attention from researchers. However, the performance of conventional rPPG methods degrades easily under noise interference. Recently, several deep learning-based rPPG methods have been introduced and have shown good robustness to noise. In this article, we propose a new rPPG method that uses convolutional neural networks (CNNs) to map a spatiotemporal HR feature image to its corresponding HR value. The feature image is constructed in a time-delayed way from noise-contaminated pulse signals extracted by existing rPPG methods. The CNN model is trained using transfer learning: images built from synthetic rPPG signals are first used to train the model, which provides initial parameters for the model used on real data. The synthetic rPPG signals are interpolated from blood volume pulses or electrocardiograms through a modified Akima cubic Hermite interpolation. The proposed method is tested in both within-database and cross-database configurations on public databases. The results demonstrate that our method achieves the best overall performance compared with several typical rPPG methods. In cross-database testing on the MAHNOB-HCI data set, the mean absolute error is 5.98 beats per minute and the mean error rate percentage is 7.97%. In addition, some key factors that affect the performance of our method are discussed, which indicates potential directions for further improvement.
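The abstract does not give the exact construction of the feature image or the formulas for its two error metrics, so the following is only a minimal sketch under stated assumptions. It assumes a generic time-delay arrangement (each row is the pulse signal shifted by a fixed delay) and the definitions of mean absolute error and mean error rate percentage commonly used in rPPG evaluation. The function names `time_delay_image`, `mean_absolute_error`, and `mean_error_rate_percent`, and the parameters `width` and `delay`, are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def time_delay_image(signal: Sequence[float], width: int, delay: int = 1) -> List[List[float]]:
    """Stack delayed windows of a 1-D pulse signal into a 2-D array.

    Row i holds signal[i * delay : i * delay + width]. This is a generic
    time-delay arrangement assumed for illustration, not the paper's
    confirmed construction.
    """
    if width <= 0 or delay <= 0:
        raise ValueError("width and delay must be positive")
    rows = []
    start = 0
    # Keep only windows that fit entirely inside the signal.
    while start + width <= len(signal):
        rows.append(list(signal[start:start + width]))
        start += delay
    return rows


def mean_absolute_error(estimated: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |HR_est - HR_ref| in beats per minute (assumed definition)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(e - r) for e, r in zip(estimated, reference)]
    return sum(errors) / len(errors)


def mean_error_rate_percent(estimated: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |HR_est - HR_ref| / HR_ref, in percent (assumed definition)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    rates = [abs(e - r) / r for e, r in zip(estimated, reference)]
    return 100.0 * sum(rates) / len(rates)
```

Under these assumptions, `time_delay_image(pulse, width=4, delay=1)` turns a short pulse trace into overlapping rows that could serve as an image-like CNN input, and the two metric functions compare predicted HR values against reference HR values in the way the reported figures suggest.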