A novel one-stage framework for visual pulse rate estimation using deep neural networks
Bin Huang, Chun-Liang Lin, Weihai Chen, Chia-Feng Juang, Xingming Wu
DOI: https://doi.org/10.1016/j.bspc.2020.102387
IF: 5.1
2021-04-01
Biomedical Signal Processing and Control
Abstract:<p>Visual pulse rate estimation refers to extracting the pulse rate (also called heart rate) from a facial video. Following studies on extracting photoplethysmography (PPG) signals from facial video, this non-contact measurement method has attracted great interest among researchers over the past few years. In this study, a novel one-stage spatio-temporal framework, namely <strong>PRnet</strong>, is proposed to estimate the pulse rate from a stationary facial video. First, visual pulse rate estimation is defined as a regression task based on deep neural networks, in which a video is mapped to a pulse rate value. Then, 3D convolutional neural network (Conv3D) and long short-term memory (LSTM) modules are used to extract the spatial and latent temporal information hidden in a video. Subsequently, one fully connected layer is applied as the last layer of <strong>PRnet</strong> to estimate the pulse rate directly. Owing to this framework design, the proposed method achieves competitive performance, especially in terms of processing latency, since it does not rely on power spectral density (PSD) or traditional fast Fourier transform (FFT) algorithms. With our method, only 60 frames of video (2 s) are required for robust prediction of the pulse rate, whereas 6–30 s of video are typically required by other methods. Finally, a novel visual pulse rate estimation database, which covers the pulse rate range at various times of day, is collected to evaluate the proposed framework. The results of extensive experiments demonstrate that <strong>PRnet</strong> performs competitively when compared with state-of-the-art methods.</p>
engineering, biomedical
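
The abstract contrasts a 2 s input clip with the 6–30 s windows that spectral methods typically need. A short calculation may help show why window length matters for FFT-based estimation: an unpadded FFT over T seconds has frequency bins spaced 1/T Hz apart, which is 60/T beats per minute. The sketch below is only an illustration of that general resolution limit, not a description of any particular competing method, and the function name `fft_bin_spacing_bpm` is hypothetical.

```python
def fft_bin_spacing_bpm(window_seconds: float) -> float:
    """Spacing between adjacent FFT frequency bins, in beats per minute.

    An unpadded FFT over a window of T seconds has bins 1/T Hz apart,
    and 1 Hz corresponds to 60 beats per minute.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return 60.0 / window_seconds


# 60 frames covering 2 s (figures from the abstract) imply 30 frames per second.
frames, clip_seconds = 60, 2.0
frame_rate = frames / clip_seconds

# Window lengths quoted in the abstract: 2 s for PRnet, 6-30 s for other methods.
for seconds in (2.0, 6.0, 30.0):
    spacing = fft_bin_spacing_bpm(seconds)
    print(f"{seconds:4.0f} s window -> {spacing:4.1f} bpm per FFT bin")
```

Under these assumptions, a 2 s window gives bins 30 bpm apart, too coarse to separate typical pulse rates by peak picking alone, which is consistent with the abstract's point that a direct regression avoids the long windows spectral methods rely on.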