ACTNet: Attention based CNN and Transformer network for respiratory rate estimation

Huahua Chen, Xiang Zhang, Zongheng Guo, Na Ying, Meng Yang, Chunsheng Guo
DOI: https://doi.org/10.1016/j.bspc.2024.106497
IF: 5.1
2024-06-02
Biomedical Signal Processing and Control
Abstract: Estimating respiratory rate (RR) is a crucial step in monitoring a patient's health. Non-contact physiological signal measurement based on face videos has great potential for various applications. However, current RR estimation methods still struggle with remote RR estimation because the color change of facial skin is very subtle and the quasi-periodicity of the remote photoplethysmography (rPPG) signal requires long-range temporal modeling. In this paper, we propose ACTNet, a dual-branch network consisting of an attention-based convolutional neural network (CNN) and a Transformer, which can fully utilize local features and global information. In ACTNet, the CNN branch captures subtle color changes between facial image frames and applies local attention to further highlight local features, while the Transformer branch models long-term temporal relationships across the video frame sequence to obtain global information. A feature coupling unit then fuses the local features and global information within the network. Comprehensive experimental results on the COHFACE and DEAP datasets show that our method achieves state-of-the-art performance.
engineering, biomedical
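The dual-branch idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no layer definitions, so every function name, shape, and weight below is a hypothetical stand-in. The local branch is approximated by a temporal convolution over frame-to-frame differences with a sigmoid attention gate, the global branch by a single-head self-attention without learned projections, and the feature coupling unit by a simple cross-branch linear exchange.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_branch(frames, kernel):
    """Assumed stand-in for the CNN branch.

    frames: (T, D) array of per-frame feature vectors.
    kernel: odd-length 1D temporal kernel shared across feature channels.
    """
    # Frame-to-frame differences emphasize subtle changes over time.
    diff = np.diff(frames, axis=0, prepend=frames[:1])
    pad = len(kernel) // 2
    padded = np.pad(diff, ((pad, pad), (0, 0)), mode="edge")
    conv = np.stack([
        (padded[t:t + len(kernel)] * kernel[:, None]).sum(axis=0)
        for t in range(frames.shape[0])
    ])
    # Sigmoid gate as a simple form of local attention.
    gate = 1.0 / (1.0 + np.exp(-conv))
    return conv * gate


def global_branch(frames):
    """Assumed stand-in for the Transformer branch: every frame attends to all frames."""
    d = frames.shape[1]
    scores = frames @ frames.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ frames


def couple(local, global_, w_l2g, w_g2l):
    """Hypothetical coupling: each branch adds a linear projection of the other."""
    fused_local = local + global_ @ w_g2l
    fused_global = global_ + local @ w_l2g
    return fused_local, fused_global


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(16, 4))       # 16 frames, 4 features each (toy data)
    kernel = np.array([0.25, 0.5, 0.25])    # toy smoothing kernel
    w = 0.1 * np.eye(4)                     # toy coupling weights
    fused_local, fused_global = couple(
        local_branch(frames, kernel), global_branch(frames), w, w
    )
    print(fused_local.shape, fused_global.shape)
```

In a trained model the kernel, attention projections, and coupling weights would be learned, and a regression head on the fused features would output the RR estimate; the sketch only shows how local and global features could flow between two branches.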