ResMFuse-Net: Residual-based multilevel fused network with spatial–temporal features for hand hygiene monitoring

Sohaib Asif, Xinyi Xu, Ming Zhao, Xuehan Chen, Fengxiao Tang, Yusen Zhu
DOI: https://doi.org/10.1007/s10489-024-05305-4
IF: 5.3
2024-03-03
Applied Intelligence
Abstract: The automation of hand hygiene monitoring is critical in healthcare for ensuring clean hands and preventing the spread of infectious disease. While advancements have been made, existing methods have limitations in accurately detecting and classifying handwashing actions. This paper addresses these limitations and introduces the Residual-Based Multilevel Fused Network (ResMFuse-Net) as a novel approach to automate the quality assurance of hand hygiene procedures. Our model integrates advanced techniques, including feature fusion, model compression, a feature fusion block (FFB), and a modified separable residual block (SE-ResB). The proposed model fuses two networks into one trainable feature extraction pipeline and applies model compression to retain the core blocks that are crucial for propagating strong and robust features while conserving a significant fraction of the computing resources. Additionally, we introduce an FFB that includes ConvLSTM and alpha dropout to learn spatial dependencies, establish correlations between frames in a video, and mitigate overfitting. This paper also introduces an SE-ResB, which is a customized residual component composed of separable convolutions and LeakyReLU activation. The SE-ResB is incorporated to handle the fused features and generate a more diverse set of features, leading to considerable performance enhancements. This study also includes an ablation analysis that highlights the importance of each component. The proposed ResMFuse-Net is evaluated on two datasets: a newly created handwashing dataset (451 videos) and a publicly available dataset (656 videos). Achieving a recognition accuracy of 97.61% on the handwashing dataset and 98.69% on the other dataset, the ResMFuse-Net outperforms previous methods with fewer parameters and FLOPs, demonstrating its efficiency and cost-effectiveness.
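The abstract attributes part of the model's small footprint to separable convolutions in the SE-ResB. The sketch below is an illustration only, not the authors' implementation. It counts the trainable weights of a standard 2D convolution and a depthwise separable one, to show why the latter can be much smaller. The channel sizes and kernel size are hypothetical values chosen for the arithmetic and are not taken from the paper. The function names are likewise illustrative.

```python
def conv2d_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Weights in a standard k x k convolution mapping c_in -> c_out channels."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv2d_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Weights in a depthwise separable convolution (depth multiplier of 1).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    A single bias vector of size c_out is assumed, applied after the pointwise step.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + (c_out if bias else 0)


# Hypothetical layer shape (not from the paper): 64 -> 128 channels, 3 x 3 kernel.
standard = conv2d_params(64, 128, 3)
separable = separable_conv2d_params(64, 128, 3)
print(standard, separable)  # 73856 8896
```

For this assumed layer shape the separable variant needs roughly one eighth of the weights. The saving in an actual network depends on its real channel widths and kernel sizes.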
computer science, artificial intelligence