CPR-CLIP: Multimodal Pre-Training for Composite Error Recognition in CPR Training

Shunli Wang, Dingkang Yang, Peng Zhai, Lihua Zhang
DOI: https://doi.org/10.1109/lsp.2023.3346207
2024-01-01
IEEE Signal Processing Letters
Abstract: The high cost of the medical skill training paradigm hinders the development of medical education, a problem that has attracted widespread attention in the intelligent signal processing community. To address the issue of composite error action recognition in Cardiopulmonary Resuscitation (CPR) training, this letter proposes a multimodal pre-training framework named CPR-CLIP based on prompt engineering. Specifically, we design three prompts to fuse multiple errors naturally on the semantic level and then align linguistic and visual features via the contrastive pre-training loss. Extensive experiments verify the effectiveness of CPR-CLIP. Ultimately, CPR-CLIP is encapsulated into an electronic assistant, and four doctors are recruited for evaluation. A nearly fourfold efficiency improvement is observed in comparative experiments, which demonstrates the practicality of the system. We hope this work brings new insights to both the intelligent medical skill training and signal processing communities.
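As a rough illustration of what aligning linguistic and visual features with a contrastive loss can look like, the sketch below implements a generic CLIP-style symmetric loss in plain Python. It is not taken from the letter: the function names, the temperature of 0.07, the toy 2-D embeddings, and the assumption that each clip is paired with exactly one fused prompt (with every other prompt in the batch treated as a negative) are all illustrative choices, and the loss actually used in CPR-CLIP may differ.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the target of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def clip_style_loss(visual, text, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Assumption (not from the letter): visual[i] and text[i] form the
    only positive pair; all other combinations are negatives.
    """
    v = [l2_normalize(x) for x in visual]
    t = [l2_normalize(x) for x in text]
    # logits[i][j]: temperature-scaled cosine similarity of clip i and prompt j
    logits = [[sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t]
              for vi in v]
    logits_t = [list(col) for col in zip(*logits)]  # prompt-to-clip direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy batch: hypothetical 2-D embeddings for two clips and their prompts.
aligned = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy batch the loss is close to zero when each clip embedding matches its own prompt embedding and large when the pairing is swapped, which is the behavior that pulls matched clip and prompt features together during pre-training.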