Motion-Guided Dual-Camera Tracker for Endoscope Tracking and Motion Analysis in a Mechanical Gastric Simulator

Yuelin Zhang, Kim Yan, Chun Ping Lam, Chengyu Fang, Wenxuan Xie, Yufu Qiu, Raymond Shing-Yan Tang, Shing Shin Cheng
2024-09-17
Abstract: Flexible endoscope motion tracking and analysis in mechanical simulators have proven useful for endoscopy training. Common motion tracking methods based on electromagnetic trackers are, however, limited by their high cost and material susceptibility. In this work, a motion-guided dual-camera vision tracker is proposed to provide robust and accurate tracking of the endoscope tip's 3D position. The tracker addresses several unique challenges of tracking a flexible endoscope tip inside a dynamic, life-sized mechanical simulator. To address appearance variation and maintain dual-camera tracking consistency, the cross-camera mutual template strategy (CMT) is proposed, which introduces dynamic transient mutual templates. To alleviate large occlusion and light-induced distortion, the Mamba-based motion-guided prediction head (MMH) is presented to aggregate historical motion with visual tracking. The proposed tracker outperforms state-of-the-art vision trackers, achieving 42% and 72% improvements over the second-best method in average error and maximum error, respectively. Further motion analysis involving novice and expert endoscopists also shows that the tip 3D motion provided by the proposed tracker enables more reliable motion analysis and more substantial differentiation between expertise levels than other trackers.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses motion tracking and analysis of flexible endoscopes in a mechanical gastric simulator. Existing methods based on electromagnetic trackers (EMT) are limited by high cost and susceptibility to material interference. Vision-based tracking is low-cost and needs no complex setup, but it has not previously been applied to tracking flexible endoscope operation in mechanical simulators. The paper therefore proposes a motion-guided dual-camera vision tracker that provides robust and accurate tracking of the 3D position of the endoscope tip. The tracker is designed for the specific challenges of tracking a flexible endoscope tip inside a dynamic, life-sized mechanical gastric simulator: appearance variation, large occlusion, and light-induced image distortion.

To address these challenges, the paper proposes two key components:

1. **Cross-Camera Mutual Template Strategy (CMT)**: By introducing dynamic transient mutual templates, CMT exploits the mutual information shared by the two coupled cameras in synchronized frames. This makes it easier to follow changes in the target's appearance and improves the consistency of dual-camera tracking.
2. **Mamba-based Motion-Guided Prediction Head (MMH)**: MMH aggregates historical motion information with visual tracking, which makes tracking more robust, especially when the target disappears from view or the image is severely distorted.

Experimental results show that the tracker improves on the second-best method by 42% in average error and 72% in maximum error. In motion analysis involving novice and expert endoscopists, it also yields more reliable results and clearer differentiation between expertise levels than other trackers. The authors also plan to release the code and dataset once the paper is accepted, so that other researchers can reproduce and extend this work.
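The summary does not say how the tracker turns the two camera views into a 3D tip position. As a rough illustration of the general idea only, the sketch below uses standard linear (DLT) triangulation, which is an assumption here and not necessarily the authors' method. The projection matrices `P1` and `P2`, the intrinsics `K`, the 0.1-unit camera offset, and the function names are all hypothetical values chosen for the example.

```python
import numpy as np


def triangulate_point(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one 3D point from two calibrated views.

    P1, P2 : 3x4 camera projection matrices (assumed known from calibration).
    uv1, uv2 : pixel coordinates (u, v) of the same point in each view.
    """
    u1, v1 = uv1
    u2, v2 = uv2
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The least-squares solution is the right singular vector associated
    # with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3D point X into pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Hypothetical calibration: identical intrinsics, second camera offset along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# Synthetic "endoscope tip" position, projected into both views and recovered.
tip_true = np.array([0.02, -0.01, 0.5])
tip_est = triangulate_point(P1, P2, project(P1, tip_true), project(P2, tip_true))
```

In this noise-free setup `tip_est` should match `tip_true` up to numerical precision. With real 2D tracker outputs the two rays rarely intersect exactly, so the recovered point is a least-squares estimate whose quality depends on how consistent the per-camera detections are.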