URTNet: an unstructured feature fusion network for real-time detection of endoscopic surgical instruments

Guo, Jing; Lou, Haifang
DOI: https://doi.org/10.1007/s11554-024-01567-w
IF: 2.293
2024-11-05
Journal of Real-Time Image Processing
Abstract: Minimally invasive surgery (MIS) is increasingly popular because it involves smaller incisions, less pain, and faster recovery. Despite these advantages, limited visibility and reduced tactile feedback can lead to instrument and organ damage, which makes precise instrument detection and identification necessary. Current methods have difficulty detecting multi-scale targets and are often disrupted by blurring, occlusion, and varying lighting conditions during surgery. To address these challenges, this paper introduces URTNet, a novel unstructured feature fusion network designed for real-time detection of multi-scale surgical instruments in complex environments. First, the paper proposes a Stair Aggregation Network (SAN) that efficiently merges multi-scale information, minimizing detail loss during feature fusion and improving detection of blurred and occluded targets. Second, it presents a Multi-scale Feature Weighted Fusion (MFWF) approach that handles the large scale variations among detection objects and reconstructs the detection layers according to target sizes within endoscopic views. URTNet is validated on the public laparoscopic dataset m2cai16-tool and on another dataset from Sun Yat-sen University Cancer Center, where it achieved average precision scores of 93.3% and 97.9%, respectively, surpassing other advanced methods.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology