A real-time deep learning-based system for colorectal polyp size estimation by white-light endoscopy: development and multicenter prospective validation

Jing Wang, Ying Li, Boru Chen, Du Cheng, Fei Liao, Tao Tan, Qinghong Xu, Zhifeng Liu, Yuan Huang, Ci Zhu, Wenbing Cao, Liwen Yao, Zhifeng Wu, Lianlian Wu, Chenxia Zhang, Bing Xiao, Ming Xu, Jun Liu, Shuyu Li, Honggang Yu
DOI: https://doi.org/10.1055/a-2189-7036
Endoscopy
Abstract:

Background: The choice of polypectomy device and the surveillance interval for colorectal polyps are determined primarily by polyp size. We developed a deep learning-based system (ENDOANGEL-CPS) to estimate colorectal polyp size in real time.

Methods: ENDOANGEL-CPS calculates polyp size by estimating the distance from the endoscope lens to the polyp and combining it with the parameters of the lens. The depth estimator network was developed on 7297 images from five virtually produced colon videos and tested on 730 images from seven virtual colon videos. The performance of the system was first evaluated in nine videos of a simulated colon with polyps attached, then tested in 157 real-world prospective videos from three hospitals, and its results were compared with those of nine endoscopists on 69 videos. Inappropriate surveillance recommendations caused by incorrect estimation of polyp size were also analyzed.

Results: The relative error of depth estimation was 11.3% (SD 6.0%) in successive virtual colon images. The concordance correlation coefficients (CCCs) between system estimation and ground truth were 0.89 and 0.93 in images of a simulated colon and multicenter videos of 157 polyps, respectively. The mean CCC of ENDOANGEL-CPS exceeded that of the endoscopists (0.89 vs. 0.41 [SD 0.29]; P<0.001). The relative accuracy of ENDOANGEL-CPS was significantly higher than that of endoscopists (89.9% vs. 54.7%; P<0.001). The rate of inappropriate surveillance recommendations was also lower for the system than for endoscopists (1.5% vs. 16.6%; P<0.001).

Conclusions: ENDOANGEL-CPS could potentially improve the accuracy of colorectal polyp size measurements and size-based surveillance intervals.
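The abstract states that size is derived from the estimated lens-to-polyp distance and the lens parameters, and that agreement is reported as a CCC, but it gives no formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes an ideal pinhole camera (no wide-angle distortion, which real endoscope lenses do have) and uses Lin's concordance correlation coefficient with population variances. The function names `polyp_size_mm` and `concordance_cc`, and the parameter `focal_length_px`, are hypothetical.

```python
def polyp_size_mm(pixel_extent, depth_mm, focal_length_px):
    """Estimate physical size under an assumed pinhole model.

    size = pixel_extent * depth / focal_length, with the focal length
    expressed in pixels. Lens distortion is ignored (an assumption).
    """
    if depth_mm <= 0 or focal_length_px <= 0:
        raise ValueError("depth and focal length must be positive")
    return pixel_extent * depth_mm / focal_length_px


def concordance_cc(estimates, truths):
    """Lin's concordance correlation coefficient between two sequences.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using population (1/n) variance and covariance.
    """
    if len(estimates) != len(truths) or len(estimates) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    n = len(estimates)
    mean_x = sum(estimates) / n
    mean_y = sum(truths) / n
    var_x = sum((x - mean_x) ** 2 for x in estimates) / n
    var_y = sum((y - mean_y) ** 2 for y in truths) / n
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(estimates, truths)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov_xy / denominator


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(polyp_size_mm(pixel_extent=100, depth_mm=20, focal_length_px=400))
    print(concordance_cc([4.8, 6.1, 9.7], [5.0, 6.0, 10.0]))
```

Unlike Pearson correlation, the CCC penalizes systematic offset as well as scatter, so estimates that are consistently 1 mm too large score below 1 even when they track the ground truth perfectly.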