Self-supervised monocular depth estimation for gastrointestinal endoscopy

Yuying Liu, Siyang Zuo
DOI: https://doi.org/10.1016/j.cmpb.2023.107619
Abstract: Background and objective: Gastrointestinal (GI) endoscopy is a promising tool for GI cancer screening. However, the limited field of view and the uneven skill of endoscopists make it difficult to accurately identify polyps and follow up on precancerous lesions under endoscopy. Estimating depth from GI endoscopic sequences is essential for a range of AI-assisted surgical techniques. Nonetheless, depth estimation for GI endoscopy is challenging because of the particular characteristics of the environment and the limited availability of datasets. In this paper, we propose a self-supervised monocular depth estimation method for GI endoscopy. Methods: A depth estimation network and a camera ego-motion estimation network are first constructed to obtain the depth and pose information of the sequence, respectively. The model is then trained in a self-supervised manner by computing a multi-scale structural similarity with L1 norm (MS-SSIM+L1) loss between the target frame and the reconstructed image as part of the training loss. The MS-SSIM+L1 loss function is well suited to preserving high-frequency information and can keep brightness and color unchanged. Our model consists of a U-shaped convolutional network with a dual-attention mechanism, which helps capture multi-scale contextual information and greatly improves the accuracy of depth estimation. We evaluated our method qualitatively and quantitatively against different state-of-the-art methods. Results and conclusions: The experimental results show that our method generalizes well, achieving lower error metrics and higher accuracy metrics on both the UCL dataset and the Endoslam dataset. The proposed method has also been validated with clinical GI endoscopy, demonstrating the potential clinical value of the model.
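As an illustration of the MS-SSIM+L1 loss named in the Methods, the following is a minimal NumPy sketch and not the authors' implementation. It compares a target frame with a reconstructed frame, both assumed to be single-channel arrays with values in [0, 1]. The uniform 7x7 window, the three scales, the unit exponents on each scale, and the mixing weight `alpha` are all assumptions chosen for brevity; the abstract does not specify any of them, and the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean(img, k):
    """Mean over every k x k window (valid positions only)."""
    return sliding_window_view(img, (k, k)).mean(axis=(-2, -1))


def ssim_terms(x, y, k=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Return (mean SSIM, mean contrast-structure term) for images in [0, 1]."""
    mu_x, mu_y = local_mean(x, k), local_mean(y, k)
    var_x = local_mean(x * x, k) - mu_x ** 2
    var_y = local_mean(y * y, k) - mu_y ** 2
    cov = local_mean(x * y, k) - mu_x * mu_y
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    cs = (2 * cov + c2) / (var_x + var_y + c2)
    return (lum * cs).mean(), cs.mean()


def downsample2(img):
    """Halve the resolution with 2 x 2 average pooling."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def ms_ssim(x, y, scales=3):
    """Simplified MS-SSIM: contrast-structure at fine scales, full SSIM at the coarsest."""
    value = 1.0
    for s in range(scales):
        ssim, cs = ssim_terms(x, y)
        if s == scales - 1:
            value *= max(ssim, 0.0)  # coarsest scale includes luminance
        else:
            value *= max(cs, 0.0)    # clip so the product stays non-negative
            x, y = downsample2(x), downsample2(y)
    return value


def ms_ssim_l1_loss(target, recon, alpha=0.84):
    """Weighted mix of (1 - MS-SSIM) and mean absolute error; alpha is an assumed weight."""
    l1 = np.abs(target - recon).mean()
    return alpha * (1.0 - ms_ssim(target, recon)) + (1.0 - alpha) * l1


rng = np.random.default_rng(0)
target = rng.random((64, 64))
recon = np.clip(target + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)
print("loss for a perfect reconstruction:", ms_ssim_l1_loss(target, target))
print("loss for a noisy reconstruction:  ", ms_ssim_l1_loss(target, recon))
```

In a self-supervised pipeline of the kind described, `recon` would be the frame synthesized by warping a neighboring frame with the predicted depth and camera pose; here it is replaced by a noisy copy of the target purely for demonstration.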