Depth Estimation from Single-shot Monocular Endoscope Image Using Image Domain Adaptation And Edge-Aware Depth Estimation

Masahiro Oda, Hayato Itoh, Kiyohito Tanaka, Hirotsugu Takabatake, Masaki Mori, Hiroshi Natori, Kensaku Mori
DOI: https://doi.org/10.1080/21681163.2021.2012835
2022-01-12
Abstract: We propose a depth estimation method from a single-shot monocular endoscopic image using Lambertian surface translation by domain adaptation and depth estimation using multi-scale edge loss. We employ a two-step estimation process including Lambertian surface translation from unpaired data and depth estimation. The texture and specular reflection on the surface of an organ reduce the accuracy of depth estimations. We apply Lambertian surface translation to an endoscopic image to remove these textures and reflections. Then, we estimate the depth by using a fully convolutional network (FCN). During the training of the FCN, improving the object edge similarity between an estimated image and a ground truth depth image is important for getting better results. We introduce a multi-scale edge loss function to improve the accuracy of depth estimation. We quantitatively evaluated the proposed method using real colonoscopic images. The estimated depth values were proportional to the real depth values. Furthermore, we applied the estimated depth images to automated anatomical location identification of colonoscopic images using a convolutional neural network. The identification accuracy of the network improved from 69.2% to 74.1% by using the estimated depth images.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper attempts to address is the estimation of depth information from single-shot monocular endoscopic images. Specifically, the paper focuses on the following points:

1. **Automated Understanding and Diagnostic Assistance**: Automated understanding of endoscopic images is crucial for diagnostic and therapeutic assistance. Beyond the images themselves, depth information can make that understanding more accurate, for example when measuring the size of lesions.
2. **Limitations of Existing Methods**: Existing depth estimation methods typically rely on stereo cameras or time-series images. However, many endoscopic imaging systems do not support stereo endoscopy or video capture. Moreover, retrospective studies of endoscopic image analysis need to automatically classify or identify a large number of stored single-shot monocular endoscopic images.
3. **Technical Challenges**: The main difficulty in estimating depth from single-shot monocular endoscopic images lies in the inability to obtain real endoscopic images paired with their corresponding depth images. Due to size constraints, depth sensors cannot be installed on endoscopes.

To address these issues, the paper proposes a new depth estimation method that uses domain adaptation and a multi-scale edge loss function to improve the accuracy of depth estimation. The specific steps include:

- **Lambertian Surface Translation**: Using domain adaptation trained on unpaired data to convert real endoscopic images into Lambertian surface images, which removes textures and specular reflections from organ surfaces.
- **Depth Estimation Network**: Using a fully convolutional network (FCN) for depth estimation, trained with a multi-scale edge loss function that encourages object edges in the estimated depth image to match those in the ground truth depth image.
- **Quantitative Evaluation**: Evaluating the method quantitatively on real colonoscopic images. The results show that the estimated depth values are proportional to the actual depth values, and that using the estimated depth images improves the accuracy of a convolutional neural network (CNN) for automated anatomical location identification from 69.2% to 74.1%.

In summary, this paper aims to make single-shot monocular endoscopic images more useful for diagnostic and therapeutic assistance by improving depth estimation from them.
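
The idea of a multi-scale edge loss can be illustrated with a minimal NumPy sketch. This is not the authors' formulation, which the abstract does not spell out: the Sobel edge operator, the 2x2 average pooling, the L1 difference, and the names `sobel_edges`, `downsample2`, `multi_scale_edge_loss`, and `num_scales` are all assumptions chosen for illustration.

```python
import numpy as np


def sobel_edges(img):
    """Gradient magnitude of a 2-D array using 3x3 Sobel kernels (edge-padded)."""
    p = np.pad(img, 1, mode="edge")
    # Horizontal response: right column minus left column, centre row weighted 2.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical response: bottom row minus top row, centre column weighted 2.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)


def downsample2(img):
    """Halve the resolution by 2x2 average pooling (an odd trailing row/column is dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def multi_scale_edge_loss(pred, target, num_scales=3):
    """Mean L1 distance between edge maps, averaged over several resolutions.

    Assumed form for illustration only; the paper's exact loss may differ.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    loss = 0.0
    for _ in range(num_scales):
        loss += np.mean(np.abs(sobel_edges(pred) - sobel_edges(target)))
        pred, target = downsample2(pred), downsample2(target)
    return loss / num_scales


# Toy depth map with one vertical depth discontinuity.
target = np.zeros((8, 8))
target[:, 4:] = 10.0
flat_prediction = np.zeros((8, 8))
print(multi_scale_edge_loss(flat_prediction, target))
```

In this sketch the loss depends only on edges, so a prediction that differs from the ground truth by a constant offset incurs no penalty, while a prediction that misses a depth discontinuity does. In training, such a term would normally be combined with a per-pixel depth loss and written in a differentiable framework rather than plain NumPy.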