3D RoI-aware U-Net for Accurate and Efficient Colorectal Tumor Segmentation

Yi-Jie Huang,Qi Dou,Zi-Xian Wang,Li-Zhi Liu,Ying Jin,Chao-Feng Li,Lisheng Wang,Hao Chen,Rui-Hua Xu
DOI: https://doi.org/10.1109/TCYB.2020.2980145
2019-02-15
Abstract:Segmentation of colorectal cancerous regions from 3D Magnetic Resonance (MR) images is a crucial procedure for radiotherapy which conventionally requires accurate delineation of tumour boundaries at an expense of labor, time and reproducibility. While deep learning based methods serve good baselines in 3D image segmentation tasks, small applicable patch size limits effective receptive field and degrades segmentation performance. In addition, Regions of interest (RoIs) localization from large whole volume 3D images serves as a preceding operation that brings about multiple benefits in terms of speed, target completeness, reduction of false positives. Distinct from sliding window or non-joint localization-segmentation based models, we propose a novel multitask framework referred to as 3D RoI-aware U-Net (3D RU-Net), for RoI localization and in-region segmentation where the two tasks share one backbone encoder network. With the region proposals from the encoder, we crop multi-level RoI in-region features from the encoder to form a GPU memory-efficient decoder for detailpreserving segmentation and therefore enlarged applicable volume size and effective receptive field. To effectively train the model, we designed a Dice formulated loss function for the global-to-local multi-task learning procedure. Based on the efficiency gains, we went on to ensemble models with different receptive fields to achieve even higher performance costing minor extra computational expensiveness. Extensive experiments were conducted on 64 cancerous cases with a four-fold cross-validation, and the results showed significant superiority in terms of accuracy and efficiency over conventional frameworks. In conclusion, the proposed method has a huge potential for extension to other 3D object segmentation tasks from medical images due to its inherent generalizability. The code for the proposed method is publicly available.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?