FocalUNETR: A Focal Transformer for Boundary-aware Segmentation of CT Images

Chengyin Li, Yao Qiang, Rafi Ibn Sultan, Hassan Bagher-Ebadian, Prashant Khanduri, Indrin J. Chetty, Dongxiao Zhu
2023-07-19
Abstract: Computed Tomography (CT) based precise prostate segmentation for treatment planning is challenging due to (1) the unclear boundary of the prostate derived from CT's poor soft tissue contrast and (2) the limitation of convolutional neural network-based models in capturing long-range global context. Here we propose a novel focal transformer-based image segmentation architecture to effectively and efficiently extract local visual features and global context from CT images. Additionally, we design an auxiliary boundary-induced label regression task coupled with the main prostate segmentation task to address the unclear boundary issue in CT images. We demonstrate that this design significantly improves the quality of the CT-based prostate segmentation task over other competing methods, resulting in substantially improved performance, i.e., higher Dice Similarity Coefficient, lower Hausdorff Distance, and Average Symmetric Surface Distance, on both private and public CT image datasets. Our code is available at https://github.com/ChengyinLee/FocalUNETR.git.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two main issues in CT-based prostate segmentation:

1. **Unclear prostate boundaries**: CT's poor soft tissue contrast blurs the prostate boundary, which makes manual segmentation time-consuming and susceptible to operator variability.
2. **Limited ability of convolutional neural network (CNN) models to capture global context**: Existing CNN-based models struggle to capture long-range relationships and global context.

To tackle these problems, the authors propose FocalUNETR, a new image segmentation architecture based on the Focal Transformer. It uses focal self-attention to extract both local visual features and global context efficiently. To address the unclear boundaries, the authors add an auxiliary boundary-aware regression task (the abstract's boundary-induced label regression task), coupled with the main segmentation task. In experiments on both private and public CT datasets, FocalUNETR outperforms competing methods, achieving a higher Dice Similarity Coefficient and lower Hausdorff Distance and Average Symmetric Surface Distance.
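To make the evaluation metrics concrete, here is a minimal sketch of the Dice Similarity Coefficient and the Hausdorff Distance on small 2D binary masks. It is not the authors' evaluation code. The function names (`foreground`, `dice`, `directed_hausdorff`, `hausdorff`) and the toy masks are hypothetical, distances are in pixel units, and the Hausdorff Distance is taken over all foreground pixels rather than extracted surfaces, which is a simplification.

```python
import math


def foreground(mask):
    """Return the set of (row, col) coordinates where the binary mask is nonzero."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice(pred, truth):
    """Dice Similarity Coefficient: 2|P ∩ T| / (|P| + |T|). Higher is better."""
    p, t = foreground(pred), foreground(truth)
    if not p and not t:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(p & t) / (len(p) + len(t))


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(x, y) for y in b) for x in a)


def hausdorff(pred, truth):
    """Symmetric Hausdorff Distance in pixels. Lower is better.

    Raises ValueError if either mask has no foreground pixels.
    """
    p, t = foreground(pred), foreground(truth)
    return max(directed_hausdorff(p, t), directed_hausdorff(t, p))


# Toy example (hypothetical data): the prediction misses the bottom row.
pred = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 1, 1],
]

print(dice(pred, truth))       # 2 * 4 / (4 + 6) = 0.8
print(hausdorff(pred, truth))  # missed pixels are 1 pixel from the prediction: 1.0
```

The two metrics are complementary: Dice measures region overlap, while the Hausdorff Distance reports the worst-case boundary error, which is why a boundary-focused method is expected to improve the latter in particular. The Average Symmetric Surface Distance is omitted here because it requires extracting surface voxels first.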