Dual convolution-transformer UNet (DCT-UNet) for organs at risk and clinical target volume segmentation in MRI for cervical cancer brachytherapy
Gayoung Kim, Akila N Viswanathan, Rohini Bhatia, Yosef Landman, Michael Roumeliotis, Beth Erickson, Ehud J Schmidt, Junghoon Lee
DOI: https://doi.org/10.1088/1361-6560/ad84b2
IF: 3.5
2024-10-09
Physics in Medicine and Biology
Abstract: Objective. MRI is the standard imaging modality for high-dose-rate brachytherapy of cervical cancer. Precise contouring of organs at risk (OARs) and the high-risk clinical target volume (HR-CTV) from MRI is a crucial step in radiotherapy planning and treatment. However, conventional manual contouring is limited in both accuracy and procedural time. To overcome these limitations, we propose a deep learning approach that automatically segments OARs (bladder, rectum, and sigmoid colon) and HR-CTV from female pelvic MRI. Approach. In the proposed pipeline, a coarse multi-organ segmentation model first segments all structures, from which a region of interest is computed for each structure. Each structure is then segmented by its own fine segmentation model, trained separately for that structure. To account for the variable size of HR-CTV, a size-adaptive multi-model approach was employed. For both coarse and fine segmentation, we designed a dual convolution-transformer UNet (DCT-UNet), which uses a dual-path encoder consisting of convolution and transformer blocks. To evaluate our model, OAR segmentations were compared to the clinical contours drawn by the attending radiation oncologist. For HR-CTV, four sets of contours (clinical + three additional sets) were obtained to produce a consensus ground truth and to analyze inter/intra-observer variability. Main results. DCT-UNet achieved Dice similarity coefficients (mean±SD) of 0.932±0.032 (bladder), 0.786±0.090 (rectum), 0.663±0.180 (sigmoid colon), and 0.741±0.076 (HR-CTV), outperforming other state-of-the-art models. Notably, the size-adaptive multi-model approach significantly improved HR-CTV segmentation compared to a single model. Furthermore, significant inter/intra-observer variability was observed, and our model showed comparable performance to all observers. Computation time for the entire pipeline was 12.59±0.79 seconds per subject, significantly shorter than the typical manual contouring time of >15 minutes. Significance. These experimental results demonstrate that our model has great utility in cervical cancer brachytherapy by enabling fast and accurate automatic segmentation, and has the potential to improve consistency in contouring.
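Illustrative note. The abstract reports Dice similarity coefficients and describes computing a region of interest from the coarse segmentation. The sketch below shows, under stated assumptions, how those two quantities are commonly computed for binary 3D masks. It is not the authors' code: the function names `dice` and `roi_from_mask`, the `margin` parameter, the empty-mask convention, and the toy arrays are all hypothetical choices made for illustration.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def roi_from_mask(mask, margin=2):
    """Bounding box of a mask as a tuple of slices, padded by `margin`
    voxels on each side and clipped to the volume (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    coords = np.argwhere(mask)
    if coords.size == 0:
        return None  # nothing segmented, so no ROI can be derived
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))


# Toy volumes standing in for a reference contour and a coarse prediction.
truth = np.zeros((8, 8, 8), dtype=bool)
truth[2:6, 2:6, 2:6] = True      # 64 voxels
coarse = np.zeros_like(truth)
coarse[3:7, 2:6, 2:6] = True     # 64 voxels, 48 of them overlap `truth`

roi = roi_from_mask(coarse, margin=1)   # crop region for a fine-stage model
score = dice(coarse, truth)             # 2 * 48 / (64 + 64) = 0.75
cropped = truth[roi]                    # sub-volume of shape (6, 6, 6)
```

In a coarse-to-fine pipeline of the kind described, a crop such as `roi` would be passed to the structure-specific model; how the paper pads or resizes its regions of interest is not stated in the abstract.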
engineering, biomedical; radiology, nuclear medicine & medical imaging