Evaluation of Deep Learning‐based Auto‐segmentation Algorithms for Delineating Clinical Target Volume and Organs at Risk Involving Data for 125 Cervical Cancer Patients

Zhi Wang, Yankui Chang, Zhao Peng, Yin Lv, Weijiong Shi, Fan Wang, Xi Pei, X. George Xu
DOI: https://doi.org/10.1002/acm2.13097
2020-01-01
Journal of Applied Clinical Medical Physics
Abstract:
Objective: To compare the accuracy of a deep learning-based auto-segmentation model with that of manual contouring by one medical resident, where both tried to mimic the delineation "habits" of the same senior clinical physician.
Methods: This study included 125 cervical cancer patients whose clinical target volumes (CTVs) and organs at risk (OARs) were delineated by the same senior physician. Of these 125 cases, 100 were used for model training and the remaining 25 for model testing. In addition, a medical resident who had been instructed by the senior physician for approximately 8 months delineated the CTVs and OARs for the testing cases. The Dice similarity coefficient (DSC) and the Hausdorff distance (HD) were used to evaluate the delineation accuracy for the CTV, bladder, rectum, small intestine, femoral-head-left, and femoral-head-right.
Results: The DSC values of the auto-segmentation model and of manual contouring by the resident were, respectively, 0.86 and 0.83 for the CTV (P < 0.05), 0.91 and 0.91 for the bladder (P > 0.05), 0.88 and 0.84 for the femoral-head-right (P < 0.05), 0.88 and 0.84 for the femoral-head-left (P < 0.05), 0.86 and 0.81 for the small intestine (P < 0.05), and 0.81 and 0.84 for the rectum (P > 0.05). The HD values (mm) were, respectively, 14.84 and 18.37 for the CTV (P < 0.05), 7.82 and 7.63 for the bladder (P > 0.05), 6.18 and 6.75 for the femoral-head-right (P > 0.05), 6.17 and 6.31 for the femoral-head-left (P > 0.05), 22.21 and 26.70 for the small intestine (P > 0.05), and 7.04 and 6.13 for the rectum (P > 0.05). The auto-segmentation model took approximately 2 min to delineate the CTV and OARs, while the resident took approximately 90 min to complete the same task.
Conclusion: In this study, the auto-segmentation model was as accurate as the medical resident but far more efficient. Furthermore, the auto-segmentation approach offers the additional perceivable advantages of being consistent and continually improvable when compared with manual approaches.
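The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary masks stored as NumPy arrays, the standard definition DSC = 2|A∩B| / (|A| + |B|), and a symmetric Hausdorff distance computed over all foreground voxels. The function names, the `spacing` parameter, and the toy masks are hypothetical, and the abstract does not state how the authors implemented either metric (for example, whether HD was measured on contour surfaces or as a percentile variant).

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def hausdorff_distance(mask_a, mask_b, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance over all foreground voxels.

    `spacing` is the physical voxel size per axis (e.g. in mm), so the
    result is in the same physical unit. Brute force: fine for small
    masks, too memory-hungry for full CT volumes.
    """
    scale = np.asarray(spacing, dtype=float)
    pts_a = np.argwhere(np.asarray(mask_a, dtype=bool)) * scale
    pts_b = np.argwhere(np.asarray(mask_b, dtype=bool)) * scale
    if len(pts_a) == 0 or len(pts_b) == 0:
        raise ValueError("Hausdorff distance is undefined for an empty mask")
    # Pairwise Euclidean distances between every point in A and every point in B.
    dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return max(dists.min(axis=1).max(), dists.min(axis=0).max())


# Toy example (not data from the study): a 4x4 square and the same
# square shifted down by one row.
reference = np.zeros((10, 10), dtype=bool)
reference[2:6, 2:6] = True
predicted = np.zeros((10, 10), dtype=bool)
predicted[3:7, 2:6] = True

dsc = dice_coefficient(reference, predicted)
hd = hausdorff_distance(reference, predicted)
print(f"DSC = {dsc:.2f}, HD = {hd:.2f}")
```

A higher DSC indicates greater volumetric overlap, while a lower HD indicates a smaller worst-case disagreement between the two contours, which is why the abstract reports both.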