Deep-learning Based Automatic Delineation Improves CTV Contouring Quality and Efficiency for Pathological N2 (pN2) Non-small Cell Lung Cancer (NSCLC) Receiving Post-operation Radiation Therapy

J. Wang, T. Zhang, X. Chen, W. Xia, J. Miao, Z. Zhou, J. Dai, N. Bi
DOI: https://doi.org/10.1016/j.ijrobp.2019.06.473
2019-01-01
Abstract: Accurate delineation of clinical target volume (CTV) is one of the most crucial aspects of treatment planning in radiation therapy. However, the quality of this process depends on the expertise level of the individual observer. This study aimed to investigate whether deep-learning based automatic delineation could provide greater accuracy, inter-observer consistency and efficiency of CTV contouring compared with manual delineation for pN2 NSCLC receiving post-operation radiation therapy (PORT). A very deep dilated residual network was used to achieve effective automatic delineation of the CTV. The dilated module, with different atrous rates, is able to extract multi-scale features from CT, leading to greater robustness of the model. Eleven junior radiation oncologists (work experience ≤ 5 years) delineated the CTV on 19 patients using two contouring methods: (1) a manual contour from scratch (MC) and (2) a user adjustment of the deep-learning based auto-delineation contour (UADC). Three senior radiation oncologists (work experience > 10 years) also contoured the CTV on these patients, and majority voting was used to generate the ground truth (GT). The accuracy of the junior oncologists' delineations was evaluated in terms of mean distance to agreement (MDA) and Dice similarity coefficient (DSC), with the GT as reference. MDA is a distance parameter, while DSC is a volume overlap index; a smaller MDA and a higher DSC therefore indicate greater contour accuracy. The coefficient of variation (COV) was used to measure inter-observer variability. COV is defined as the standard deviation (SD) of the CTV volume divided by the mean CTV volume across all observers; a larger COV implies greater variability. The time consumed by each CTV contouring task was recorded. A total of 418 unique CTV sets were generated. The UADC presented a significantly smaller MDA (mm) per individual CTV set compared with MC (mean ± SD: 2.79 ± 0.91 vs. 3.07 ± 0.98, P < 0.001).
Similarly, the DSC of UADC was significantly greater than that of MC (mean ± SD: 0.75 ± 0.06 vs. 0.72 ± 0.07, P < 0.001). For inter-observer variability, UADC yielded a markedly lower COV than MC (mean ± SD: 0.129 ± 0.040 vs. 0.183 ± 0.043, P < 0.001). The median contouring time for MC and UADC was 14.99 min (P25, P75: 9.80, 20.87) and 10.42 min (P25, P75: 7.20, 17.03), respectively, resulting in a 31% reduction (absolute: 4.57 min) in time consumption (P < 0.001). Compared with manual contouring, user adjustment of deep-learning based auto-delineation of the PORT-CTV for pN2 NSCLC is a promising strategy that offers superior accuracy and consistency within a shorter time, leading to higher quality and efficiency of CTV contouring.
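To make the evaluation metrics concrete, the following is a minimal sketch of how a majority-vote ground truth, the DSC, and the COV could be computed. It is an illustration under stated assumptions, not the authors' implementation: contours are represented as flattened binary voxel masks, the function names (`majority_vote`, `dice`, `cov`) and the toy data are hypothetical, and the sample standard deviation is assumed for the COV because the abstract does not say which SD estimator was used. MDA is omitted because it requires surface distances that these toy masks do not carry.

```python
from statistics import mean, stdev


def majority_vote(masks):
    """Voxel-wise majority vote over an odd number of binary masks."""
    needed = len(masks) // 2 + 1  # e.g. 2 of 3 senior observers
    return [1 if sum(voxel) >= needed else 0 for voxel in zip(*masks)]


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2.0 * overlap / total if total else 1.0


def cov(volumes):
    """Coefficient of variation: SD / mean (sample SD is an assumption)."""
    return stdev(volumes) / mean(volumes)


# Toy data (hypothetical): three senior contours over six voxels.
seniors = [
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
]
ground_truth = majority_vote(seniors)

# One junior contour scored against the voted ground truth.
junior = [0, 1, 1, 1, 1, 0]
junior_dsc = dice(junior, ground_truth)

# CTV volumes (arbitrary units) drawn by three observers on one patient.
patient_cov = cov([90.0, 100.0, 110.0])
```

In this toy case the voted ground truth keeps the four voxels marked by at least two of the three senior observers, and the junior contour overlaps three of them. In the study design described above, the COV would be computed per patient across the eleven junior observers, separately for MC and UADC.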