Deep Learning-Based Segmentation of Locally Advanced Breast Cancer on MRI in Relation to Residual Cancer Burden: A Multi-Institutional Cohort Study

Markus H A Janse, Liselore M Janssen, Bas H M van der Velden, Maaike R Moman, Elian J M Wolters-van der Ben, Marc C J M Kock, Max A Viergever, Paul J van Diest, Kenneth G A Gilhuijs
DOI: https://doi.org/10.1002/jmri.28679
Abstract:
Background: While several methods have been proposed for automated assessment of breast-cancer response to neoadjuvant chemotherapy on breast MRI, limited information is available about their performance across multiple institutions.
Purpose: To assess the value and robustness of deep learning-derived volumes of locally advanced breast cancer (LABC) on MRI to infer the presence of residual disease after neoadjuvant chemotherapy.
Study type: Retrospective.
Subjects: Training cohort: 102 consecutive female patients with LABC scheduled for neoadjuvant chemotherapy (NAC) from a single institution (age: 25-73 years). Independent testing cohort: 55 consecutive female patients with LABC from four institutions (age: 25-72 years).
Field strength/sequence: Training cohort: single vendor 1.5 T or 3.0 T. Testing cohort: multivendor 3.0 T. Gradient echo dynamic contrast-enhanced (DCE) sequences.
Assessment: A convolutional neural network (nnU-Net) was trained to segment LABC. Based on the resulting tumor volumes, an extremely randomized tree model was trained to assess residual cancer burden (RCB)-0/I vs. RCB-II/III. An independent model was developed using functional tumor volume (FTV). Both models were tested on the independent testing cohort, and their response-assessment performance and robustness across institutions were evaluated.
Statistical tests: The receiver operating characteristic (ROC) was used to calculate the area under the ROC curve (AUC). DeLong's method was used to compare AUCs. Correlations were calculated using Pearson's method. P values <0.05 were considered significant.
Results: Automated segmentation resulted in a median (interquartile range [IQR]) Dice score of 0.87 (0.62-0.93), with similar volumetric measurements (R = 0.95, P < 0.05). Automated volumetric measurements were significantly correlated with FTV (R = 0.80). Tumor volume derived from deep learning of DCE-MRI was associated with RCB, yielding an AUC of 0.76 to discriminate between RCB-0/I and RCB-II/III, performing similarly to the FTV-based model (AUC = 0.77, P = 0.66). Performance was comparable across institutions (IQR AUC: 0.71-0.84).
Data conclusion: Deep learning-based segmentation estimates changes in tumor load on DCE-MRI that are associated with RCB after NAC and is robust against variations between institutions.
Evidence level: 2.
Technical efficacy: Stage 4.
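
The two headline metrics reported above, the Dice score for segmentation overlap and the AUC for discriminating RCB-0/I from RCB-II/III, can be illustrated with a minimal sketch. This is not the authors' code: the function names `dice_score` and `roc_auc` and the toy inputs are hypothetical, masks are assumed to be flattened binary sequences, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any specific library routine.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks (1 = tumor voxel)."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data only; these values are illustrative and unrelated to the study.
toy_dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
toy_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this sketch a Dice score of 1.0 means the automated and reference masks coincide exactly, and an AUC of 0.5 corresponds to chance-level discrimination, which gives a scale for reading the reported values of 0.87 and 0.76.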