Development and validation of a clinical breast cancer tool for accurate prediction of recurrence

Asim Dhungana, Augustin Vannier, Fangyuan Zhao, Jincong Q. Freeman, Poornima Saha, Megan Sullivan, Katharine Yao, Elbio M. Flores, Olufunmilayo I. Olopade, Alexander T. Pearson, Dezheng Huo, Frederick M. Howard
DOI: https://doi.org/10.1038/s41523-024-00651-5
2024-06-16
npj Breast Cancer
Abstract: Given high costs of Oncotype DX (ODX) testing, widely used in recurrence risk assessment for early-stage breast cancer, studies have predicted ODX using quantitative clinicopathologic variables. However, such models have incorporated only small cohorts. Using a cohort of patients from the National Cancer Database (NCDB, n = 53,346), we trained machine learning models to predict low-risk (0-25) or high-risk (26-100) ODX using quantitative estrogen receptor (ER)/progesterone receptor (PR)/Ki-67 status, quantitative ER/PR status alone, and no quantitative features. Models were externally validated on a diverse cohort of 970 patients (median follow-up 55 months) for accuracy in ODX prediction and recurrence. Comparing the area under the receiver operating characteristic curve (AUROC) in a held-out set from NCDB, models incorporating quantitative ER/PR (AUROC 0.78, 95% CI 0.77–0.80) and ER/PR/Ki-67 (AUROC 0.81, 95% CI 0.80–0.83) outperformed the non-quantitative model (AUROC 0.70, 95% CI 0.68–0.72). These results were preserved in the validation cohort, where the ER/PR/Ki-67 model (AUROC 0.87, 95% CI 0.81–0.93, p = 0.009) and the ER/PR model (AUROC 0.86, 95% CI 0.80–0.92, p = 0.031) significantly outperformed the non-quantitative model (AUROC 0.80, 95% CI 0.73–0.87). Using a high-sensitivity rule-out threshold, the non-quantitative, quantitative ER/PR and ER/PR/Ki-67 models identified 35%, 30% and 43% of patients as low-risk in the validation cohort. Of these low-risk patients, fewer than 3% had a recurrence at 5 years. These models may help identify patients who can forgo genomic testing and initiate endocrine therapy alone. An online calculator is provided for further study.
oncology
What problem does this paper attempt to address?
The paper addresses the high cost of Oncotype DX (ODX) testing in early-stage breast cancer. It develops machine learning models that predict the ODX risk category from clinicopathologic features, with the aim of reducing unnecessary genomic testing. Specifically:

1. **High Cost Issue**: ODX testing is used to assess recurrence risk in early-stage breast cancer patients, but at approximately $4000 per test it is difficult to adopt widely in resource-limited settings.
2. **Limitations of Existing Models**: Previous studies have predicted ODX results from quantitative clinicopathologic variables, but those models were built on small cohorts.
3. **Development of a New Model**: The paper uses a large cohort from the National Cancer Database (NCDB, 53,346 patients) to train machine learning models that predict low-risk (0-25) or high-risk (26-100) ODX scores. Three variants are trained: one with quantitative estrogen receptor (ER)/progesterone receptor (PR)/Ki-67 status, one with quantitative ER/PR status alone, and one with no quantitative features.
4. **Validation of Model Accuracy**: In an external validation cohort (970 patients), the models using quantitative ER/PR/Ki-67 (AUROC 0.87) and quantitative ER/PR (AUROC 0.86) significantly outperformed the non-quantitative model (AUROC 0.80).

With a high-sensitivity rule-out threshold, the models identified 30% to 43% of validation patients as low-risk, and fewer than 3% of those patients had a recurrence at 5 years. The researchers suggest that such models may help identify low-risk patients who can forgo genomic testing and start endocrine therapy alone, which could improve the efficiency and cost-effectiveness of clinical decision-making.
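
The idea of a high-sensitivity rule-out threshold can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `rule_out_threshold` and `summarize`, the 0.95 sensitivity target, and the synthetic scores are all assumptions made for illustration, since the abstract does not state the target sensitivity or the model type. Given model scores and true labels (1 = high-risk ODX), the sketch picks the highest cutoff that still flags the required share of high-risk cases, then reports what fraction of patients falls below it.

```python
import math
import random


def rule_out_threshold(scores, labels, target_sensitivity=0.95):
    """Highest cutoff t such that flagging score >= t as high-risk
    keeps sensitivity at or above target_sensitivity (hypothetical helper)."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one high-risk case")
    # Number of high-risk cases that must be flagged to reach the target.
    k = math.ceil(target_sensitivity * len(positives))
    return positives[k - 1]


def summarize(scores, labels, threshold):
    """Sensitivity and fraction of patients ruled out (score < threshold)."""
    low_risk_labels = [y for s, y in zip(scores, labels) if s < threshold]
    n_pos = sum(labels)
    missed = sum(low_risk_labels)  # high-risk cases wrongly ruled out
    return {
        "sensitivity": (n_pos - missed) / n_pos,
        "fraction_ruled_out": len(low_risk_labels) / len(scores),
    }


# Synthetic data for illustration only: 150 high-risk and 850 low-risk cases.
random.seed(0)
labels = [1] * 150 + [0] * 850
scores = [random.gauss(0.6, 0.15) if y == 1 else random.gauss(0.3, 0.15) for y in labels]

threshold = rule_out_threshold(scores, labels, target_sensitivity=0.95)
result = summarize(scores, labels, threshold)
print(threshold, result)
```

In practice the cutoff would be fixed on training or held-out data and then applied unchanged to an external cohort, so that the reported sensitivity and rule-out fraction are not optimistically biased.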