Abstract PO3-07-05: Multi-site validation of a deep learning solution for ER/PR profiling of breast cancer from H&E-stained pathology slides
Salim Arslan,Adrian Bazaga,Gareth Bryson,Oscar Carlos,Andre Geraldes,David Harrison,Alastair Ironside,Jakob Kather,Ali Khurram,David Leff,Debapriya Mehrotra,Foivos Ntelemis,John Nyonyintono,Julian Schmidt,Shikha Singhal,In Hwa Um,Steffen Wolf,Pahini Pandya
DOI: https://doi.org/10.1158/1538-7445.sabcs23-po3-07-05
IF: 11.2
2024-05-02
Cancer Research
Abstract: Background: Molecular profiling of estrogen and progesterone receptors (ER/PR/Her2) is performed for all malignant breast cancers to inform the choice of targeted therapy. Though existing scoring systems are widely used and well-validated, they can involve costly preparation and variable interpretation. Additionally, discordances between histology and expected biomarker findings can prompt repeat testing to address biological, interpretative, or technical reasons for unexpected results. We evaluate PANProfiler Breast (PPB), a UKCA/CE-IVDD marked deep learning (DL)-based image analysis software, at multiple sites to determine whether the majority of ER/PR assays can be replaced, relying only on routinely used H&E-stained whole slide images (WSIs). Methods: PPB was trained and validated on 5126/4619 WSIs from five sites in the UK to identify the ER/PR status defined by IHC assays graded in alignment with ASCO/RCPATH guidelines. Performance is evaluated separately for each site with 3-fold cross-validation, mimicking real-world distribution. Results: For ER, with a class ratio (CR) of approximately 4:1, we measure a sensitivity, specificity, and accuracy of 95.5% (±1.5%), 47.5% (±15.1%), and 88.3% (±0.6%) averaged over all sites, reaching up to 97.4%, 67.2%, and 89.3%, respectively. For PR (CR approx. 3:1), the averaged sensitivity, specificity, and accuracy are 92.2% (±8.4%), 53.1% (±20.8%), and 86.6% (±2.4%), reaching up to 99.1%, 81.7%, and 88.8%, respectively. The software's performance is comparable to that of current standard-of-care (SoC) antibodies in common ER/PR CDx assays from Dako, Leica, and Roche, which have sensitivities of 98.5% (±1.3%) and specificities of 38.6% (±6.3%) for ER, and sensitivities of 96.9% (±0.6%) and specificities of 23.4% (±1.4%) for PR.
Performance was robust to specimen and scanner types, with accuracies of 87.6% (ER, only biopsies), 88.2% (ER, only resections), 88.5% (ER, mixed types), 83.9% (PR, only resections), and 87.3% (PR, mixed types). Accuracy across scanners varied by a standard deviation of 0.3%/1.0% for ER/PR, respectively. Conclusions: We demonstrate the robustness of a DL-based ER/PR profiling method in breast cancer using only H&E-stained WSIs. This multi-site validation study is the first of its kind for such an approach using real-world clinical data. Our solution could facilitate fast, accurate, and systemic screening of patients for targeted treatments if integrated into routine pathological workflows. Citation Format: Salim Arslan, Adrian Bazaga, Gareth Bryson, Oscar Carlos, Andre Geraldes, David Harrison, Alastair Ironside, Jakob Kather, Ali Khurram, David Leff, Debapriya Mehrotra, Foivos Ntelemis, John Nyonyintono, Julian Schmidt, Shikha Singhal, In Hwa Um, Steffen Wolf, Pahini Pandya. Multi-site validation of a deep learning solution for ER/PR profiling of breast cancer from H&E-stained pathology slides [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO3-07-05.
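Illustrative note: the sensitivity, specificity, and accuracy figures in the Results are standard confusion-matrix metrics, reported as a mean with a spread across sites. The Python sketch below shows how such per-site metrics and their summary could be computed. The site names and counts are hypothetical placeholders, not data from the study, and reading the "±" values as a sample standard deviation across sites is an assumption rather than something the abstract states.

```python
from statistics import mean, stdev


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # receptor-positive cases called positive
    specificity = tn / (tn + fp)               # receptor-negative cases called negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical per-site counts as (tp, fn, tn, fp), with a 4:1
# positive-to-negative class ratio. These are NOT values from the study.
site_counts = {
    "site_a": (380, 20, 50, 50),
    "site_b": (760, 40, 90, 110),
    "site_c": (390, 10, 40, 60),
}

per_site = {site: binary_metrics(*counts) for site, counts in site_counts.items()}

# Summarize each metric across sites as mean and sample standard deviation
# (assumed here to correspond to the "±" spread quoted in the abstract).
summary = {}
for index, name in enumerate(("sensitivity", "specificity", "accuracy")):
    values = [metrics[index] for metrics in per_site.values()]
    summary[name] = (mean(values), stdev(values))

for name, (avg, spread) in summary.items():
    print(f"{name}: {avg:.1%} (±{spread:.1%})")
```

With a positive-heavy class ratio, accuracy is dominated by sensitivity, which is why a specificity near 50% can coexist with an accuracy in the high 80s.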
oncology