Abstract LB243: Deep learning-based molecular characterization of lung cancers from never smokers using hematoxylin and eosin-stained whole slide images
Monjoy Saha,Tongwu Zhang,Praphulla Bhawsar,Wei Zhao,Jianxin Shi,Soo Ryum Yang,Jonas Almeida,Maria Teresa Landi
DOI: https://doi.org/10.1158/1538-7445.am2024-lb243
IF: 11.2
2024-04-07
Cancer Research
Abstract: Purpose: This study uses deep learning to analyze hematoxylin and eosin-stained (H&E) whole slide images (WSIs) and predict which tumors carry driver gene alterations, mutational signatures, and additional molecular features among lung cancers from never-smokers (LCINS), the seventh leading cause of cancer death worldwide. Method: A total of 464 H&E-stained WSIs of lung adenocarcinomas from the Sherlock-Lung study were used, with 325 WSIs for model training and 139 for testing. Alongside the WSIs, genomic data were included for all tumors: mutations (overall nonsynonymous mutations and driver mutations only), fusions, or copy numbers in driver genes (TP53, KRAS, EGFR, CDKN2A, MDM2, ALK, RBM10); mutational signatures (APOBEC); molecular features (kataegis, whole genome doubling (WGD) status, tumor mutational burden (TMB)); and specific hotspot driver mutations in EGFR (p.L858R and p.E746_A750del) and KRAS (p.G12C, p.G12V, and p.G12D). Each WSI was assigned labels based on the presence or absence of these genomic alterations or features. We analyzed the data with a customized multilabel binary deep learning model based on ResNet50, a residual convolutional neural network. The network was trained from scratch, with the number of epochs initially set to 100. Results: Our methodology demonstrated high predictive performance, measured by the area under the receiver operating characteristic curve (AUROC). In the presence or absence category, we achieved high AUROC for detecting any alteration in these driver genes: TP53 (0.81), KRAS (0.92), EGFR (0.93), CDKN2A deletion (0.88), MDM2 amplification (0.94), and ALK fusion (0.86). We also evaluated molecular features, obtaining AUROC scores of 0.79 for WGD status and 0.86 for kataegis. Moderate AUROC scores were observed for tumors with RBM10 mutations (0.67), high TMB (0.64), and APOBEC signatures (0.56).
When focusing on driver mutations only, we achieved high AUROC scores for EGFR (0.88), KRAS (0.84), TP53 (0.93), and RBM10 (0.78). In addition, our model predicted tumors with specific mutations in EGFR (AUROC: p.L858R = 0.79; p.E746_A750del = 0.77) and KRAS (AUROC: p.G12C = 0.77). Performance was suboptimal for KRAS p.G12V (0.41) and KRAS p.G12D (0.47). Conclusions: Our deep learning network achieves high prediction scores in identifying tumors with critical driver gene alterations and actionable mutations, holding promise for potential clinical use. In the future, the model could be optimized as a screening assay to guide molecular testing and therapeutic management of patients with LCINS. Citation Format: Monjoy Saha, Tongwu Zhang, Praphulla Bhawsar, Wei Zhao, Jianxin Shi, Soo Ryum Yang, Jonas Almeida, Maria Teresa Landi. Deep learning-based molecular characterization of lung cancers from never smokers using hematoxylin and eosin-stained whole slide images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 2 (Late-Breaking, Clinical Trial, and Invited Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(7_Suppl):Abstract nr LB243.
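The abstract reports one AUROC per binary label (for example EGFR, KRAS p.G12V). As a reading aid, the sketch below shows one common way to compute a per-label AUROC from multilabel predictions, using the pairwise-ranking (Mann-Whitney) formulation. It is not the authors' evaluation code: the function names `auroc` and `per_label_auroc`, the label keys, and the toy scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def per_label_auroc(
    y_true: Dict[str, List[int]], y_score: Dict[str, List[float]]
) -> Dict[str, float]:
    """Compute one AUROC per label, treating each label as its own binary task."""
    return {name: auroc(y_true[name], y_score[name]) for name in y_true}


# Hypothetical toy data: four slides, two labels (1 = alteration present).
toy_true = {"EGFR": [1, 0, 1, 0], "KRAS_G12V": [1, 0, 1, 0]}
toy_score = {"EGFR": [0.9, 0.2, 0.7, 0.4], "KRAS_G12V": [0.3, 0.6, 0.5, 0.4]}
toy_auroc = per_label_auroc(toy_true, toy_score)
```

In this toy example every EGFR-positive slide outscores every EGFR-negative slide, while only one of the four KRAS_G12V pairs is ranked correctly. An AUROC of 0.5 corresponds to random ranking, so values below 0.5 mean positives were, on average, scored lower than negatives.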
oncology