Abstract

Background: Knowledge of a patient's tumor type is essential for guiding clinical treatment decisions in cancer, but histology-based diagnosis remains challenging for a subset of cancers. Genomic alterations are highly indicative of tumor type and can be used to build classifiers that predict diagnoses, but most genomic-based classification methods use whole genome sequencing (WGS) data, which is not yet feasible for widespread clinical implementation. Clinical sequencing is typically performed using cancer gene panels that target individual mutations, often drivers, but previous tumor type classifiers developed using driver-based features alone perform poorly. We hypothesize that a classifier developed using state-of-the-art deep-learning methods and a sufficiently large training cohort would be able to overcome previous accuracy limitations and support the development of a clinically relevant tumor type prediction model.

Methods: We present Deep Genome-Derived Diagnosis (GDD-ENS), an ensemble-based deep-learning tumor type classification method trained using data from cancer gene panel sequencing. We specifically use data from MSK-IMPACT, an FDA-authorized clinical sequencing assay that reports genomic alterations including mutations, indels, copy number alterations, and gene fusions across 505 cancer-associated genes. We aggregated a discovery cohort of 35,372 patients with solid tumors profiled with MSK-IMPACT across 38 common cancer types and used this set to generate 4,487 somatic mutation features for development.

Results: GDD-ENS achieves 78.8% accuracy on a held-out validation cohort of 6,971 patients. For the 71.9% of predictions assigned a high confidence by the model, accuracy increases to 92.7%, rivaling WGS-based models. We use Shapley values to report prediction-specific feature importance, and aggregate them across cancer types to show that GDD-ENS identifies known associations between cancer types and genomic alterations.
GDD-ENS also identifies, with high accuracy, patients whose cancer types are not among the 38 common types, using metrics derived from ensemble statistics. For patients where non-genomic information could further guide predictions, we implement a customizable prediction-specific adaptive prior distribution and report improved accuracy after adjusting predictions to account for features such as metastatic sample biopsy site. Finally, we apply GDD-ENS to a set of 1,123 patients with Cancers of Unknown Primary (CUP) and return high-confidence predictions for 49% of these patients, in some cases matching predictions on CUP samples with diagnoses that were later confirmed through additional sampling and disease progression.

Conclusions: Integrating GDD-ENS into prospective clinical sequencing workflows will enable clinically relevant tumor type predictions that can guide treatment decisions in real time.

Citation Format: Madison Darmofal, Shalabh Suman, Gurnit Atwal, Jie-Fu Chen, Anna Varghese, Jason C. Chang, Anoop Balakrishnan Rema, Aijazuddin Syed, Quaid Morris, Michael Berger. Deep-learning model for tumor type classification enables enhanced clinical decision support in cancer diagnosis. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5440.
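The abstract does not specify how ensemble outputs are combined, how the high-confidence cutoff is set, or how the adaptive prior is applied. As a rough illustration only, the sketch below assumes that ensemble members each emit per-class probabilities that are averaged, that a prediction counts as high confidence when its top probability exceeds a fixed threshold, and that a prior (for example, one reflecting biopsy site) is applied by simple Bayes-style reweighting. The function names, the 0.75 threshold, the three labels, and all numbers are hypothetical and are not taken from GDD-ENS.

```python
from statistics import mean


def ensemble_predict(member_probs):
    """Average per-class probabilities across ensemble members (assumed scheme)."""
    n_classes = len(member_probs[0])
    return [mean(m[c] for m in member_probs) for c in range(n_classes)]


def apply_prior(probs, prior):
    """Reweight class probabilities by a prior and renormalize to sum to 1."""
    weighted = [p * q for p, q in zip(probs, prior)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("prior leaves no probability mass on any class")
    return [w / total for w in weighted]


def top_call(probs, labels, threshold=0.75):
    """Return (label, confidence, is_high_confidence); threshold is hypothetical."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best], probs[best] >= threshold


# Hypothetical three-class example with three ensemble members.
labels = ["Lung", "Breast", "Colorectal"]
members = [[0.5, 0.3, 0.2], [0.6, 0.2, 0.2], [0.4, 0.4, 0.2]]
averaged = ensemble_predict(members)

# Hypothetical prior, e.g. derived from where the metastatic sample was biopsied.
site_prior = [0.6, 0.1, 0.3]
adjusted = apply_prior(averaged, site_prior)

print(top_call(averaged, labels))
print(top_call(adjusted, labels))
```

In this toy case the averaged prediction alone falls below the hypothetical threshold, while the prior-adjusted prediction exceeds it, which is the kind of effect the abstract describes when non-genomic information is folded in.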

Abstract 4884: Evaluating the utility of in silico variant annotation tools for cancer driver detection

Abstract 1252: AI-derived predictions improve identification of real-world cancer driver mutations

Evaluation of for variant classification in missense variants of solid cancer with actionable genetic targets

AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes

Abstract 3971: DrGaP: A Powerful Tool for Identifying Driver Genes and Pathways in Cancer Sequencing Studies

Abstract 151: an Improved Computational Pipeline for Tumor Somatic Alterations Detection

Assessing concordance among human, in silico predictions and functional assays on genetic variant classification.

Abstract 861: Improvements in variant calling sensitivity and specificity in single-cell DNA sequencing using deep learning

Abstract 2267: Assessment of AlphaMissense and structure-function predictions demonstrates efficient reclassification of genetic variants of unknown pathogenicity in inherited myeloid neoplasms

Abstract 755: Supporting Precision Cancer Treatment Decision with Functional Evaluation of Cancer Gene Mutations and Variants

Abstract 3012: Characterizing the functional significance of "variant of uncertain significance" of the tumour suppressor CDH1

Abstract 3536: Paracell: A high throughput, deep learning-based pipeline for single-cell phenotypic profiling

Abstract 5440: Deep-learning model for tumor type classification enables enhanced clinical decision support in cancer diagnosis

Abstract 2928: Solving the cancer mutation conundrum: A single cell, massively parallel approach for cancer mutation discovery, genome modelling and functional characterization

Abstract 909: Enhancing genomic analysis in cancer diagnostics: A machine learning approach for removing artifacts in FFPE specimens

Abstract 2315: AI-enabled precision oncology era: Advanced and interactive interpretation of next-generation sequencing (NGS) reports

Abstract 889: Machine learning-enhanced targeted versus whole-exome sequencing as a guide to cancer care

Abstract 7393: Tumor model to tumor treatment: Applying deep learning approaches to map multimodal data from cancer model systems to patients

Abstract 6085: Leveraging targeted epigenetic and genetic detection for cost-effective cancer classification

Comparing the performance of selected variant callers using synthetic data and genome segmentation

Abstract 3522: Application of large language models to nucleotide sequences for profiling signaling pathway disruptions in ovarian cancer patients