Abstract 3499: Non-small cell lung cancer (NSCLC) histology classification using DNA methylation data captured from liquid biopsies

Laura Tung, Denis Tolkunov, Catalin Barbacioru, Deepika Kale, Scott Shell, Jill Tsai, Leylah Drusbosky, Kaushal Parikh, Han-Yu Chuang
DOI: https://doi.org/10.1158/1538-7445.am2024-3499
IF: 11.2
2024-03-22
Cancer Research
Abstract:
Background: Non-small cell lung cancer (NSCLC) makes up ~85% of lung cancers, and its major subtypes are adenocarcinoma (AD) and squamous cell carcinoma (SC). The histology classification of NSCLC has significant therapeutic and prognostic implications. As liquid biopsies are rapid and non-invasive, computational methods to classify NSCLC subtypes using cell-free DNA (cfDNA) are desired.

Methods: Using The Cancer Genome Atlas (TCGA) NSCLC methylation data, we developed a method to classify AD and SC subtypes from cfDNA samples. We mapped TCGA methylation data to Guardant Infinity’s promoter regions and transformed them to allow for the transfer of the TCGA trained model to Infinity cfDNA epigenomic data. An advantage of this approach is that TCGA training data originates from the primary tumor, while cfDNA samples have low tumor fraction (TF) and may also contain signals from metastatic tumors. We trained a penalized Logistic Regression model with LASSO. To better fit the model to Infinity cfDNA data, we performed region filtering.

Results: Our model effectively classified AD and SC subtypes from TCGA samples (AUC 0.97) (10-fold cross validation; Table 1). When evaluating on cfDNA samples with epigenomic TF ≥ 1%, the final model shows reasonable performance (AUC 0.94) (Table 1); the performance decreases when reduced to TF > 0.05% (AUC 0.89) as expected (Table 1). When evaluating cfDNA from uncategorized lung samples, the predicted AD probabilities have significant association with KRAS mutations (p-value = 0.027).

Conclusion: Our results show that the penalized Logistic Regression model trained with tissue TCGA methylation data through mapping, transformation, and region selection can classify NSCLC subtypes from cfDNA samples.
This method may offer a fast, non-invasive tool to inform subtype-based clinical decisions, especially in the setting of poorly-differentiated carcinomas, and to identify subtype switches upon resistance to targeted therapies.

Table 1: Model evaluation on tissue TCGA samples and cfDNA Infinity samples (different TF cutoffs)

| Sample type | N samples | AUC | Accuracy | Sensitivity | Specificity | Precision | F1 score |
|---|---|---|---|---|---|---|---|
| tissue TCGA | 907 | 0.97 | 0.93 | 0.94 +/- 0.02 | 0.93 +/- 0.02 | 0.94 +/- 0.02 | 0.94 +/- 0.01 |
| cfDNA Infinity | 17 (TF ≥ 1%) | 0.94 | 0.88 | 0.83 +/- 0.3 | 0.91 +/- 0.17 | 0.83 +/- 0.3 | 0.83 +/- 0.21 |
| cfDNA Infinity | 29 (TF > 0.05%) | 0.89 | 0.83 | 0.88 +/- 0.23 | 0.81 +/- 0.17 | 0.64 +/- 0.28 | 0.74 +/- 0.2 |

Citation Format: Laura Tung, Denis Tolkunov, Catalin Barbacioru, Deepika Kale, Scott Shell, Jill Tsai, Leylah Drusbosky, Kaushal Parikh, Han-Yu Chuang. Non-small cell lung cancer (NSCLC) histology classification using DNA methylation data captured from liquid biopsies [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3499.
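
Illustrative sketch (not part of the published abstract): the Methods describe a LASSO-penalized logistic regression scored by AUC, but give no implementation details. The example below is only a minimal sketch of that general technique under loudly stated assumptions. The data are synthetic stand-ins (not TCGA or Infinity data), all function names and hyperparameters are hypothetical, a single holdout split replaces the 10-fold cross-validation, and proximal gradient descent is a swapped-in solver chosen so the code needs only the Python standard library. The mapping, transformation, and region-filtering steps are not modeled.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(X, w, b):
    """Predicted probability of class 1 for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def fit_lasso_logistic(X, y, l1_strength=0.1, step=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent (ISTA).

    l1_strength, step, and n_iter are hypothetical values for this toy data.
    The intercept b is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label, prob in zip(X, y, predict_proba(X, w, b)):
            err = prob - label  # gradient of the log-loss w.r.t. the logit
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= step * grad_b / n
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj / n, step * l1_strength)
             for wj, gj in zip(w, grad_w)]
    return w, b


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(rng, n_samples, n_regions, n_informative):
    """Synthetic stand-in for standardized per-region methylation features."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)  # hypothetical coding: 1 = one subtype, 0 = the other
        shift = 1.0 if label == 1 else -1.0
        X.append([rng.gauss(shift if j < n_informative else 0.0, 1.0)
                  for j in range(n_regions)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_synthetic(rng, n_samples=300, n_regions=20, n_informative=3)
w, b = fit_lasso_logistic(X[:200], y[:200])           # train on the first 200 samples
auc = roc_auc(y[200:], predict_proba(X[200:], w, b))  # score on the held-out 100
n_selected = sum(1 for wj in w if wj != 0.0)
print(f"holdout AUC = {auc:.3f}; regions with nonzero weight = {n_selected} of {len(w)}")
```

The L1 penalty sets the weights of uninformative regions exactly to zero, which is why LASSO doubles as a feature selector when there are many candidate regions. In practice a tested library solver (for example, scikit-learn's `LogisticRegression` with `penalty="l1"`) would normally replace the hand-written loop.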
oncology