Abstract 6697: Deep learning algorithm for cancer detection using multimodal characteristics of whole methylome sequencing of cf-DNA
Juntae Park,Minjung Kim,Sook Ryun Park,Ki-Byung Song,Eunsung Jun,Dongryul Oh,Jeong-Won Lee,Young Sik Park,Ki-Won Song,Jeong-Sik Byeon,Bo Hyun Kim,Chang-Seok Ki,Eunhae Cho
DOI: https://doi.org/10.1158/1538-7445.am2023-6697
IF: 11.2
2023-04-04
Cancer Research
Abstract: Background: Various cell-free DNA (cfDNA) features, including methylation and genomic profiles, have been investigated for their potential use in early cancer detection. We developed deep learning models based on data generated by enzymatic-conversion-based whole-methylome sequencing of cfDNA. Methods: Cell-free whole-genome enzymatic methyl sequencing (cfWEMseq) data were generated from 198 cancer patients (stage I: 11%, II: 17%, III: 22%, IV: 20%, unknown: 31%) and 69 healthy controls. The cancer types consisted of breast (n=31), liver (n=24), esophageal (n=38), pancreatic (n=30), colon (n=34), ovarian (n=18), and lung (n=23). Sequencing on a NovaSeq 6000 (Illumina) produced an average of 200 million reads. For model training and evaluation, data partitioning was stratified by cancer type, and 5-fold cross-validation was used. Coverage and methylation beta values were calculated in fixed-size bins of 100 kb, 1 Mb, and 5 Mb and in variable-size bins defined by topologically associating domains (TADs). A one-dimensional convolutional neural network (1D-CNN) was trained on Genome Coverage (GC), Genome Methylation Beta value (GMB), and Mutation Signature (MS) features. Model performance was reported as the average of the results measured on the test set of each of the five folds. Results: We tested the cancer detection performance of various feature combinations using all cfWEMseq data (n=267). Regardless of bin size, the GMB single model achieved higher performance than the GC single model. The best-performing model was the ensemble of GMB (100 kb bins) and MS. This ensemble model reached an accuracy of 96% (CI: 93.6% to 98.1%), an AUC of 0.99 (CI: 0.97 to 1.0), and a sensitivity of 98.0% (CI: 92.4% to 99.5%) at a specificity of 90%. Conclusions: These results suggest that integrating methylation information and genomic data from cfWEMseq offers an opportunity for higher cancer detection accuracy.
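The fixed-size binning step described in the Methods can be illustrated with a minimal Python sketch. The abstract does not state how coverage or bin-level beta values were computed, so everything here is an assumption for illustration: the input format (per-CpG methylated and unmethylated read counts), the function name `bin_methylation`, the definition of coverage as the summed read count at CpG sites in a bin, and the definition of the bin beta value as methylated reads divided by total reads.

```python
from collections import defaultdict


def bin_methylation(calls, bin_size=100_000):
    """Aggregate per-CpG calls into fixed-size genomic bins.

    calls: iterable of (chrom, pos, methylated_reads, unmethylated_reads),
           with pos a 0-based coordinate (hypothetical input format).
    Returns {(chrom, bin_index): (coverage, beta)}.
    Bins whose total read count is zero are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    for chrom, pos, meth, unmeth in calls:
        key = (chrom, pos // bin_size)  # index of the bin containing this site
        totals[key][0] += meth
        totals[key][1] += unmeth

    binned = {}
    for key, (meth, unmeth) in totals.items():
        depth = meth + unmeth
        if depth > 0:
            # Assumed definition: beta = methylated reads / all reads in the bin.
            binned[key] = (depth, meth / depth)
    return binned


# Toy input, not data from the study.
example_calls = [
    ("chr1", 10_500, 8, 2),
    ("chr1", 99_999, 1, 9),
    ("chr1", 100_000, 5, 5),
    ("chr2", 250_000, 0, 4),
]
binned = bin_methylation(example_calls)
for (chrom, idx), (coverage, beta) in sorted(binned.items()):
    print(chrom, idx, coverage, round(beta, 3))
```

Calling the same function with `bin_size=1_000_000` or `bin_size=5_000_000` would give the coarser resolutions mentioned in the abstract. TAD-based bins would need a lookup of variable-width intervals rather than integer division.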
This research was supported through the National Research Foundation (NRF) funded by the Ministry of Science and ICT (2020M2D9A3094213). Citation Format: Juntae Park, Minjung Kim, Sook Ryun Park, Ki-Byung Song, Eunsung Jun, Dongryul Oh, Jeong-Won Lee, Young Sik Park, Ki-Won Song, Jeong-Sik Byeon, Bo Hyun Kim, Chang-Seok Ki, Eunhae Cho. Deep learning algorithm for cancer detection using multimodal characteristics of whole methylome sequencing of cf-DNA. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 6697.
oncology