DISMIR: Deep learning-based noninvasive cancer detection by integrating DNA sequence and methylation information of individual cell-free DNA reads

Jiaqi Li, Lei Wei, Xianglin Zhang, Wei Zhang, Haochen Wang, Bixi Zhong, Zhen Xie, Hairong Lv, Xiaowo Wang
DOI: https://doi.org/10.1093/bib/bbab250
IF: 9.5
2021-07-09
Briefings in Bioinformatics
Abstract: Detecting cancer signals in cell-free DNA (cfDNA) high-throughput sequencing data is emerging as a novel noninvasive cancer detection method. Due to the high cost of sequencing, it is crucial to make robust and precise predictions with low-depth cfDNA sequencing data. Here we propose a novel approach named DISMIR, which can provide ultrasensitive and robust cancer detection by integrating DNA sequence and methylation information in plasma cfDNA whole-genome bisulfite sequencing (WGBS) data. DISMIR introduces a new feature termed ‘switching region’ to define cancer-specific differentially methylated regions, which can enrich the cancer-related signal at read resolution. DISMIR applies a deep learning model to predict the source of every single read based on its DNA sequence and methylation state, and then predicts the risk that the plasma donor is suffering from cancer. DISMIR exhibited high accuracy and robustness in hepatocellular carcinoma detection from plasma cfDNA WGBS data, even at ultralow sequencing depths. Further analysis showed that DISMIR tends to be insensitive to alterations in the methylation states of single CpG sites, which suggests that DISMIR could resist the technical noise of WGBS. Together, these results show that DISMIR has the potential to be a precise and robust method for low-cost early cancer detection.
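The two-stage idea in the abstract (score each read from its sequence and methylation state, then summarize the scores per donor) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `encode_read` and `donor_risk`, the five-channel encoding, and the threshold-based aggregation are hypothetical and are not taken from the paper, whose actual model and risk estimator are not described in this abstract.

```python
# Hypothetical sketch, not the authors' implementation.
BASES = "ACGT"


def encode_read(seq, methylated_positions):
    """Encode one read as a list of per-base feature rows.

    Each row has 4 one-hot sequence channels (A, C, G, T) followed by
    1 methylation flag. `methylated_positions` is an assumed input: the
    0-based indices of bases called methylated in this read.
    Bases outside A/C/G/T (for example N) get an all-zero one-hot part.
    """
    meth = set(methylated_positions)
    rows = []
    for i, base in enumerate(seq.upper()):
        one_hot = [1.0 if base == b else 0.0 for b in BASES]
        rows.append(one_hot + [1.0 if i in meth else 0.0])
    return rows


def donor_risk(read_scores, threshold=0.5):
    """Toy donor-level summary: fraction of reads scored above `threshold`.

    `read_scores` stands in for the per-read outputs of a trained
    classifier. This simple fraction is a placeholder for whatever
    estimator a real pipeline would use.
    """
    if not read_scores:
        return 0.0
    return sum(1 for s in read_scores if s > threshold) / len(read_scores)


if __name__ == "__main__":
    features = encode_read("ACGT", [1])   # the C at index 1 is methylated
    print(features[1])
    print(donor_risk([0.9, 0.1, 0.8, 0.2]))
```

Keeping the methylation flag as an extra channel alongside the one-hot sequence lets a single per-read model see both signals jointly, which is the integration the abstract emphasizes.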
biochemical research methods, mathematical & computational biology
What problem does this paper attempt to address?