DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing

Peng Ni, Zeyu Zhong, Jinrui Xu, Neng Huang, Jun Zhang, Fan Nie, Haochen Zhao, You Zou, Yuanfeng Huang, Jinchen Li, Chuan-Le Xiao, Feng Luo, Jianxin Wang
DOI: https://doi.org/10.1101/2022.02.26.482074
2022-03-01
Abstract: It has recently been reported that DNA 5-methylcytosine (5mC) in CpG contexts can be detected using PacBio circular consensus sequencing (CCS). However, the accuracy and robustness of computational methods that use long CCS reads still need to be improved. In this study, we present ccsmeth, a deep learning method for detecting DNA 5mCpGs from PacBio CCS subreads. ccsmeth uses attention-based bidirectional Gated Recurrent Unit (GRU) networks to infer DNA methylation states. Testing ccsmeth on CCS subreads of amplified DNA and M.SssI-treated DNA, we found that it achieved higher performance than existing methods. We also compared the results of ccsmeth on long CCS reads with those of bisulfite sequencing and Nanopore sequencing. The results demonstrate that ccsmeth can accurately detect 5mCpGs from CCS data sequenced using a >10 kb insert library. Moreover, using PacBio CCS data, we propose a pipeline that can detect haplotype-aware methylation in humans.