Predicting CpG Methylation Levels by Integrating Infinium HumanMethylation450 BeadChip Array Data.

Shicai Fan, Kang Huang, Rizi Ai, Mengchi Wang, Wei Wang
DOI: https://doi.org/10.1016/j.ygeno.2016.02.005
IF: 4.31
2016-01-01
Genomics
Abstract: The Infinium HumanMethylation450 BeadChip array, referred to as the 450K array hereinafter, has been widely adopted as an affordable technique for measuring DNA methylation. Tens of thousands of datasets have been generated on diverse cell types and patient tissues, providing great insight into the crucial roles of epigenetic modifications in many biological processes and diseases. The limitation of this technique is its coverage: it measures the methylation levels of about 450,000 CpGs, which account for about 1.6% of all CpGs in the human genome. In the present study, we developed and compared computational models that significantly expand the coverage of the Illumina 450K array (~11-fold). Using whole-genome bisulfite sequencing and Illumina 450K data from the human H1 embryonic stem cell line, we showed that the predicted and measured methylation levels were well correlated. Our proposed model showed superior prediction accuracy compared with existing methods on the same dataset. When applied to predict the DNA methylome in other cells, our proposed model achieved comparable performance in cross-validation, which indicates the generalizability of the method. Our method would thus be invaluable for maximizing the use of existing data.
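The coverage figures quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not part of the paper's method: it assumes that the "~11-fold" expansion multiplies the number of CpGs with a methylation estimate, and it derives the genome-wide CpG total from the abstract's two approximate figures rather than from an independent count. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract (all approximate).
ARRAY_CPGS = 450_000      # CpGs measured by the 450K array
ARRAY_FRACTION = 0.016    # share of all human CpGs covered (~1.6%)
FOLD_EXPANSION = 11       # coverage gain reported for the prediction models


def coverage_estimates(array_cpgs, array_fraction, fold):
    """Return (implied genome-wide CpG total, CpGs after expansion, expanded fraction).

    Assumption: "fold" scales the count of CpGs with a methylation estimate.
    """
    total_cpgs = array_cpgs / array_fraction
    expanded_cpgs = array_cpgs * fold
    expanded_fraction = expanded_cpgs / total_cpgs
    return total_cpgs, expanded_cpgs, expanded_fraction


total, expanded, fraction = coverage_estimates(
    ARRAY_CPGS, ARRAY_FRACTION, FOLD_EXPANSION
)
print(f"Implied CpGs in genome: {total:,.0f}")
print(f"CpGs after expansion:   {expanded:,.0f}")
print(f"Expanded coverage:      {fraction:.1%}")
```

Under these assumptions, the abstract's numbers imply roughly 28 million CpGs genome-wide, about 5 million CpGs with a measured or predicted level after expansion, and coverage of roughly 17.6% of all CpGs.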