Estimation of the methylation pattern distribution from deep sequencing data

Peijie Lin,Sylvain Forêt,Susan R Wilson,Conrad J Burden
DOI: https://doi.org/10.1186/s12859-015-0600-6
IF: 3.307
2015-05-06
BMC Bioinformatics
Abstract:BackgroundBisulphite sequencing enables the detection of cytosine methylation. The sequence of the methylation states of cytosines on any given read forms a methylation pattern that carries substantially more information than merely studying the average methylation level at individual positions. In order to understand better the complexity of DNA methylation landscapes in biological samples, it is important to study the diversity of these methylation patterns. However, the accurate quantification of methylation patterns is subject to sequencing errors and spurious signals due to incomplete bisulphite conversion of cytosines.ResultsA statistical model is developed which accounts for the distribution of DNA methylation patterns at any given locus. The model incorporates the effects of sequencing errors and spurious reads, and enables estimation of the true underlying distribution of methylation patterns.ConclusionsCalculation of the estimated distribution over methylation patterns is implemented in the R Bioconductor package MPFE. Source code and documentation of the package are also available for download at http://bioconductor.org/packages/3.0/bioc/html/MPFE.html.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?