Complete deconvolution of DNA methylation signals from complex tissues: a geometric approach

Weiwei Zhang, Hao Wu, Ziyi Li
DOI: https://doi.org/10.1093/bioinformatics/btaa930
IF: 5.8
2020-12-11
Bioinformatics
Abstract: Motivation: It is common practice in epigenetics research to profile DNA methylation on tissue samples, which are usually mixtures of different cell types. To properly account for the mixture, estimating cell compositions has been recognized as an important first step. Many methods have been developed for quantifying cell compositions from DNA methylation data, but most have limited applications due to a lack of reference or prior information. Results: We develop Tsisal, a novel complete deconvolution method that accurately estimates cell compositions from DNA methylation data without any prior knowledge of cell types or their proportions. Tsisal is a full pipeline that estimates the number of cell types and the cell compositions, and identifies cell-type-specific CpG sites. It can also assign cell type labels when a full or partial reference panel is available. Extensive simulation studies and analyses of seven real datasets demonstrate the favorable performance of the proposed method compared with existing deconvolution methods serving a similar purpose. Availability and implementation: The proposed method Tsisal is implemented as part of the R/Bioconductor package TOAST at https://bioconductor.org/packages/TOAST. Supplementary information: Supplementary data are available at Bioinformatics online.
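The mixture setting described in the abstract can be illustrated with a small simulation. The sketch below is not Tsisal and does not use the TOAST package; it only assumes the generic linear mixing model commonly used in deconvolution, in which each tissue sample is a proportion-weighted average of cell-type-specific methylation profiles. All names and sizes (`profiles`, `proportions`, 50 samples, 3 cell types, 200 CpG sites) are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these numbers come from the paper.
n_samples, n_cell_types, n_cpgs = 50, 3, 200

# Hypothetical cell-type-specific methylation profiles (beta values in [0, 1]).
profiles = rng.uniform(0.0, 1.0, size=(n_cell_types, n_cpgs))

# Hypothetical per-sample cell proportions: non-negative, each row sums to 1.
proportions = rng.dirichlet(np.ones(n_cell_types), size=n_samples)

# Assumed linear mixing: each sample is a weighted average of the profiles.
mixed = proportions @ profiles

# Geometric consequence of mixing: at every CpG site, a mixed value lies
# between the smallest and largest pure cell-type value at that site.
lower = profiles.min(axis=0)
upper = profiles.max(axis=0)
inside = bool(np.all((mixed >= lower - 1e-12) & (mixed <= upper + 1e-12)))

print(mixed.shape, inside)
```

Because every sample is a convex combination of the pure profiles, the samples fall inside a simplex whose corners are the cell-type profiles. A complete (reference-free) deconvolution method has to recover both `proportions` and `profiles` from `mixed` alone, which is the problem the abstract describes.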
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology