Systematic evaluation of cell-type deconvolution pipelines for sequencing-based bulk DNA methylomes

Yunhee Jeong, Lisa Barros de Andrade e Sousa, Dominik Thalmeier, Reka Toth, Marlene Ganslmeier, Kersten Breuer, Christoph Plass, Pavlo Lutsik
DOI: https://doi.org/10.1101/2021.11.29.470374
2021-12-01
Abstract: DNA methylation analysis by sequencing is becoming increasingly popular, yielding methylomes at single-base-pair resolution. Its intrinsic read-level information gives it great potential for cell-type heterogeneity analysis. Although diverse deconvolution methods have been developed to infer cell-type composition from bulk sequencing-based methylomes, no systematic evaluation has yet been performed. Here, we thoroughly benchmark six previously published methods: Bayesian epiallele detection (BED), DXM, PRISM, csmFinder+coMethy, ClubCpG and MethylPurify, together with two array-based methods, MeDeCom and Houseman, as a comparison group. Sequencing-based deconvolution methods consist of two main steps, informative region selection and cell-type composition estimation, so we assessed each step individually. With this evaluation, we identify the best-performing method for different types of samples. We found that the factors influencing cell-type deconvolution performance depend on the number of cell types within the mixture. Finally, we propose a best-practice deconvolution strategy for sequencing data and outline the limitations that still need to be addressed.