A Novel Computational Complete Deconvolution Method Using RNA-seq Data

Kai Kang,Qian Meng,Igor Shats,David M. Umbach,Melissa Li,Yuanyuan Li,Xiaoling Li,Leping Li
DOI: https://doi.org/10.1101/496596
2018-01-01
Abstract:The cell type composition of many biological tissues varies widely across samples. Such sample heterogeneity hampers efforts to probe the role of each cell type in the tissue microenvironment. Current approaches that address this issue have drawbacks. Cell sorting or single-cell based experimental techniques disrupt in situ interactions and alter physiological status of cells in tissues. Computational methods are flexible and promising; but they often estimate either sample-specific proportions of each cell type or cell-type-specific gene expression profiles, not both, by requiring the other as input. We introduce a computational C omplete D econvolution method that can estimate both sample-specific proportions of each cell type and cell-type-specific gene expression profiles simultaneously using bulk RNA- Seq data only (CDSeq). We assessed our method’s performance using several synthetic and experimental mixtures of varied but known cell type composition and compared its performance to the performance of two state-of-the art deconvolution methods on the same mixtures. The results showed CDSeq can estimate both sample-specific proportions of each component cell type and cell-typespecificgene expression profiles with high accuracy. CDSeq holds promise for computationally deciphering complex mixtures of cell types, each with differing expression profiles, using RNA-seq data measured in bulk tissue (MATLAB code is available at ).
What problem does this paper attempt to address?