A unified model for cell-type resolution genomics from heterogeneous omics data

Zeyuan Johnson Chen,Elior Rahmani,Eran Halperin
DOI: https://doi.org/10.1101/2024.01.27.577588
2024-04-06
Abstract:The vast majority of population-scale genomic datasets collected to date consist of “bulk” samples obtained from heterogeneous tissues, reflecting mixtures of different cell types. In order to facilitate discovery at the cell-type level, there is a pressing need for computational deconvolution methods capable of leveraging the multitude of underutilized bulk profiles already collected across various organisms, tissues, and conditions. Here, we introduce Unico, a unified cross-omics method designed to deconvolve standard 2-dimensional bulk matrices of samples by features into 3-dimensional tensors representing samples by features by cell types. Unico stands out as the first principled model-based deconvolution method that is theoretically justified for any heterogeneous genomic data. Through the deconvolution of bulk gene expression and DNA methylation datasets, we demonstrate that the transferability of Unico across different data modalities translates into superior performance compared to existing approaches. This advancement enhances our capability to conduct powerful large-scale genomic studies at cell-type resolution without the need for cell sorting or single-cell biology. An R implementation of Unico is available on CRAN.
Genomics
What problem does this paper attempt to address?