Rediscovering publicly available single-cell data with the DISCO platform

Mengwei Li, Kok Siong Ang, Brian Teo, Uddamvathanak Rom, Minh N Nguyen, Sebastian Maurer-Stroh, Jinmiao Chen
DOI: https://doi.org/10.1093/nar/gkae1108
2024-11-13
Abstract: Single-cell RNA sequencing (scRNA-seq) has emerged as the key technique for studying transcriptomics at the single-cell level. In our previous work, we presented the DISCO database (https://www.immunesinglecell.org/), which integrates publicly available human scRNA-seq data. We now introduce an enhanced version of DISCO, which has expanded fourfold to include >100 million cells from >17 thousand samples. It provides uniformly realigned read count tables, curated metadata, integrated tissue- and phenotype-specific atlases, and harmonized cell type annotations. It also hosts a single-cell enhanced knowledgebase of cell type ontology and gene signatures relating to cell types and phenotypes. Lastly, it offers a suite of tools for data retrieval, integration, annotation, and mapping, allowing users to construct customized atlases and perform integrated analysis with their own data. These tools are also available in a standalone R package for offline analysis.