Fast and lightweight cell atlas approximations across organs and organisms

Ying Xu, Joanna Ahn, Fabio Zanini
DOI: https://doi.org/10.1101/2024.01.03.573994
2024-01-03
Abstract: Omic technologies at single-cell resolution are reshaping our understanding of cellular diversity. The generation of cell atlases that capture the cellular composition of an entire individual is progressing rapidly. However, the science of organising and extracting information from these atlases is still in its infancy, and for many biomedical researchers atlas exploration remains challenging. Here, we leveraged extensive experience in single-cell data analytics to pinpoint three major accessibility barriers to cell atlases, related to (i) programming skill or language, (ii) scalability, and (iii) dissemination standards. To help researchers overcome these barriers, we developed cell atlas approximations, a computational approach that enables the analysis of cell atlases across organs and organisms rapidly, at scale, and without programming skills. The web interface facilitates the exploration of cell atlases in 19 species across the tree of life through a chatbot driven by frontend natural language processing. In parallel, application programming interfaces streamline data access for computational researchers and include specialised packages for Python, R, JavaScript, and Bash. Supported queries include marker gene identification, cross-organ comparisons, cell embeddings, gene sequences, searches for similar features, and bidirectional zoom between cell types and cell states. Most queries are answered in less than 1.5 seconds thanks to lossy data compression algorithms based on cell annotations and similarity graphs. Compared to traditional cell atlas analysis, this approach can reduce data size by more than 100 times and accelerate workflows by up to 100,000 times. Atlas approximations aim to make the exploration of cell atlases accessible to anyone in the world.
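The abstract attributes the speed and size gains to lossy compression based on cell annotations. As a rough illustration of that general idea, and not the authors' actual algorithm, the sketch below collapses a cells-by-genes matrix into one summary row per annotated cell type. The function name `approximate_by_cell_type`, the two summary statistics (mean expression and fraction of expressing cells), and the synthetic data are all assumptions made for this example; the similarity-graph component mentioned in the abstract is not covered.

```python
import numpy as np


def approximate_by_cell_type(counts, labels):
    """Hypothetical helper: summarise a cells x genes matrix per cell type.

    Returns the sorted cell type names, the mean expression per type,
    and the fraction of cells per type with nonzero expression.
    """
    labels = np.asarray(labels)
    cell_types = np.unique(labels)
    avg = np.zeros((len(cell_types), counts.shape[1]))
    frac = np.zeros_like(avg)
    for i, cell_type in enumerate(cell_types):
        sub = counts[labels == cell_type]   # all cells with this annotation
        avg[i] = sub.mean(axis=0)           # mean expression per gene
        frac[i] = (sub > 0).mean(axis=0)    # share of cells expressing each gene
    return cell_types, avg, frac


# Synthetic stand-in for an atlas: 3000 cells, 50 genes, 3 annotated types.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(3000, 50)).astype(float)
labels = rng.choice(["T cell", "B cell", "fibroblast"], size=3000)

cell_types, avg, frac = approximate_by_cell_type(counts, labels)

# Ratio of stored numbers before and after compression.
compression = counts.size / (avg.size + frac.size)
print(cell_types, avg.shape, compression)
```

In this toy setting the stored values shrink from one row per cell to two rows per cell type, so the compression ratio grows with the number of cells per annotation. The information lost is the cell-to-cell variation within each type, which is why the approach is lossy.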
Cell Biology