Insights, opportunities and challenges provided by large cell atlases

Martin Hemberg, Federico Marini, Shila Ghazanfar, Ahmad Al Ajami, Najla Abassi, Benedict Anchang, Bérénice A. Benayoun, Yue Cao, Ken Chen, Yesid Cuesta-Astroz, Zach DeBruine, Calliope A. Dendrou, Iwijn De Vlaminck, Katharina Imkeller, Ilya Korsunsky, Alex R. Lederer, Pieter Meysman, Clint Miller, Kerry Mullan, Uwe Ohler, Nikolaos Patikas, Jonas Schuck, Jacqueline HY Siu, Timothy J. Triche Jr., Alex Tsankov, Sander W. van der Laan, Masanao Yajima, Jean Yang, Fabio Zanini, Ivana Jelic
2024-08-13
Abstract: The field of single-cell biology is growing rapidly and is generating large amounts of data from a variety of species, disease conditions, tissues, and organs. Coordinated efforts such as CZI CELLxGENE, HuBMAP, Broad Institute Single Cell Portal, and DISCO, allow researchers to access large volumes of curated datasets. Although the majority of the data is from scRNAseq experiments, a wide range of other modalities are represented as well. These resources have created an opportunity to build and expand the computational biology ecosystem to develop tools necessary for data reuse, and for extracting novel biological insights. Here, we highlight achievements made so far, areas where further development is needed, and specific challenges that need to be overcome.
Genomics
What problem does this paper attempt to address?
The paper explores the applications, opportunities, and challenges of large-scale cell atlases in single-cell biology. Specifically, it addresses the following key issues:

1. **Data Integration and Sharing**: Advances in single-cell sequencing technology have produced a large number of datasets. The paper asks how these data can be integrated effectively and shared through unified platforms so that researchers can access and analyze them more easily.
2. **Data Preprocessing and Standardization**: Data quality and consistency are crucial for constructing cell atlases. The paper discusses how to preprocess data from different sources to reduce batch effects and ensure that the data conform to standard formats.
3. **Metadata and Ontology**: Reanalysis of existing datasets requires comprehensive metadata standards and cell type ontologies. These support standardized data management and improve the reproducibility and accuracy of data analysis.
4. **Data Integration and Meta-Analysis**: The paper considers how to combine data from different studies and experimental conditions for large-scale meta-analysis that reveals new biological insights. This involves handling confounding factors and technical differences between datasets.
5. **Applications in Biomedical Research**: The ultimate goal is to use cell atlases to accelerate research in disease management and treatment. Examples include analyzing associations between specific gene loci and cell types to uncover the molecular mechanisms of complex diseases, and identifying disease-related cell states to discover potential drug targets.
6. **Integration of New Technologies**: The paper asks how artificial intelligence and other advanced technologies can be applied to single-cell data analysis to improve research efficiency and accuracy.

In summary, the paper offers a comprehensive perspective on large cell atlases in single-cell biology by discussing these issues, with the aim of promoting further development of the field.