A General Framework for Representing and Annotating Multifaceted Cell Heterogeneity in Human Cell Atlas

Haoxiang Gao, Kui Hua, Sijie Chen, Qijin Yin, Rui Jiang, Xuegong Zhang
DOI: https://doi.org/10.1101/2021.09.09.459281
2021-01-01
Abstract: The goal of large projects such as the Human Cell Atlas (HCA) and the Human BioMolecular Atlas Program (HuBMAP) is to build maps that comprehensively define and describe all cell types and their molecular features in a healthy human being. Just as geographical maps must have coordinates, a key task in building cell maps is to provide coordinate systems for cells. A well-designed coordinate system helps us better understand the highly orchestrated function and organization of different cells. Cells can be described by external information, such as their spatial location in the body and organ and the sex and race of the donor, and by multiple endogenous attributes, such as their types, states, functions, and developmental trajectories. These heterogeneities are encoded in, or can be predicted from, transcriptomics and other omics data. Cell heterogeneities are multifaceted and fall into three major types: continuous values or scores, categorical groups, and structured annotations. Here we propose a unified multidimensional coordinate system, UniCoord, to represent the multifaceted heterogeneities of cells. It is based on a general deep learning framework with a supervised VAE structure that learns the mapping between gene expression and generated coordinates in a low-dimensional space, where the coordinates encode multiple cell attributes of the three types. Experimental results on several datasets showed that UniCoord was able to represent a variety of heterogeneous cell properties that are discrete, continuous, or hierarchically structured. The trained UniCoord model can be used to automatically label attributes of cells and to generate the corresponding expression data. These experiments showed that UniCoord is a feasible coordinate framework for representing multifaceted cell heterogeneity in comprehensive cell atlases.
Competing Interest Statement: The authors have declared no competing interest.
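The supervised VAE idea in the abstract can be illustrated with a minimal sketch of a per-cell loss. This is not the authors' implementation: the latent layout (first dimension for a continuous score, the next `n_types` dimensions as logits for a categorical label), the function names, and the weights `w_kl` and `w_sup` are all assumptions made for illustration. Structured (hierarchical) annotations are omitted.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_vae_loss(x, x_recon, mu, logvar, score, cell_type, n_types,
                        w_kl=1.0, w_sup=1.0):
    """Loss terms for one cell under a hypothetical latent layout (assumption):
    mu[0] is a continuous attribute, mu[1:1 + n_types] are class logits,
    and any remaining dimensions are unsupervised."""
    # Reconstruction: mean squared error between input and decoded expression.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)
    # Regularization: keep the approximate posterior close to N(0, I).
    kl = kl_to_standard_normal(mu, logvar)
    # Supervision on a continuous coordinate (e.g. a score attached to the cell).
    continuous = (mu[0] - score) ** 2
    # Supervision on categorical coordinates: cross-entropy against the label.
    probs = softmax(mu[1:1 + n_types])
    categorical = -math.log(probs[cell_type])
    total = recon + w_kl * kl + w_sup * (continuous + categorical)
    return {"recon": recon, "kl": kl, "continuous": continuous,
            "categorical": categorical, "total": total}
```

In this sketch the supervised terms tie specific latent dimensions to known attributes, which is what would let a trained encoder label new cells and a decoder generate expression from chosen coordinates. How the actual model allocates and weights its dimensions is not stated in the abstract.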