CellCover Captures Neural Stem Cell Progression in Mammalian Neocortical Development

Lanlan Ji,An Wang,Shreyash Sonthalia,Daniel Q Naiman,Laurent Younes,Carlo Colantuoni,Donald Geman
DOI: https://doi.org/10.1101/2023.04.06.535943
2024-08-18
Abstract:Definition of cell classes across the tissues of living organisms is central in the analysis of growing atlases of single-cell RNA sequencing (scRNA-seq) data across biomedicine. Marker genes for cell classes are most often defined by differential expression (DE) methods that serially assess individual genes across landscapes of diverse cells. This serial approach has been extremely useful, but is limited because it ignores possible redundancy or complementarity across genes that can only be captured by analyzing multiple genes simultaneously. We aim to identify discriminating panels of genes. To efficiently explore the vast space of possible marker panels, leverage the large number of cells often sequenced, and overcome zero-inflation in scRNA-seq data, we propose viewing gene panel selection as a variation of the "minimal set-covering problem" in combinatorial optimization. We show that this new method, CellCover, captures cell-class-specific signals in the developing mouse neocortex that are distinct from those defined by DE methods. Transfer learning experiments across mouse, primate, and human data demonstrate that CellCover identifies markers of conserved cell classes in neurogenesis, as well as temporal progression in both progenitors and neurons. Exploring markers of human outer radial glia (oRG, or basal RG) across mammals, we show that transcriptomic elements of this key cell type in the expansion of the human cortex appeared in gliogenic precursors of the rodent before the full program emerged in the primate lineage. We have assembled the public datasets we use in this report at NeMO analytics where the expression of individual genes {NeMO Individual Genes} and marker gene panels can be freely explored {NeMO: Telley 3 Sets Covering Panels}, {NeMO: Telley 12 Sets Covering Panels}, and {NeMO: Sorted Brain Cell Covering Panels}. CellCover is available in {CellCover R} and {CellCover Python}.
Genomics
What problem does this paper attempt to address?