transcriptome dissection of neocortical excitatory neurogenesis via joint matrix decomposition and transfer learning

Shreyash Sonthalia, Guangyan Li, Xoel Mato Blanco, Alex Casella, Jinrui Liu, Genevieve Stein-O’Brien, Brian Caffo, Ricky S. Adkins, Joshua Orvis, Ronna Hertzano, Anup Mahurkar, Jesse Gillis, Jonathan Werner, Shaojie Ma, Nicola Micali, Nenad Sestan, Pasko Rakic, Gabriel Santpere, Seth A. Ament, Carlo Colantuoni
DOI: https://doi.org/10.1101/2024.02.26.581612
2024-02-28
Abstract: The rising quality and amount of multi-omic data across biomedical science demand that we build innovative solutions to harness their collective discovery potential. From publicly available repositories, we have assembled and curated a compendium of gene-level transcriptomic data focused on mammalian excitatory neurogenesis in the neocortex. This collection is open for exploration by both computational and cell biologists, and this report forms a demonstration of its utility. Applying our novel structured joint decomposition approach to mouse, macaque and human data from the collection, we define transcriptome dynamics that are conserved across mammalian excitatory neurogenesis and which map onto the genetics of human brain structure and disease. Leveraging additional data within NeMO Analytics via projection methods, we chart the dynamics of these fundamental molecular elements of neurogenesis across developmental time and space and into postnatal life. Reversing the direction of our investigation, we use transcriptomic data from laminar-specific dissection of adult human neocortex to define molecular signatures specific to excitatory neuronal cell types resident in individual layers of the mature neocortex, and trace their emergence across development. We show that while many lineage-defining transcription factors are most highly expressed at early fetal ages, the laminar neuronal identities which they drive take years to decades to reach full maturity. Finally, we interrogated data from stem cell-derived cerebral organoid systems, demonstrating that many fundamental elements of in vivo development are recapitulated with high fidelity in vitro, while specific transcriptomic programs in neuronal maturation are absent. We propose these analyses as specific applications of the general approach of combining joint decomposition with large curated collections of analysis-ready multi-omics data matrices focused on particular cell and disease contexts. Importantly, these open environments are accessible to, and must be fueled with emerging data by, cell biologists with and without coding expertise.
Neuroscience
What problem does this paper attempt to address?
This paper addresses how to make effective use of the growing quality and quantity of multi-omics data in biomedical research, specifically to explore the transcriptome dynamics of neocortical excitatory neurogenesis in mammals. The authors assembled and curated a compendium of gene-level transcriptomic data on this process from public repositories, then analyzed it with joint matrix decomposition and transfer learning (projection). The goals of the study are:

1. **Define transcriptome dynamics conserved across species**: Apply structured joint decomposition to mouse, macaque, and human data to define the transcriptome dynamics conserved across mammalian excitatory neurogenesis, and map them onto the genetics of human brain structure and disease.
2. **Chart the dynamics of the molecular elements of neurogenesis**: Project additional datasets from NeMO Analytics onto these conserved elements to follow them across developmental time and space and into postnatal life.
3. **Trace the emergence of layer-specific neuronal identities**: Use transcriptomic data from laminar-specific dissection of adult human neocortex to define molecular signatures specific to the excitatory neuronal cell types in each layer of the mature neocortex, and trace the emergence of these signatures across development.
4. **Evaluate the fidelity of in vitro models**: Analyze data from stem cell-derived cerebral organoid systems to show that many fundamental elements of in vivo development are recapitulated with high fidelity in vitro, while specific transcriptomic programs of neuronal maturation are absent.
In summary, the paper shows how combining joint matrix decomposition with large curated collections of multi-omics data can be used to study the molecular mechanisms of neocortical excitatory neurogenesis in mammals, their conservation across species, and their relationship to the genetics of human brain structure and disease. This deepens understanding of the neurogenic process and, by linking conserved transcriptome dynamics to human genetics, may inform future work on related brain disorders.
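
The two-step workflow summarized above (joint decomposition across species, then projection of additional data) can be illustrated with a minimal sketch. This is not the authors' structured joint decomposition or their projection software. As a simple stand-in, it runs a one-component PCA, via power iteration, on gene-centered datasets concatenated across samples, and then projects a new dataset onto the resulting gene loading. All function names and the toy `mouse`, `human`, and `organoid` matrices are hypothetical. The sketch assumes every matrix has the same genes as rows, in the same order.

```python
import math

def center_genes(matrix):
    """Subtract each gene's mean across samples (rows = genes, columns = samples)."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([value - mean for value in row])
    return centered

def concatenate_samples(datasets):
    """Join genes x samples matrices side by side over a shared gene order."""
    n_genes = len(datasets[0])
    if any(len(d) != n_genes for d in datasets):
        raise ValueError("all datasets must share the same gene rows")
    return [[v for d in datasets for v in d[g]] for g in range(n_genes)]

def top_gene_loading(matrix, n_iter=100):
    """Leading gene-loading vector (first left singular vector) by power iteration."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    # Non-uniform start vector; a start orthogonal to the leading component would fail.
    u = [float(g + 1) for g in range(n_genes)]
    for _ in range(n_iter):
        # Sample scores s = X^T u, then updated loading u = X s, then renormalize.
        s = [sum(matrix[g][j] * u[g] for g in range(n_genes)) for j in range(n_samples)]
        u = [sum(matrix[g][j] * s[j] for j in range(n_samples)) for g in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:
            raise ValueError("no variance along the start vector")
        u = [x / norm for x in u]
    return u

def project(matrix, loading):
    """Transfer step: score each sample of a new dataset against a learned gene loading."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    return [sum(loading[g] * matrix[g][j] for g in range(n_genes)) for j in range(n_samples)]

# Toy data (hypothetical): rows are 4 shared genes, columns are samples ordered early to late.
# Genes 0-1 behave like progenitor markers, genes 2-3 like neuronal markers.
mouse = [[9, 5, 1], [8, 4, 0], [1, 5, 9], [0, 4, 8]]
human = [[10, 7, 4, 1], [9, 6, 3, 0], [0, 3, 6, 9], [1, 4, 7, 10]]
organoid = [[8, 6, 5], [7, 5, 4], [2, 4, 5], [1, 3, 4]]

# Joint step: center each dataset separately to remove dataset-specific offsets,
# then learn one gene loading shared by both species.
joint = concatenate_samples([center_genes(mouse), center_genes(human)])
loading = top_gene_loading(joint)

# Projection step: place the held-out dataset on the shared axis.
scores = project(center_genes(organoid), loading)
```

In this toy case the loading separates the two gene groups, and the projected `scores` increase from the earliest to the latest organoid sample. The sign of a component is arbitrary, so in practice the direction of the axis has to be interpreted from known marker genes.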