Cell type-specific predictive models for prioritizing genes and gene sets associated with autism spectrum disorder

Jinting Guan, Yang Wang, Yiping Lin, Qingyang Yin, Yibo Zhuang, Guoli Ji
DOI: https://doi.org/10.21203/rs.3.rs-51727/v1
2020-01-01
Abstract:
Background: Autism spectrum disorder (ASD) is characterized by substantial phenotypic and genetic heterogeneity. Although bulk transcriptomic analyses have revealed convergence of disease pathology on common pathways, the brain cell type-specific molecular pathology of ASD remains to be studied. Different gene functions may be dysregulated, and causal genes may differ, among brain cell types in ASD. Machine learning studies based on gene expression profiling can support the diagnosis of ASD, prioritize high-confidence candidate genes, and promote the design of effective interventions.
Methods: To characterize the cell type heterogeneity of ASD and to exploit the potential of gene expression signatures as diagnostic biomarkers, we construct several kinds of classification models for ASD based on recently available human brain nucleus gene expression data from ASD cases and controls. First, we construct cell type-specific predictive models based on individual genes to screen for cell type-specific genes associated with ASD. Second, at the gene set level, we construct cell type-specific gene set-based predictive models to screen for cell type-specific gene sets associated with ASD. Both kinds of model can be applied to predict the diagnosis of a given nucleus whose cell type is known. Finally, we construct a multi-label predictive model that predicts the cell type and the diagnosis of a given nucleus at the same time.
Results: The functions of genes with predictive power for ASD are not consistent across cell types, and the most important genes differ among cell types, demonstrating the cell type heterogeneity of ASD. Our findings suggest that layer 2/3 and layer 4 excitatory neurons, layer 5/6 cortico-cortical projection neurons, parvalbumin interneurons, and protoplasmic astrocytes are preferentially affected in ASD. The genes BCYRN1 and CCK are prioritized in excitatory neurons, and HSPA1A is notable in protoplasmic astrocytes.
Limitations: Our study used machine learning methods to identify biomarkers of ASD; the results would be more convincing if they were validated by subsequent experiments.
Conclusions: The results show that it may be feasible to use single cell/nucleus gene expression for ASD detection, and the constructed predictive models may aid the diagnosis of ASD. Our analytical pipeline prioritizes ASD-associated cell type-specific genes and gene sets, which may serve as potential biomarkers of ASD.
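The first step described in the Methods, screening individual genes within each cell type, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the classifier or the scoring rule, so a rank-based AUC per gene is used here as a stand-in measure of predictive power. The data are synthetic, and the names `auc`, `rank_genes_by_cell_type`, `make_synthetic_nuclei`, `type_1`, `type_2`, and `gene_a` to `gene_c` are all hypothetical.

```python
import random
from collections import defaultdict


def auc(scores, labels):
    """Rank-based AUC: probability that an ASD nucleus (label 1) scores
    higher than a control nucleus (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_genes_by_cell_type(nuclei, genes):
    """Score each gene separately within each cell type.

    nuclei: list of dicts with keys 'cell_type', 'asd' (0 or 1) and
            'expr' (mapping gene name -> expression value).
    Returns {cell_type: [(gene, score), ...]} sorted by descending score.
    """
    by_type = defaultdict(list)
    for nucleus in nuclei:
        by_type[nucleus["cell_type"]].append(nucleus)

    ranking = {}
    for cell_type, group in by_type.items():
        labels = [n["asd"] for n in group]
        scored = []
        for gene in genes:
            a = auc([n["expr"][gene] for n in group], labels)
            # Direction-agnostic: up- and down-regulation both count.
            scored.append((gene, max(a, 1.0 - a)))
        ranking[cell_type] = sorted(scored, key=lambda t: t[1], reverse=True)
    return ranking


def make_synthetic_nuclei(seed=0, per_group=50):
    """Synthetic data (assumption, for illustration only): gene_a is shifted
    in ASD nuclei of type_1 only, gene_b in ASD nuclei of type_2 only."""
    rng = random.Random(seed)
    genes = ["gene_a", "gene_b", "gene_c"]
    shifted = {"type_1": "gene_a", "type_2": "gene_b"}
    nuclei = []
    for cell_type, gene in shifted.items():
        for asd in (0, 1):
            for _ in range(per_group):
                expr = {g: rng.gauss(0.0, 1.0) for g in genes}
                if asd:
                    expr[gene] += 3.0
                nuclei.append({"cell_type": cell_type, "asd": asd, "expr": expr})
    return nuclei, genes


if __name__ == "__main__":
    nuclei, genes = make_synthetic_nuclei()
    for cell_type, scored in rank_genes_by_cell_type(nuclei, genes).items():
        print(cell_type, [(g, round(s, 2)) for g, s in scored])
```

Scoring each gene within one cell type at a time is what lets the top-ranked genes differ between cell types, which is the kind of heterogeneity the Results describe. A full analysis would replace the per-gene AUC with a trained classifier evaluated on held-out nuclei.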