Prioritizing Autism Risk Genes Using Personalized Graphical Models Estimated From Single-Cell RNA-seq Data

Jianyu Liu,Haodong Wang,Wei Sun,Yufeng Liu
DOI: https://doi.org/10.1080/01621459.2021.1933495
IF: 4.369
2021-07-21
Journal of the American Statistical Association
Abstract:Hundreds of autism risk genes have been reported recently, mainly based on genetic studies where these risk genes have more de novo mutations in autism subjects than healthy controls. However, as a complex disease, autism is likely associated with more risk genes and many of them may not be identifiable through de novo mutations. We hypothesize that more autism risk genes can be identified through their connections with known autism risk genes in personalized gene–gene interaction graphs. We estimate such personalized graphs using single-cell RNA sequencing (scRNA-seq) while appropriately modeling the cell dependence and possible zero-inflation in the scRNA-seq data. The sample size, which is the number of cells per individual, ranges from 891 to 1241 in our case study using scRNA-seq data in autism subjects and controls. We consider 1500 genes in our analysis. Since the number of genes is larger or comparable to the sample size, we perform penalized estimation. We score each gene's relevance by applying a simple graph kernel smoothing method to each personalized graph. The molecular functions of the top-scored genes are related to autism diseases. For example, a candidate gene RYR2 that encodes protein ryanodine receptor 2 is involved in neurotransmission, a process that is impaired in ASD patients. While our method provides a systemic and unbiased approach to prioritize autism risk genes, the relevance of these genes needs to be further validated in functional studies. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
statistics & probability
What problem does this paper attempt to address?