Graph Node Classification to Predict Autism Risk in Genes

Danushka Bandara,Kyle Riccardi
DOI: https://doi.org/10.3390/genes15040447
IF: 4.141
2024-04-02
Genes
Abstract:This study explores the genetic risk associations with autism spectrum disorder (ASD) using graph neural networks (GNNs), leveraging the Sfari dataset and protein interaction network (PIN) data. We built a gene network with genes as nodes, chromosome band location as node features, and gene interactions as edges. Graph models were employed to classify the autism risk associated with newly introduced genes (test set). Three classification tasks were undertaken to test the ability of our models: binary risk association, multi-class risk association, and syndromic gene association. We tested graph convolutional networks, Graph Sage, graph transformer, and Multi-Layer Perceptron (Baseline) architectures on this problem. The Graph Sage model consistently outperformed the other models, showcasing its utility in classifying ASD-related genes. Our ablation studies show that the chromosome band location and protein interactions contain useful information for this problem. The models achieved 85.80% accuracy on the binary risk classification, 81.68% accuracy on the multi-class risk classification, and 90.22% on the syndromic classification.
genetics & heredity
What problem does this paper attempt to address?