PGCN: Disease gene prioritization by disease and gene embedding through graph convolutional neural networks
Yu Li, Hiroyuki Kuwahara, Peng Yang, Le Song, Xin Gao
DOI: https://doi.org/10.1101/532226
2019-01-28
Abstract:
Motivation: Proper prioritization of candidate genes is essential to the genome-based diagnostics of a range of genetic diseases. However, it is a highly challenging task involving limited and noisy knowledge of genes, diseases and their associations. While a number of computational methods have been developed for the disease gene prioritization task, their performance is largely limited by manually crafted features, network topology, or pre-defined rules of data fusion.
Results: Here, we propose a novel graph convolutional network-based disease gene prioritization method, PGCN, through the systematic embedding of the heterogeneous network made by genes and diseases, as well as their individual features. The embedding learning model and the association prediction model are trained together in an end-to-end manner. We compared PGCN with five state-of-the-art methods on the Online Mendelian Inheritance in Man (OMIM) dataset for tasks to recover missing associations and discover associations between novel genes and diseases. Results show significant improvements of PGCN over the existing methods. We further demonstrate that our embedding has biological meaning and can capture functional groups of genes.
Availability: The main program and the data are available at https://github.com/lykaust15/Disease_gene_prioritization_GCN.
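To make the idea of embedding a gene-disease network and scoring candidate associations concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a generic graph convolution layer with symmetric adjacency normalization and a bilinear scoring function, neither of which is specified in the abstract. The toy association matrix, feature sizes, and all function names are hypothetical. The weights are random and untrained, whereas the abstract describes training the embedding and prediction models together end to end. Only gene-disease edges are used here, which is a simplification.

```python
import numpy as np

def build_heterogeneous_adjacency(gene_disease):
    """Stack a gene x disease association matrix into one square adjacency
    over all nodes (genes first, then diseases). Simplification: no
    gene-gene or disease-disease edges are included."""
    n_genes, n_diseases = gene_disease.shape
    top = np.hstack([np.zeros((n_genes, n_genes)), gene_disease])
    bottom = np.hstack([gene_disease.T, np.zeros((n_diseases, n_diseases))])
    return np.vstack([top, bottom])

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed, generic GCN)."""
    a_hat = adj + np.eye(adj.shape[0])   # self-loops keep every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)

def score_pairs(gene_emb, disease_emb, relation):
    """Bilinear decoder (assumed): sigmoid(g^T R d) for every gene-disease pair."""
    logits = gene_emb @ relation @ disease_emb.T
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical toy data: 3 genes, 2 diseases, 4 input features per node.
rng = np.random.default_rng(0)
gene_disease = np.array([[1, 0], [1, 1], [0, 1]], dtype=float)
adj = build_heterogeneous_adjacency(gene_disease)
a_norm = normalize_adjacency(adj)

features = rng.normal(size=(5, 4))    # stand-in for per-node features
weights = rng.normal(size=(4, 3))     # untrained layer weights
relation = rng.normal(size=(3, 3))    # untrained decoder matrix

embeddings = gcn_layer(a_norm, features, weights)
gene_emb, disease_emb = embeddings[:3], embeddings[3:]
scores = score_pairs(gene_emb, disease_emb, relation)  # shape (3, 2)
```

In a prioritization setting, each column of `scores` would be sorted to rank candidate genes for one disease. With trained weights, held-out associations would be expected to rank near the top.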