LncRNA-Disease Association Prediction Based on Graph Neural Networks and Inductive Matrix Completion

Lin Yuan, Jing Zhao, Tao Sun, Xue-Song Jiang, Zhen-Yu Yang, Xin-Gang Wang, Yu-Shui Geng
DOI: https://doi.org/10.1007/978-3-030-60802-6_23
2020-01-01
Abstract: Emerging evidence indicates that long non-coding RNA (lncRNA) plays a crucial role in human disease. Discovering disease-gene associations is a fundamental and critical biomedical task that helps biologists and physicians uncover the complex pathogenic mechanisms underlying a phenotype. With high-throughput sequencing technology and various clinical biomarkers available to measure the similarities between lncRNAs and disease phenotypes, network-based semi-supervised learning has been commonly used in these studies to address the resulting class-imbalanced, large-scale data problem. However, most existing approaches are based on linear models and suffer from two major limitations: 1) they implicitly consider a local-structure representation for each candidate; 2) they are unable to capture nonlinear associations between lncRNAs and diseases. In this paper, we propose a new framework for the lncRNA-disease association task, named GNN-IMC, which combines a Graph Neural Network (GNN) with inductive matrix completion. With the help of the GNN, we generate subgraphs based on (lncRNA, disease) pairs from the observed association matrix and map these subgraphs to their corresponding associations. In addition, GNN-IMC is inductive: it can generalize to lncRNAs and diseases unseen during training (given that their associations exist), and can even transfer to new tasks. Empirical results demonstrate that the proposed deep learning algorithm outperforms the other state-of-the-art methods on most metrics.
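The subgraph step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the hop count, node labeling, or data layout, so everything here is an assumption for illustration: the function name `extract_enclosing_subgraph` is hypothetical, the association matrix is taken to be binary with lncRNAs as rows and diseases as columns, and only 1-hop neighbors are collected. It is not the authors' implementation.

```python
def extract_enclosing_subgraph(assoc, lnc, dis):
    """Return a 1-hop enclosing subgraph around the pair (lnc, dis).

    assoc: binary matrix as a list of lists; rows are lncRNAs and
    columns are diseases (an assumed layout, not taken from the paper).
    """
    n_lnc, n_dis = len(assoc), len(assoc[0])
    # Diseases linked to the target lncRNA (excluding the target disease).
    dis_neighbors = [j for j in range(n_dis) if assoc[lnc][j] and j != dis]
    # lncRNAs linked to the target disease (excluding the target lncRNA).
    lnc_neighbors = [i for i in range(n_lnc) if assoc[i][dis] and i != lnc]
    # Target nodes come first so a downstream model can identify them.
    lnc_nodes = [lnc] + lnc_neighbors
    dis_nodes = [dis] + dis_neighbors
    # Induced sub-matrix over the selected rows and columns.
    sub = [[assoc[i][j] for j in dis_nodes] for i in lnc_nodes]
    # Hide the target link so the label cannot be read off the subgraph.
    sub[0][0] = 0
    return lnc_nodes, dis_nodes, sub


# Toy association matrix: 3 lncRNAs x 3 diseases.
assoc = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
lnc_nodes, dis_nodes, sub = extract_enclosing_subgraph(assoc, 0, 1)
```

In a full pipeline, each such subgraph would be passed to a GNN that predicts the association for the target pair. Because the subgraph depends only on local connectivity and not on global node identities, a model trained this way can in principle be applied to nodes it did not see during training, provided they have some observed associations.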