Heterogeneous Graph based Deep Learning for Biomedical Network Link Prediction

Jinjiang Guo, Jie Li, Dawei Leng, Lurong Pan
2022-02-24
Abstract: Multi-scale biomedical knowledge networks are expanding with emerging experimental technologies that generate multi-scale biomedical big data. Link prediction is increasingly used, especially in bipartite biomedical networks, to identify hidden biological interactions and relationships between key entities such as compounds, targets, genes and diseases. We propose a Graph Neural Networks (GNN) method, namely the Graph Pair based Link Prediction model (GPLP), for predicting biomedical network links based solely on their topological interaction information. In GPLP, 1-hop subgraphs extracted from the known network interaction matrix are learnt to predict missing links. To evaluate our method, three heterogeneous biomedical networks were used, i.e. the Drug-Target Interaction network (DTI), the Compound-Protein Interaction network (CPI) from NIH Tox21, and the Compound-Virus Inhibition network (CVI). Our proposed GPLP method significantly outperforms the state-of-the-art baselines. In addition, different levels of network incompleteness are analysed with our devised protocol, and we also design an effective approach to improve the model's robustness towards incomplete networks. Our method demonstrates potential for application to other biomedical networks.
Social and Information Networks, Artificial Intelligence, Information Retrieval, Machine Learning
What problem does this paper attempt to address?
The main objective of this paper is to propose a method based on Graph Neural Networks (GNN) to predict potential or missing links in biomedical networks. Specifically, the authors developed a method called Graph Pair based Link Prediction (GPLP), which starts from the known interaction information in a biomedical network and predicts links by extracting the first-order (1-hop) neighborhood subgraphs around nodes.

The key issues addressed in the paper include:

1. **Expansion of multi-scale biomedical networks**: With the development of experimental techniques, a large amount of multi-scale biomedical big data has been generated. These data form complex networks connecting entities at multiple levels, such as drugs, targets, genes, and diseases.
2. **Application of link prediction in biomedical networks**: Link prediction is widely used to identify hidden biological interactions and relationships, especially in bipartite biomedical networks such as drug-target interaction networks, compound-protein interaction networks, and compound-virus inhibition networks.
3. **Utilizing Graph Neural Networks for link prediction**: The paper proposes the GPLP method, which extracts first-order subgraphs from the known network interaction matrix and uses these subgraphs to predict missing links. The method does not require additional node attribute information and relies solely on the network's topological structure.

In experiments on three different biomedical network datasets, the GPLP method significantly outperforms existing baseline methods. In addition, the authors analyzed scenarios with different degrees of network incompleteness and designed an effective strategy to improve the model's robustness to incomplete networks.

In summary, this study provides a new perspective for revealing potential interaction mechanisms in complex biomedical systems and demonstrates its potential application value in real-world biomedical scenarios.
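
To make the idea of first-order subgraph extraction concrete, the following is a minimal sketch of how a 1-hop neighborhood around a candidate link might be cut out of a bipartite interaction matrix. It is not the authors' implementation: the function name `one_hop_subgraph`, the toy matrix, and the step that hides the candidate link before extraction are all assumptions made for illustration, and the GNN that would consume the subgraph is not shown.

```python
import numpy as np


def one_hop_subgraph(adj, row, col):
    """Extract the 1-hop neighborhood of the candidate link (row, col).

    `adj` is a binary bipartite interaction matrix (e.g. compounds x targets).
    Hypothetical helper for illustration only, not the GPLP code.
    """
    masked = np.array(adj, copy=True)
    # Assumption: hide the candidate link so its label cannot leak
    # into the structure the model would see.
    masked[row, col] = 0

    # Column-side nodes directly linked to the row node.
    col_neighbors = np.flatnonzero(masked[row, :])
    # Row-side nodes directly linked to the column node.
    row_neighbors = np.flatnonzero(masked[:, col])

    # Keep the two endpoint nodes plus their direct neighbors.
    rows = np.union1d([row], row_neighbors)
    cols = np.union1d([col], col_neighbors)

    # Induced bipartite subgraph over the selected nodes.
    sub = masked[np.ix_(rows, cols)]
    return rows, cols, sub


if __name__ == "__main__":
    # Toy 3 x 3 interaction matrix (made-up data).
    toy = np.array([[1, 1, 0],
                    [0, 1, 0],
                    [0, 0, 1]])
    rows, cols, sub = one_hop_subgraph(toy, row=0, col=1)
    print(rows, cols)
    print(sub)
```

Because the sketch uses only the interaction matrix, it mirrors the summary's point that the method needs topology alone and no node attributes. How the extracted subgraphs are paired, labeled, and fed to the network is described in the paper itself and is left out here.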