Knowledge Graph Embeddings in the Biomedical Domain: Are They Useful? A Look at Link Prediction, Rule Learning, and Downstream Polypharmacy Tasks

Aryo Pradipta Gema, Dominik Grabarczyk, Wolf De Wulf, Piyush Borole, Javier Antonio Alfaro, Pasquale Minervini, Antonio Vergari, Ajitha Rajan

2023-08-31

Abstract: Knowledge graphs are powerful tools for representing and organising complex biomedical data. Several knowledge graph embedding algorithms have been proposed to learn from and complete knowledge graphs. However, a recent study demonstrates the limited efficacy of these embedding algorithms when applied to biomedical knowledge graphs, raising the question of whether knowledge graph embeddings have limitations in biomedical settings. This study aims to apply state-of-the-art knowledge graph embedding models in the context of a recent biomedical knowledge graph, BioKG, and evaluate their performance and potential downstream uses. We achieve a three-fold improvement in terms of performance based on the HITS@10 score over previous work on the same biomedical knowledge graph. Additionally, we provide interpretable predictions through a rule-based method. We demonstrate that knowledge graph embedding models are applicable in practice by evaluating the best-performing model on four tasks that represent real-life polypharmacy situations. Results suggest that knowledge learnt from large biomedical knowledge graphs can be transferred to such downstream use cases. Our code is available at https://github.com/aryopg/biokge.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper explores the effectiveness and application potential of Knowledge Graph Embeddings (KGE) in the biomedical field. Specifically, it focuses on the following aspects:

1. **Evaluating the performance of KGE in biomedical knowledge graphs**:
   - The authors use a recent biomedical knowledge graph, BioKG, to evaluate the performance of various state-of-the-art KGE models on link prediction.
   - By comparing with previous results on the same graph, the paper shows how much these models improve.
2. **Exploring the application of KGE in downstream tasks**:
   - The study investigates whether pre-trained KGE models can be transferred to four practical polypharmacy tasks, testing the feasibility and effectiveness of KGE models in real-world applications.
3. **Providing interpretable predictions**:
   - A rule learning method (AnyBURL) is used to provide interpretable predictions, which is particularly important in the biomedical field.

### Main Contributions

1. **Performance Improvement**:
   - The best KGE model (ComplEx) achieved large improvements in HITS@10 and Mean Reciprocal Rank (MRR) compared to previous work. For example, the HITS@10 of ComplEx increased from 0.286 to 0.793.
2. **Interpretability of Rule Learning**:
   - AnyBURL not only achieved a competitive HITS@10 score (0.677) but also provided interpretable rules that help explain the predictions.
3. **Application in Downstream Tasks**:
   - The pre-trained KGE models performed well on the four polypharmacy tasks, supporting the feasibility of the transfer learning paradigm. For tasks with less data in particular (such as DPI-FDA), the pre-trained models significantly improved performance and training efficiency.
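The HITS@10 and MRR figures quoted above are both computed from the rank that a model assigns to the correct entity for each test triple. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function names and the example ranks are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose true entity is ranked in the top k (1-based ranks)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test triples (1-based ranks)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true entity for five test triples (illustrative only,
# not taken from the paper).
example_ranks = [1, 3, 12, 2, 50]

example_hits = hits_at_k(example_ranks, k=10)   # three of the five ranks are <= 10
example_mrr = mean_reciprocal_rank(example_ranks)

# Ratio between the two ComplEx HITS@10 values quoted in the summary.
improvement_ratio = 0.793 / 0.286
```

The ratio comes out at roughly 2.8, which is consistent with the "three-fold improvement" described in the abstract.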
### Conclusion

Through comprehensive evaluation and experiments, the paper demonstrates the effectiveness and potential application value of KGE models in biomedical knowledge graphs. Particularly in link prediction and downstream tasks, the pre-trained KGE models showed significant advantages, providing new tools and methods for biomedical research.
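To make the link prediction setting concrete, the sketch below scores triples with the ComplEx scoring function, the real part of the sum over dimensions of head × relation × conjugate(tail), and ranks candidate tail entities by that score. It is an illustration under stated assumptions rather than the paper's implementation: the two-dimensional embeddings are hand-written toy values, not trained on BioKG, and the entity names are hypothetical.

```python
from typing import Dict, List

ComplexVector = List[complex]


def complex_score(head: ComplexVector, relation: ComplexVector, tail: ComplexVector) -> float:
    """ComplEx score: real part of sum_i head_i * relation_i * conj(tail_i)."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must have the same dimension")
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real


def rank_tails(head: ComplexVector, relation: ComplexVector,
               candidates: Dict[str, ComplexVector]) -> List[str]:
    """Return candidate tail names sorted from highest to lowest ComplEx score."""
    return sorted(
        candidates,
        key=lambda name: complex_score(head, relation, candidates[name]),
        reverse=True,
    )


# Toy embeddings (assumed values for illustration only).
head = [1 + 0j, 0 + 1j]
relation = [1 + 0j, 1 + 0j]
candidates = {
    "entity_a": [1 + 0j, 0 + 1j],
    "entity_b": [0 + 1j, 1 + 0j],
    "entity_c": [-1 + 0j, 0 - 1j],
}

ranked = rank_tails(head, relation, candidates)
```

In an evaluation, the position of the true tail in a ranking like `ranked` is the rank that feeds into HITS@10 and MRR.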
