Enhancing Embedding Representations of Biomedical Data using Logic Knowledge

Michelangelo Diligenti, Francesco Giannini, Stefano Fioravanti, Caterina Graziani, Moreno Falaschi, Giuseppe Marra
2023-03-23
Abstract: Knowledge Graph Embeddings (KGE) have become a popular class of models specifically devised to deal with ontologies and graph-structured data, as they can implicitly encode statistical dependencies between entities and relations in a latent space. KGE techniques are particularly effective in the biomedical domain, where it is common to deal with large knowledge graphs describing complex interactions between biological and chemical objects. The PharmKG dataset has recently been proposed as one of the most challenging biomedical knowledge graph benchmarks, with hundreds of thousands of relational facts between genes, diseases and chemicals. Although KGEs can scale to very large relational domains, they generally fail at representing more complex relational dependencies between facts, such as logic rules, which may be fundamental in complex experimental settings. In this paper, we exploit logic rules to enhance the embedding representations of KGEs on the PharmKG dataset. To this end, we adopt the Relational Reasoning Network (R2N), a recently proposed neural-symbolic approach showing promising results on knowledge graph completion tasks. An R2N uses the available logic rules to build a neural architecture that reasons over KGE latent representations. In the experiments, we show that our approach significantly improves on the current state of the art on the PharmKG dataset. Finally, we provide an ablation study that experimentally compares the effect of alternative sets of rules, chosen according to different selection criteria and with a varying number of rules.
Artificial Intelligence, Machine Learning, Quantitative Methods
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to address the following issues:

1. **Enhancing Embedding Representations**: Utilizing logical rules to enhance the representation capabilities of Knowledge Graph Embeddings (KGE) in large-scale biomedical knowledge graphs. Specifically, by applying Relational Reasoning Networks (R2N) on the PharmKG dataset, the paper aims to improve existing methods.
2. **Handling Complex Dependencies**: While existing KGE methods can handle very large relational domains, they fall short in representing complex dependencies between facts (such as first-order logical rules). The paper attempts to overcome this limitation by incorporating logical rules.
3. **Improving Prediction Performance**: Significantly improving the prediction performance of the current best methods on large-scale biomedical knowledge graphs (PharmKG) through R2N, and validating the effects of different rule sets through ablation studies.
4. **Automatic Rule Mining**: Demonstrating how to utilize automatic rule mining techniques to enrich the knowledge graph completion task and evaluating the performance of R2N under different rule mining settings.
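
To make the idea of combining embedding scores with a logic rule more concrete, here is a minimal Python sketch. It is not the R2N architecture and does not reproduce anything from the paper. It assumes a DistMult-style scoring function, a sigmoid to turn scores into soft truth values, and the Reichenbach fuzzy implication to measure how well one grounded rule is satisfied. The entity names, relation names, and the rule itself are hypothetical, and the embeddings are random stand-ins for trained ones.

```python
import math
import random


def make_embeddings(names, dim, rng):
    """Assign each name a random vector (a stand-in for trained embeddings)."""
    return {n: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for n in names}


def distmult_score(h, r, t):
    """DistMult-style triple score: sum_i h_i * r_i * t_i (an assumed KGE scorer)."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def truth(ent, rel, head, relation, tail):
    """Squash the raw score into (0, 1) so it can be read as a soft truth value."""
    s = distmult_score(ent[head], rel[relation], ent[tail])
    return 1.0 / (1.0 + math.exp(-s))


def rule_satisfaction(body, head):
    """Reichenbach implication: soft truth of (body -> head) is 1 - body + body * head."""
    return 1.0 - body + body * head


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical entities and relations, chosen only for illustration.
ent = make_embeddings(["chem_a", "gene_b", "disease_c"], dim=8, rng=rng)
rel = make_embeddings(["interacts", "associated", "treats"], dim=8, rng=rng)

# Hypothetical grounded rule:
#   interacts(chem_a, gene_b) AND associated(gene_b, disease_c) -> treats(chem_a, disease_c)
body = truth(ent, rel, "chem_a", "interacts", "gene_b") * truth(
    ent, rel, "gene_b", "associated", "disease_c"
)  # product t-norm used for the conjunction
head = truth(ent, rel, "chem_a", "treats", "disease_c")
sat = rule_satisfaction(body, head)

print(f"body={body:.3f} head={head:.3f} rule satisfaction={sat:.3f}")
```

In a training setup, a quantity like `1 - sat` could be added to the loss as a penalty for violated rules. According to the abstract, R2N instead uses the rules to build a neural architecture over the latent representations, so this sketch only illustrates the general notion of scoring facts and rules together.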