From Primes to Paths: Enabling Fast Multi-Relational Graph Analysis

Konstantinos Bougiatiotis, Georgios Paliouras
2024-11-18
Abstract: Multi-relational networks capture intricate relationships in data and have diverse applications across fields such as biomedical, financial, and social sciences. As networks derived from increasingly large datasets become more common, identifying efficient methods for representing and analyzing them becomes crucial. This work extends the Prime Adjacency Matrices (PAMs) framework, which employs prime numbers to represent distinct relations within a network uniquely. This enables a compact representation of a complete multi-relational graph using a single adjacency matrix, which, in turn, facilitates quick computation of multi-hop adjacency matrices. In this work, we enhance the framework by introducing a lossless algorithm for calculating the multi-hop matrices and propose the Bag of Paths (BoP) representation, a versatile feature extraction methodology for various graph analytics tasks, at the node, edge, and graph level. We demonstrate the efficiency of the framework across various tasks and datasets, showing that simple BoP-based models perform comparably to or better than commonly used neural models while offering improved speed and interpretability.
Machine Learning, Social and Information Networks
What problem does this paper attempt to address?
This paper addresses the efficient representation and analysis of multi-relational graphs, especially those derived from large-scale datasets. Specifically, the authors propose a framework that represents a multi-relational graph compactly and losslessly and computes multi-hop adjacency matrices quickly. The main problems and solutions are summarized below.

### Research Background and Problems

As networks derived from large-scale datasets become more common, representing and analyzing these complex multi-relational networks efficiently has become a key issue. Traditional multi-relational network representation methods usually focus only on the direct relationships between entities and ignore the rich multi-hop connection information in the graph. This is a significant limitation in fields such as explainable artificial intelligence and molecular chemistry, because the paths between entities can reveal the nature of their relationships and the role each entity plays in the graph.

### Proposed Solutions

To address these problems, the authors extend the Prime Adjacency Matrices (PAMs) framework. The main contributions are:

1. **Lossless Algorithm**: A lossless algorithm for calculating multi-hop matrices, so that no information is lost when multi-hop paths are extracted.
2. **Bag of Paths (BoP)**: A feature extraction method that generates interpretable feature vectors at the node, edge, and graph level.
3. **Efficient Implementation**: An implementation optimized with GraphBLAS, which improves computational efficiency, released as a Python module for further experiments.
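
The PAM representation that these contributions build on can be illustrated with a small sketch. It is not the authors' implementation: the toy triples and the helper names `first_primes`, `build_pam`, and `relations_in` are hypothetical, and combining parallel relations between the same node pair by multiplying their primes is an assumption made here so that each cell can be factorized back into its relations.

```python
# Hypothetical toy graph given as (head, relation, tail) triples.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "works_with", "bob"),
    ("bob", "knows", "carol"),
]


def first_primes(n):
    """Return the first n primes by trial division (fine for a few relations)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def build_pam(triples):
    """Build one dense adjacency matrix whose cells encode relations as primes."""
    nodes = sorted({x for h, _, t in triples for x in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    node_idx = {n: i for i, n in enumerate(nodes)}
    rel_prime = dict(zip(relations, first_primes(len(relations))))
    pam = [[0] * len(nodes) for _ in nodes]
    # set() drops duplicate triples so no prime is multiplied in twice.
    for h, r, t in set(triples):
        i, j = node_idx[h], node_idx[t]
        # Assumption: parallel relations are combined as a product of primes.
        pam[i][j] = (pam[i][j] or 1) * rel_prime[r]
    return pam, node_idx, rel_prime


def relations_in(value, rel_prime):
    """Recover the relations stored in one cell by testing divisibility."""
    return sorted(r for r, p in rel_prime.items() if value and value % p == 0)


pam, node_idx, rel_prime = build_pam(triples)
cell = pam[node_idx["alice"]][node_idx["bob"]]
print(cell, relations_in(cell, rel_prime))
```

A real graph would use a sparse matrix rather than nested lists; the dense form is used here only to keep the example self-contained.
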
### Method Overview

- **Prime Adjacency Matrix (PAM)**: Each relation is mapped to a distinct prime number, and a single adjacency matrix is constructed to represent the entire multi-relational graph. All the information in the original graph can be recovered from this matrix through prime factorization.
- **Multi-Hop Path Calculation**: A lossless path expansion and aggregation process is defined so that multi-hop paths of any length can be computed exactly.
- **Feature Extraction**: Based on PAMs, a simple feature extraction method (BoP) generates the feature vectors required for different tasks.

### Application and Verification

The authors verify the framework on multiple tasks and datasets, showing that simple BoP-based models perform comparably to or better than commonly used neural network models while being faster and more interpretable.

In conclusion, this paper provides an efficient framework for multi-relational graph analysis, especially suited to tasks that need to capture multi-hop information, and supports it with both theoretical and experimental results.
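
To make the Bag of Paths idea from the Method Overview more concrete, the following sketch counts relation sequences along directed walks of up to a fixed number of hops. It enumerates walks explicitly instead of using prime-encoded matrices, so it is not the paper's lossless algorithm; the toy triples and the helper names `bag_of_paths` and `node_bag` are hypothetical and only illustrate what a path-count feature might look like.

```python
from collections import Counter

# Hypothetical toy graph given as (head, relation, tail) triples.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "manages", "carol"),
    ("carol", "knows", "dave"),
]


def bag_of_paths(triples, max_hops):
    """Count relation sequences of walks with 1..max_hops edges per (start, end) pair."""
    out_edges = {}
    for h, r, t in triples:
        out_edges.setdefault(h, []).append((r, t))
    bags = {}
    # Each frontier item is (start node, current node, relation sequence so far).
    frontier = [(h, t, (r,)) for h, r, t in triples]
    for _ in range(max_hops):
        next_frontier = []
        for start, end, seq in frontier:
            bags.setdefault((start, end), Counter())[seq] += 1
            # Extend the walk by one more edge; nodes may repeat.
            for r, t in out_edges.get(end, []):
                next_frontier.append((start, t, seq + (r,)))
        frontier = next_frontier
    return bags


def node_bag(bags, node):
    """Node-level feature: sum the bags of all pairs that start at the node."""
    total = Counter()
    for (start, _), counts in bags.items():
        if start == node:
            total.update(counts)
    return total


bags = bag_of_paths(triples, max_hops=2)
alice_bag = node_bag(bags, "alice")
for seq, count in sorted(alice_bag.items()):
    print(" -> ".join(seq), count)
```

Each key is a readable relation sequence such as `manages -> knows`, which is the kind of interpretability the summary attributes to BoP features. Edge-level and graph-level features could be formed in the same way by reading the bag of a single pair or summing over all pairs.
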