QESK: Quantum-based Entropic Subtree Kernels for Graph Classification

Lu Bai, Lixin Cui, Edwin R. Hancock
DOI: https://doi.org/10.48550/arXiv.2212.05228
2022-12-10
Abstract: In this paper, we propose a novel graph kernel, namely the Quantum-based Entropic Subtree Kernel (QESK), for Graph Classification. To this end, we commence by computing the Average Mixing Matrix (AMM) of the Continuous-time Quantum Walk (CTQW) evolved on each graph structure. Moreover, we show how this AMM matrix can be employed to compute a series of entropic subtree representations associated with the classical Weisfeiler-Lehman (WL) algorithm. For a pair of graphs, the QESK kernel is defined by computing the exponentiation of the negative Euclidean distance between their entropic subtree representations, theoretically resulting in a positive definite graph kernel. We show that the proposed QESK kernel not only encapsulates complicated intrinsic quantum-based structural characteristics of graph structures through the CTQW, but also theoretically addresses the shortcoming of ignoring the effects of unshared substructures arising in state-of-the-art R-convolution graph kernels. Moreover, unlike the classical R-convolution kernels, the proposed QESK can discriminate the distinctions of isomorphic subtrees in terms of the global graph structures, theoretically explaining the effectiveness. Experiments indicate that the proposed QESK kernel can significantly outperform state-of-the-art graph kernels and graph deep learning methods for graph classification problems.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper targets three theoretical shortcomings of state-of-the-art R-convolution graph kernels in graph classification:

1. **Ignoring unshared substructures**: R-convolution graph kernels count only the isomorphic substructures shared between a pair of graphs and ignore the unshared ones, so the resulting similarity measure may not accurately reflect how similar the graphs are.
2. **Inability to distinguish isomorphic substructures by their place in the global graph structure**: If a pair of graphs contains the same number of each kind of substructure, R-convolution graph kernels may be unable to distinguish their inherent structural differences.
3. **Only local substructure information**: Because they rely on graph decomposition, R-convolution graph kernels usually reflect only local structural information and struggle to capture features of the global graph structure.

To address these problems, the authors propose the Quantum-based Entropic Subtree Kernel (QESK) for graph classification. QESK computes entropic subtree representations of each graph through the Continuous-time Quantum Walk (CTQW) and uses these representations to define a similarity measure between graphs. As a result, QESK can discriminate between isomorphic subtrees and reflects both global and local structural information.

### Key innovation points of QESK

1. **Entropic subtree representations based on the CTQW**: The Average Mixing Matrix (AMM) of the CTQW is computed for each graph, and the AMM is then used to compute entropic subtree representations associated with the classical Weisfeiler-Lehman (WL) algorithm. These representations reflect differences in the structural arrangement of isomorphic subtrees and also capture global information about the entire graph structure.
2. **Positive definite kernel function**: For a pair of graphs \(G_p\) and \(G_q\), the QESK kernel is defined as the exponential of the negative Euclidean distance between their entropic subtree representations, which makes the kernel positive definite:

   \[ K_{\text{QESK}}(G_p, G_q)=\sum_{I = 1}^{I_{\max}}K_{\text{QESK}}^I(G_p, G_q) \]

   where

   \[ K_{\text{QESK}}^I(G_p, G_q)=\exp\left(-\sqrt{\sum_{x = 1}^{M_I}[E_p(L_I^x)-E_q(L_I^x)]^2}\right) \]

3. **Experimental verification**: The reported experiments show that QESK significantly outperforms existing graph kernel methods and graph deep learning methods on graph classification tasks.

Together, these changes address the theoretical shortcomings of traditional R-convolution graph kernels listed above.
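
The two computational ingredients named above, the AMM of the CTQW and the kernel formula, can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the function names `average_mixing_matrix` and `qesk_kernel` are hypothetical; the adjacency matrix is taken as the CTQW Hamiltonian (the summary does not say which operator the paper uses); \(I\) is read as the WL iteration index; and each graph's representation at iteration \(I\) is assumed to be a vector over a common set of \(M_I\) subtree labels \(L_I^x\). How the entropies \(E_p(L_I^x)\) are derived from the AMM is not specified in this summary, so that step is left out and the entropy vectors are treated as given inputs.

```python
import numpy as np


def average_mixing_matrix(adjacency, tol=1e-8):
    """Average mixing matrix of a CTQW on a graph.

    Assumption: the adjacency matrix is used as the Hamiltonian.
    Uses the standard spectral form: the sum, over distinct eigenvalues,
    of the entrywise square of the projector onto each eigenspace.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(adjacency)  # eigenvalues in ascending order
    n = adjacency.shape[0]
    amm = np.zeros((n, n))
    start = 0
    while start < n:
        # Group numerically equal eigenvalues into one eigenspace.
        end = start + 1
        while end < n and abs(eigvals[end] - eigvals[start]) < tol:
            end += 1
        basis = eigvecs[:, start:end]
        projector = basis @ basis.T
        amm += projector * projector  # entrywise (Schur) square
        start = end
    return amm


def qesk_kernel(reps_p, reps_q):
    """Sum over iterations of exp(-Euclidean distance) between representations.

    reps_p, reps_q: sequences with one entry per iteration I = 1..I_max.
    Entry I is a vector of length M_I holding the entropy values of graph p
    (or q) for the labels L_I^1..L_I^{M_I}, in the same label order for both
    graphs (an assumption of this sketch).
    """
    if len(reps_p) != len(reps_q):
        raise ValueError("both graphs need the same number of iterations")
    total = 0.0
    for e_p, e_q in zip(reps_p, reps_q):
        e_p = np.asarray(e_p, dtype=float)
        e_q = np.asarray(e_q, dtype=float)
        if e_p.shape != e_q.shape:
            raise ValueError("representations must share one label set per iteration")
        total += np.exp(-np.linalg.norm(e_p - e_q))
    return float(total)
```

Under these assumptions, two graphs with identical representations at every iteration receive the maximum kernel value \(I_{\max}\), since each per-iteration term equals \(\exp(0) = 1\). Each row of the matrix returned by `average_mixing_matrix` sums to 1, which is a quick sanity check on the spectral computation.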