SPGNN: Recognizing Salient Subgraph Patterns via Enhanced Graph Convolution and Pooling

Zehao Dong, Muhan Zhang, Yixin Chen
2024-04-30
Abstract: Graph neural networks (GNNs) have revolutionized the field of machine learning on non-Euclidean data such as graphs and networks. GNNs effectively implement node representation learning through neighborhood aggregation and achieve impressive results in many graph-related tasks. However, most neighborhood aggregation approaches are summation-based, which can be problematic as they may not be sufficiently expressive to encode informative graph structures. Furthermore, though the graph pooling module is also of vital importance for graph learning, especially for the task of graph classification, research on graph down-sampling mechanisms is rather limited.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The main problems this paper attempts to solve are:

1. **Insufficient expressive power of neighborhood aggregation in existing graph neural networks (GNNs)**: Most existing neighborhood aggregation methods are sum-based, so they may fail to capture the complex structural information in a graph. Specifically, they perform poorly at distinguishing non-isomorphic subgraphs because they cannot effectively encode properties such as the connections between nodes and their positions.
2. **Limited research on graph pooling modules**: Although graph pooling is very important for graph learning tasks (especially graph classification), there is relatively little research on graph down-sampling mechanisms. Traditional graph pooling methods, such as sum-based or sorting-based methods, are limited in extracting hierarchical graph representations and have difficulty capturing the importance of subtrees at different depths.

To address these challenges, the authors propose the following solutions:

- **A concatenation-based graph convolution mechanism**: It updates node representations injectively to maximize the ability to distinguish non-isomorphic subgraphs. This better preserves both the topological structure of the graph and the node information.
- **A new graph pooling module, WL-SortPool**: It sorts the node representations layer by layer (i.e., the continuous Weisfeiler-Lehman colors) to learn the relative importance of subtrees at each depth separately, which better represents the complex graph topology and the rich information encoded in the graph.

Finally, the authors propose a new graph neural network architecture, Subgraph Pattern GNN (SPGNN), and test it on multiple graph classification benchmark datasets. The experimental results show that SPGNN achieves performance comparable to or better than existing state-of-the-art methods on graph classification tasks.

### Formula Summary

1. **Graph Convolution Operation**:
   \[ Z^{(l+1)} = f\left(t(A)\, Z^{(l)} W^{(l)}\right) \]
   where \( Z^{(l)} \in \mathbb{R}^{n \times d_{l}} \) is the node representation matrix of the \( l \)-th layer, \( W^{(l)} \in \mathbb{R}^{d_{l} \times d_{l+1}} \) is the trainable parameter matrix for feature transformation, and \( t(A) \) is the neighborhood aggregator based on the adjacency matrix \( A \).
2. **Cat-Agg Aggregator**:
   \[ \text{Cat-Agg}(S_{i}, S) = \sigma\left(M^{T}\, \text{concat}\left(S_{i}, \text{sumpool}(\{S_{j} \mid j \neq i\})\right)\right) \]
   where \( M \in \mathbb{R}^{2d \times d_{\text{out}}} \) is a trainable parameter matrix that provides sufficient expressive power, and \( \sigma \) is a non-linear activation function.
3. **WL-SortPool Operation**:
   \[ Z_{G} = \text{concat}\left(\text{sort-k}(H^{(l)}) \mid l = 1, 2, \ldots, L\right) \]
   where \( H^{(l)} \) is the node representation matrix of the \( l \)-th layer, which is sorted and truncated after the node importance is learned through an MLP.

With these improvements, SPGNN performs competitively on graph classification tasks, identifies important subgraph patterns, and preserves global graph topological information.
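The three formulas summarized above can be illustrated with a small dependency-free sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: `t(A) = A + I` as the aggregator, ReLU for `f`, tanh for `σ`, hand-picked weights in place of trained ones, and a simple row-sum scorer standing in for the learned MLP that ranks nodes. All function names (`graph_conv`, `cat_agg`, `sort_k`, `wl_sortpool`) are hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_conv(t_a, z, w):
    """Z^(l+1) = f(t(A) Z^(l) W^(l)), with f = ReLU (an assumption)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(t_a, z), w)]

def cat_agg(s_i, others, m):
    """sigma(M^T concat(S_i, sumpool({S_j | j != i}))), with sigma = tanh (an assumption).

    s_i: feature list of length d; others: list of such lists; m: 2d x d_out matrix.
    """
    d = len(s_i)
    pooled = [sum(s[k] for s in others) for k in range(d)]
    x = list(s_i) + pooled  # concatenation keeps the center node separate from the pooled rest
    return [math.tanh(sum(m[r][c] * x[r] for r in range(2 * d))) for c in range(len(m[0]))]

def sort_k(h, scores, k):
    """Keep the k rows of h with the highest scores (descending), zero-padding if needed."""
    order = sorted(range(len(h)), key=lambda i: scores[i], reverse=True)[:k]
    rows = [list(h[i]) for i in order]
    while len(rows) < k:
        rows.append([0.0] * len(h[0]))
    return rows

def wl_sortpool(layers, score_fn, k):
    """Z_G = concat(sort-k(H^(l)) | l = 1..L): sort each layer separately, then flatten."""
    z_g = []
    for h in layers:
        scores = [score_fn(row) for row in h]  # stand-in for the learned MLP importance
        for row in sort_k(h, scores, k):
            z_g.extend(row)
    return z_g

# Path graph 0 - 1 - 2, with t(A) = A + I (assumed aggregator).
t_a = [[1, 1, 0],
       [1, 1, 1],
       [0, 1, 1]]
z0 = [[1.0], [1.0], [1.0]]          # n x d_0 input features
w0 = [[1.0, -1.0]]                  # d_0 x d_1 (hand-picked, not trained)
w1 = [[1.0], [1.0]]                 # d_1 x d_2 (hand-picked, not trained)

z1 = graph_conv(t_a, z0, w0)
z2 = graph_conv(t_a, z1, w1)
z_g = wl_sortpool([z1, z2], score_fn=sum, k=2)

# Cat-Agg separates two inputs that a plain sum over {S_i} and {S_j} would merge:
m = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.0], [0.0, 0.5]]   # 2d x d_out with d = d_out = 2
out_a = cat_agg([1.0, 0.0], [[0.0, 1.0]], m)
out_b = cat_agg([0.0, 1.0], [[1.0, 0.0]], m)
```

In this toy run the middle node of the path receives the highest score at both layers, so it comes first in each layer's sorted block, and `z_g` has length `k * (d_1 + d_2) = 6`. The two Cat-Agg calls swap the roles of center and neighbor. A plain sum of all features would give `[1, 1]` in both cases, whereas the concatenated inputs differ and so do the outputs.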