Abstract: Graph embedding plays an important role in the analysis of non-Euclidean data such as graphs. It transforms complex graph structures into vector representations for downstream machine learning or data mining tasks, capturing the relationships and similarities between nodes. Neighbors of different orders affect the generation of node embedding vectors differently. This paper therefore proposes a multi-order adjacency view encoder that fuses the feature information of neighbors at different orders. We generate a separate node view for each order of neighbor information and then use an attention mechanism to integrate the node embeddings from the different views. Finally, we evaluate the model on downstream graph tasks. Experimental results show that the model improves on existing methods in attributed graph clustering and link prediction, indicating that the generated embeddings are more expressive.
What problem does this paper attempt to address?
This paper attempts to address the problem of effectively integrating information from different-order neighbors in attributed graph embeddings. Specifically, most existing methods tend to overlook the varying impacts of different-order neighbors on the generation of node embedding vectors. For central nodes in a graph, lower-order neighbors have a greater influence, while for peripheral nodes, higher-order neighbors have a greater impact. Therefore, this paper proposes a new encoder (DGA, Different-Order-View Graph Auto-Encoder) that integrates information from different-order neighbors through multi-order adjacency views and an attention mechanism to generate more expressive node embedding vectors.
### Main Contributions
1. **Multi-Order Adjacency View Encoder**: DGA builds one adjacency view per neighbor order and uses an attention aggregator to combine the information each node receives from the different views.
2. **Multi-Layer Perceptron Decoder and Inner Product Decoder**: DGA reconstructs both the adjacency matrix \( A \) and the feature matrix \( X \), and trains the model by jointly minimizing the adjacency reconstruction loss, the feature reconstruction loss, and the self-supervised clustering loss.
3. **Experimental Validation**: The learned vectors are applied to graph clustering and link prediction, and experimental results show that the model outperforms the compared methods on both tasks.
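The joint objective described in the second contribution can be written schematically as follows. This is only a sketch: the summary does not state how the three terms are weighted, so the coefficients \( \lambda_{1} \) and \( \lambda_{2} \), the embedding matrix \( Z \), and the symbols \( \mathcal{L}_{A} \), \( \mathcal{L}_{X} \), \( \mathcal{L}_{C} \) are assumed notation, not taken from the paper.

\[
\hat{A} = \sigma\left(Z Z^{\top}\right), \qquad
\mathcal{L} = \mathcal{L}_{A} + \lambda_{1}\,\mathcal{L}_{X} + \lambda_{2}\,\mathcal{L}_{C}
\]

Here \( \hat{A} \) is the adjacency matrix reconstructed by an inner product decoder with the logistic sigmoid \( \sigma \), \( \mathcal{L}_{A} \) and \( \mathcal{L}_{X} \) are the reconstruction losses for \( A \) and \( X \), and \( \mathcal{L}_{C} \) is the self-supervised clustering loss.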
### Solution
- **Constructing Multi-Order Adjacency Views**: Build 0-, 1-, 2-, and 3-order adjacency views, which capture the node's own information and its 1-, 2-, and 3-order neighbor information, respectively.
- **Learning Embedding Vectors from Multi-Order Adjacency Views**: Use a two-layer Graph Convolutional Network (GCN) to learn embedding representations under different views.
- **Fusing Embedding Vectors**: Use an attention aggregator to combine embedding vectors from different views to generate the final embedding representation.
- **Decoder Design**: Use a Multi-Layer Perceptron (MLP) decoder and an inner product decoder to decode the feature matrix and the adjacency matrix, respectively.
- **Loss Function**: Jointly optimize the reconstruction loss of the adjacency matrix, the reconstruction loss of the feature matrix, and the self-supervised clustering loss.
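The first three steps and the inner product decoder can be sketched in plain Python, using only the standard library to stay dependency-free. Several things here are assumptions rather than the paper's method: the k-order view is taken to be the binarized k-th power of the adjacency matrix, a single untrained mean-aggregation step (`propagate`) stands in for the two-layer GCN, and the attention `query` vector is a fixed, hypothetical stand-in for a learned parameter. All function names are illustrative.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def order_views(adj, max_order=3):
    """Return binarized views [I, A, A^2, ..., A^max_order].

    Assumption: the k-order view marks node pairs joined by a walk of
    exactly k edges; the paper's construction may differ.
    """
    n = len(adj)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    views, power = [identity], identity
    for _ in range(max_order):
        power = matmul(power, adj)
        views.append([[1 if v > 0 else 0 for v in row] for row in power])
    return views


def propagate(view, features):
    """Average the features of each node's neighbors under one view.

    Stand-in for the two-layer GCN: no weights, no nonlinearity.
    """
    dim = len(features[0])
    out = []
    for row in view:
        degree = sum(row) or 1  # avoid division by zero for isolated nodes
        out.append([sum(w * f[d] for w, f in zip(row, features)) / degree
                    for d in range(dim)])
    return out


def attention_fuse(view_embeddings, query):
    """Fuse each node's per-view embeddings with softmax attention weights.

    view_embeddings[k][i] is node i's embedding under view k.
    `query` is a hypothetical attention vector, fixed here for illustration.
    """
    fused = []
    for i in range(len(view_embeddings[0])):
        scores = [sum(q * h for q, h in zip(query, view[i]))
                  for view in view_embeddings]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        fused.append([sum(w * view[i][d] for w, view in zip(weights, view_embeddings))
                      for d in range(len(query))])
    return fused


def inner_product_decode(z):
    """Edge probabilities sigmoid(z_i . z_j) for every node pair."""
    return [[1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(zi, zj))))
             for zj in z] for zi in z]


# Toy example: a path graph 0 - 1 - 2 - 3 with 2-dimensional features.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

views = order_views(adj, max_order=3)
per_view = [propagate(view, features) for view in views]
fused = attention_fuse(per_view, query=[0.5, -0.5])
edge_probs = inner_product_decode(fused)
```

In a trained model the per-view encoders, the attention parameters, and the MLP feature decoder would all be learned jointly; the sketch only shows how the pieces connect.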
### Experimental Results
- **Node Clustering Task**: Experimental results on the Cora, Citeseer, and Pubmed datasets show that DGA outperforms existing methods in terms of ACC, NMI, and ARI metrics.
- **Link Prediction Task**: On the same datasets, DGA also outperforms the compared methods in link prediction, as measured by AUC and AP.
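Clustering accuracy (ACC) is not a plain label comparison, because cluster ids are arbitrary and must first be matched to ground-truth classes. The sketch below illustrates the idea with a brute-force search over one-to-one mappings. The function name and the brute-force strategy are illustrative choices, not the paper's evaluation code; practical implementations usually solve the matching with the Hungarian algorithm instead.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over all one-to-one mappings from cluster ids to classes.

    Brute force over permutations, so only suitable for a handful of
    clusters. Assumes there are no more clusters than classes.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("sketch assumes no more clusters than classes")
    best_hits = 0
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Cluster ids 0 and 1 are swapped relative to the classes, but the
# partition itself is identical, so the best mapping recovers every node.
perfect = clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])

# One node is placed in the wrong cluster.
partial = clustering_accuracy([0, 0, 1, 1], [1, 1, 1, 0])
```

NMI and ARI, the other two clustering metrics reported, are invariant to label permutation by construction and need no such matching step.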
In summary, the paper integrates different-order neighbor information into attributed graph embeddings through multi-order adjacency views and an attention mechanism. This makes the node embedding vectors more expressive and yields significant performance improvements on multiple downstream tasks.