Metapaths guided Neighbors aggregated Network for Heterogeneous Graph Reasoning

Bang Lin, Xiuchong Wang, Yu Dong, Chengfu Huo, Weijun Ren, Chuanyu Xu
DOI: https://doi.org/10.48550/arXiv.2103.06474
2021-03-11
Abstract: Most real-world datasets are inherently heterogeneous graphs, which involve a diversity of node and relation types. Heterogeneous graph embedding learns structural and semantic information from the graph and embeds it into low-dimensional node representations. Existing methods usually capture the composite relations of a heterogeneous graph by defining metapaths, each of which represents one semantic of the graph. However, these methods either ignore node attributes, discard the local and global information of the graph, or consider only one metapath. To address these limitations, we propose a Metapaths-guided Neighbors-aggregated Heterogeneous Graph Neural Network (MHN) to improve performance. Specifically, MHN employs node base embedding to encapsulate node attributes, BFS and DFS neighbor aggregation within a metapath to capture local and global information, and metapath aggregation to combine different semantics of the heterogeneous graph. We conduct extensive experiments with the proposed MHN on three real-world heterogeneous graph datasets, covering node classification, link prediction, and an online A/B test on the Alibaba mobile application. Results demonstrate that MHN performs better than other state-of-the-art baselines.
Artificial Intelligence, Machine Learning, Social and Information Networks
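The pipeline described in the abstract (base embeddings, BFS and DFS neighbor aggregation within each metapath, then attention across metapaths) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the metapath labels `UIU` and `UCU`, the hand-written neighbor sets, and the fixed dot-product `query` vector standing in for a learned attention mechanism are all hypothetical, and mean pooling is used as the neighbor encoder.

```python
import math

def mean_vectors(vectors):
    """Element-wise mean of equal-length vectors (assumed 'mean encoding')."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(vectors, query):
    """Weighted sum of vectors; weights are softmax(dot(query, vector)).

    The fixed `query` is an assumption standing in for learned parameters.
    """
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    weights = softmax(scores)
    fused = [sum(w * x for w, x in zip(weights, col)) for col in zip(*vectors)]
    return fused, weights

def mhn_embed(node, base, bfs_neighbors, dfs_neighbors, query):
    """Return (embedding, metapath weights) for `node`.

    base: node id -> base embedding (ID and attributes already pooled)
    bfs_neighbors / dfs_neighbors: metapath -> node id -> list of neighbor ids
    """
    per_metapath = []
    for p in bfs_neighbors:
        h_bfs = mean_vectors([base[v] for v in bfs_neighbors[p][node]])
        h_dfs = mean_vectors([base[v] for v in dfs_neighbors[p][node]])
        # Intra-metapath step: attention over the BFS and DFS summaries.
        h_p, _ = attention_fuse([h_bfs, h_dfs], query)
        per_metapath.append(h_p)
    # Inter-metapath step: attention over the per-metapath embeddings.
    return attention_fuse(per_metapath, query)

# Toy data (hypothetical): three nodes with 2-d base embeddings.
base = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0]}
bfs_neighbors = {"UIU": {"u1": ["u2", "u3"]}, "UCU": {"u1": ["u3"]}}
dfs_neighbors = {"UIU": {"u1": ["u3"]}, "UCU": {"u1": ["u2"]}}

h_u, beta = mhn_embed("u1", base, bfs_neighbors, dfs_neighbors, [0.5, -0.5])
print("embedding:", h_u)
print("metapath weights:", beta)
```

In a trained model the attention scores would come from learned parameters and the neighbor sets from metapath-guided sampling on the graph. The sketch only shows how the three aggregation stages compose.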
What problem does this paper attempt to address?
This paper addresses how to learn node embeddings more effectively in a heterogeneous graph. Existing heterogeneous graph embedding methods have the following limitations:

1. **Ignoring node attributes**: Some methods do not fully use node attribute information, so the resulting embedding vectors lack rich information.
2. **Losing local and global information**: Some methods fail to capture both the local and the global information of nodes, which is crucial for generating high-quality node embeddings.
3. **Only considering a single metapath**: Although a node carries different meanings under different semantics, some methods embed the heterogeneous graph with a single metapath, ignoring the importance of the multi-semantic space.

To solve these problems, the authors propose a Metapaths-guided Neighbors-aggregated Heterogeneous Graph Neural Network (MHN). MHN improves on existing methods in the following ways:

- **Node-based embedding**: Project the attributes of different node types into the same latent vector space through attribute transformation, thereby fusing node ID and attribute information.
- **Intra-metapath aggregation**: Under the guidance of a single metapath, extract local and global information through breadth-first search (BFS) and depth-first search (DFS) neighbor aggregation, and combine the two with an attention-weighted sum.
- **Inter-metapath aggregation**: Use the attention mechanism to fuse the embedding vectors from multiple metapaths, capturing the comprehensive semantic information in the heterogeneous graph.

### Formula summary

1. **Node-based embedding**:
   - Obtain the embedding vector from the node ID: \[ h_{\text{id}}^u = W_e\cdot u \]
   - Obtain the embedding vector from the node attributes: \[ h_{\text{att}}^u = W_A\cdot x_u \]
   - Combine ID and attribute information: \[ h_u=\text{pooling}(h_{\text{id}}^u, h_{\text{att}}^u) \]
2. **Intra-metapath aggregation**:
   - BFS neighbor aggregation: \[ h_{\text{BFS}}^{u, p_i}=f_\theta(h_v, v\in N_{p_i}^u) \] where \(f_\theta\) can be mean encoding, weighted encoding, or nonlinear encoding.
   - DFS neighbor aggregation: \[ h_{\text{DFS}}^{u, p_i}=f_\theta(h_v, v\in M_{p_i}^u) \]
   - Combine the two with an attention-weighted sum: \[ h_{p_i}^u=\alpha_1\cdot h_{\text{BFS}}^{u, p_i}+\alpha_2\cdot h_{\text{DFS}}^{u, p_i} \] where \(\alpha_1\) and \(\alpha_2\) are the weights calculated by the attention mechanism.
3. **Inter-metapath aggregation**:
   - Calculate the importance weight of each metapath: \[ \beta_{p_i}=\frac{\exp(e_{p_i})}{\sum_{p\in P}\exp(e_p)} \]
   - Fuse the embeddings of all metapaths: \[ h_u=\sum_{p_i\in P}\beta_{p_i}\cdot h_{p_i}^u \]
4. **Final embedding**:
   - Apply a fully-connected layer to enhance non -