Directed Acyclic Graph Neural Networks

Veronika Thost, Jie Chen
DOI: https://doi.org/10.48550/arXiv.2101.07965
2021-02-03
Abstract: Graph-structured data ubiquitously appears in science and engineering. Graph neural networks (GNNs) are designed to exploit the relational inductive bias exhibited in graphs; they have been shown to outperform other forms of neural networks in scenarios where structure information supplements node features. The most common GNN architecture aggregates information from neighborhoods based on message passing. Its generality has made it broadly applicable. In this paper, we focus on a special, yet widely used, type of graphs -- DAGs -- and inject a stronger inductive bias -- partial ordering -- into the neural network design. We propose the directed acyclic graph neural network, DAGNN, an architecture that processes information according to the flow defined by the partial order. DAGNN can be considered a framework that entails earlier works as special cases (e.g., models for trees and models updating node representations recurrently), but we identify several crucial components that prior architectures lack. We perform comprehensive experiments, including ablation studies, on representative DAG datasets (i.e., source code, neural architectures, and probabilistic graphical models) and demonstrate the superiority of DAGNN over simpler DAG architectures as well as general graph architectures.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how graph neural networks (GNNs) can better handle directed acyclic graphs (DAGs). Specifically, the authors propose a new architecture, the directed acyclic graph neural network (DAGNN), which exploits the partial order that a DAG defines over its nodes and thereby outperforms general-purpose GNN architectures on DAG data.

### Main Issues

1. **Limitations of Existing GNNs**:
   - Existing GNN architectures primarily aggregate neighborhood information through message passing, which may not fully use the partial order information unique to DAGs.
   - In a general message-passing GNN, each layer propagates information by one hop, so the network's depth limits how far information travels; on DAGs this can leave information insufficiently propagated.
2. **Special Properties of DAGs**:
   - A DAG defines a partial order over its nodes, a strong inductive bias that carries additional information about the dependencies between nodes.
   - Information in a DAG should therefore flow along the partial order rather than through simple multi-hop aggregation over local neighborhoods.

### Solution

1. **DAGNN Architecture**:
   - DAGNN updates node representations following the partial order: each node is updated after its predecessors, using the information from all of them.
   - As a result, nodes without successors digest the information of the entire graph. Unlike traditional message-passing neural networks (MPNNs), DAGNN always uses the latest predecessor information when updating a node representation.
2. **Technical Details**:
   - **Attention Mechanism**: An attention mechanism aggregates information from predecessor nodes, better capturing dependencies between nodes.
   - **Multi-layer Structure**: Stacking multiple layers increases expressive power, enabling the model to capture complex graph structures.
   - **Topological Batch Processing**: To improve computational efficiency, nodes are grouped into topological batches, maximizing concurrency on parallel computing resources (e.g., GPUs).
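
The processing order described under Solution can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation: `topological_batches` and `dag_forward` are hypothetical names, node states are scalars, the attention score (a product of two states) and the `tanh` combine step are stand-ins for the learned components, and only a single layer is shown.

```python
import math
from collections import defaultdict


def topological_batches(num_nodes, edges):
    """Group nodes so that all predecessors of a node sit in earlier batches."""
    preds = defaultdict(list)
    succs = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)
        indegree[v] += 1

    batch = [v for v in range(num_nodes) if indegree[v] == 0]
    batches, seen = [], 0
    while batch:
        batches.append(batch)
        seen += len(batch)
        next_batch = []
        for u in batch:
            for v in succs[u]:
                indegree[v] -= 1
                if indegree[v] == 0:  # last predecessor of v was just placed
                    next_batch.append(v)
        batch = sorted(next_batch)
    if seen != num_nodes:
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return batches, preds


def dag_forward(features, edges):
    """One pass over the DAG in topological order with toy scalar states."""
    batches, preds = topological_batches(len(features), edges)
    h = [0.0] * len(features)
    for batch in batches:
        # Nodes inside one batch do not depend on each other, so a real
        # implementation could update them in parallel.
        for v in batch:
            message = 0.0
            if preds[v]:
                # Hypothetical attention score (assumption, not the paper's).
                scores = [h[u] * features[v] for u in preds[v]]
                top = max(scores)
                weights = [math.exp(s - top) for s in scores]
                total = sum(weights)
                message = sum(w / total * h[u] for w, u in zip(weights, preds[v]))
            # Stand-in for the learned combine step; h[u] is already up to date.
            h[v] = math.tanh(features[v] + message)
    return h
```

On the diamond graph with edges `(0, 1), (0, 2), (1, 3), (2, 3)`, the batches are `[[0], [1, 2], [3]]`. Changing only the feature of node 0 changes the state of the sink node 3 after a single pass, which is the sense in which nodes without successors see the whole graph.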

### Experimental Validation

1. **Datasets**:
   - DAGs parsed from source code (OGBG-CODE)
   - Neural architectures from neural architecture search (NA)
   - Bayesian networks from score-based Bayesian network learning (BN)
2. **Tasks**:
   - **TOK Task**: Predicting the tokens that form a function's name.
   - **LP Task**: Predicting the length of the longest path in a DAG.
   - **Scoring Task**: Predicting the performance of neural architectures on CIFAR-10 or the BIC score of Bayesian networks.
3. **Experimental Results**:
   - DAGNN outperforms other DAG architectures and general GNN architectures on both the TOK and LP tasks.
   - DAGNN also performs well on the scoring task for the NA and BN datasets.

### Conclusion

DAGNN provides a more effective method for handling DAGs by leveraging their partial order. Experimental results show that DAGNN outperforms existing DAG architectures and general GNN architectures on multiple tasks, validating its effectiveness in processing DAGs.
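
As a side note on the LP task listed under Tasks, the prediction target itself is cheap to compute exactly, which makes it a convenient test of how well a model captures DAG structure. The sketch below assumes that path length is counted in edges (the summary does not say whether edges or nodes are counted), and `longest_path_length` is a hypothetical helper name.

```python
from collections import defaultdict, deque


def longest_path_length(num_nodes, edges):
    """Length (in edges, by assumption) of the longest path in a DAG."""
    succs = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v in edges:
        succs[u].append(v)
        indegree[v] += 1

    queue = deque(v for v in range(num_nodes) if indegree[v] == 0)
    dist = [0] * num_nodes  # longest path ending at each node
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succs[u]:
            dist[v] = max(dist[v], dist[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:  # all predecessors of v are final
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return max(dist, default=0)
```

The computation visits nodes in topological order, so the distance of every predecessor is final before a node is processed.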