Architectural Implications of GNN Aggregation Programming Abstractions

Yingjie Qi, Jianlei Yang, Ao Zhou, Tong Qiao, Chunming Hu
2023-10-21
Abstract: Graph neural networks (GNNs) have gained significant popularity due to the powerful capability to extract useful representations from graph data. As the need for efficient GNN computation intensifies, a variety of programming abstractions designed for optimizing GNN Aggregation have emerged to facilitate acceleration. However, there is no comprehensive evaluation and analysis upon existing abstractions, thus no clear consensus on which approach is better. In this letter, we classify existing programming abstractions for GNN Aggregation by the dimension of data organization and propagation method. By constructing these abstractions on a state-of-the-art GNN library, we perform a thorough and detailed characterization study to compare their performance and efficiency, and provide several insights on future GNN acceleration based on our analysis.
Machine Learning, Artificial Intelligence, Performance
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How can existing graph neural network (GNN) aggregation programming abstractions be comprehensively evaluated and analyzed to determine which method performs best in different scenarios and to guide future GNN acceleration research?** Specifically, the paper focuses on the following issues:

1. **Lack of comprehensive evaluation**: There is currently no systematic comparison of existing GNN aggregation programming abstractions, so it is unclear which method is better.
2. **Performance depends on hardware platforms and input graph structures**: GNN performance depends heavily on the hardware platform used and on the structural characteristics of the input graph, and these dependencies need to be understood.
3. **Selecting appropriate programming abstractions**: Different programming abstractions suit different application scenarios, and the best-use scenario for each abstraction needs to be identified.

### Main content of the paper

The paper addresses the above problems through the following steps:

1. **Classifying existing programming abstractions**: Classify existing GNN aggregation programming abstractions along two dimensions: data organization and propagation method.
2. **Construction and evaluation**: Implement these abstractions on a state-of-the-art GNN library and conduct detailed performance and efficiency evaluations.
3. **Providing insights**: Based on the evaluation results, provide several insights for future GNN acceleration research.

### Key findings

- **Relationship between performance and graph scale**: For smaller graphs, scatter-based methods perform better, while for larger graphs, pull-based methods are more efficient.
- **Hardware adaptability**: The choice of data organization has a significant impact on performance across hardware platforms. The compressed matrix format is usually more efficient, but on high-end platforms the edge-list format performs very well.
- **Impact of graph structures**: Besides the size of the graph, intrinsic properties of the graph (such as density and the skewness of the vertex-degree distribution) also have an important impact on aggregation performance.

### Conclusion

Through a comprehensive evaluation of multiple GNN aggregation programming abstractions, the paper shows that their performance depends on hardware platforms and input graph structures, and it provides guidance for programmers and researchers in selecting appropriate abstractions, which should help future GNN acceleration research.
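To make the two classification dimensions more concrete, the following is a minimal pure-Python sketch, not the paper's implementation. It contrasts a scatter-style sum aggregation over an edge list with a pull-style sum aggregation over a compressed (CSR-like) layout. The function names, the toy graph, and the pairing of each propagation method with one particular data layout are illustrative assumptions; in practice the two dimensions can be combined in other ways.

```python
# Illustrative sketch only: all names and the toy graph are assumptions,
# not code from the paper or from any GNN library.

def edges_to_csr(num_nodes, edges):
    """Convert a (src, dst) edge list into a CSR-like layout keyed by destination."""
    indptr = [0] * (num_nodes + 1)
    for _, dst in edges:
        indptr[dst + 1] += 1          # count in-edges per destination
    for v in range(num_nodes):
        indptr[v + 1] += indptr[v]    # prefix sum gives row offsets
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()       # next free slot for each destination
    for src, dst in edges:
        indices[cursor[dst]] = src
        cursor[dst] += 1
    return indptr, indices


def scatter_aggregate(num_nodes, edges, feats):
    """Scatter style: iterate over edges, adding each source feature into its destination."""
    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for src, dst in edges:
        for k in range(dim):
            out[dst][k] += feats[src][k]
    return out


def pull_aggregate(indptr, indices, feats):
    """Pull style: iterate over destinations, each reading its in-neighbours' features."""
    dim = len(feats[0])
    out = []
    for v in range(len(indptr) - 1):
        acc = [0.0] * dim
        for src in indices[indptr[v]:indptr[v + 1]]:
            for k in range(dim):
                acc[k] += feats[src][k]
        out.append(acc)
    return out


# Toy directed graph with 4 nodes and 2-dimensional node features.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]

indptr, indices = edges_to_csr(4, edges)
scatter_out = scatter_aggregate(4, edges, feats)
pull_out = pull_aggregate(indptr, indices, feats)
```

Both functions compute the same per-node sums; they differ in the order in which memory is visited. The scatter version writes to a destination row chosen by each edge, which on parallel hardware would typically require atomic updates, while the pull version writes each output row exactly once at the cost of building the compressed layout first. This is one plausible reason why the trade-off could shift with graph size and hardware, as the key findings above describe.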