GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System

Yidong Gong, Pradeep Kumar
2024-04-05
Abstract: We hypothesize that the absence of a standardized benchmark has allowed several fundamental pitfalls in GNN System design and evaluation that the community has overlooked. In this work, we propose GNNBench, a plug-and-play benchmarking platform focused on system innovation. GNNBench presents a new protocol to exchange their captive tensor data, supports custom classes in System APIs, and allows automatic integration of the same system module to many deep learning frameworks, such as PyTorch and TensorFlow. To demonstrate the importance of such a benchmark framework, we integrated several GNN systems. Our results show that integration with GNNBench helped us identify several measurement issues that deserve attention from the community.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper addresses the lack of a standardized benchmark for Graph Neural Network (GNN) systems, an absence that the authors argue has let fundamental pitfalls in system design and evaluation go unnoticed. GNNs have become important in many data-driven applications, but existing graph benchmarks such as Graph500 and LDBC are a poor fit for GNN system research: GNN workloads involve forward and backward computation, storage of intermediate results, vector-valued vertex and edge features, and a dependency on deep learning frameworks.

The main contribution is GNNBENCH, a modular and scalable benchmarking platform focused on system innovation. It supports custom classes in system APIs and integrates a system module with deep learning frameworks automatically through a stable system API, so researchers can rapidly prototype the system aspects of GNNs and evaluate them fairly. GNNBENCH also provides a simple Domain-Specific Language (DSL) that generates the crucial system integration code, avoiding common pitfalls.

Integrating existing GNN systems into GNNBENCH revealed measurement issues in some of them, such as incorrect backward propagation calls and an unimplemented transpose Sparse Matrix-Matrix Multiplication (SpMM). Because the system modules are independent and the unified front-end code is decoupled from any particular deep learning framework, GNNBENCH keeps comparisons fair and reduces the workload for future research.

In summary, GNNBENCH aims to remove inconsistencies in GNN system design and evaluation, promoting fair comparison and higher productivity through a standardized benchmark.
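The unimplemented transpose SpMM mentioned above matters because of a standard identity: if the forward pass computes Y = A·X with a sparse adjacency matrix A, the gradient with respect to X is Aᵀ·(∂L/∂Y). A system that reuses the forward kernel in the backward pass is only correct when A is symmetric. The following is a minimal pure-Python sketch of that identity; the CSR layout, the function names, and the toy graph are illustrative assumptions and are not taken from GNNBENCH or any of the systems it integrates.

```python
# Illustrative sketch only: names and data are hypothetical, not GNNBENCH's API.

def spmm(indptr, indices, values, X):
    """Forward pass: Y = A @ X, with A in CSR form and X a dense list of rows."""
    n_rows, n_feat = len(indptr) - 1, len(X[0])
    Y = [[0.0] * n_feat for _ in range(n_rows)]
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            col, a = indices[k], values[k]
            for f in range(n_feat):
                Y[row][f] += a * X[col][f]
    return Y

def spmm_transpose(indptr, indices, values, G, n_cols):
    """Backward pass: dX = A^T @ G, scattering along the CSR rows of A."""
    n_feat = len(G[0])
    dX = [[0.0] * n_feat for _ in range(n_cols)]
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            col, a = indices[k], values[k]
            for f in range(n_feat):
                dX[col][f] += a * G[row][f]
    return dX

# A small directed (asymmetric) graph: A = [[0,2,0],[0,0,3],[1,4,0]].
indptr, indices, values = [0, 1, 2, 4], [1, 2, 0, 1], [2.0, 3.0, 1.0, 4.0]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

Y = spmm(indptr, indices, values, X)

# With loss = sum of all entries of Y, the upstream gradient G is all ones.
G = [[1.0, 1.0] for _ in range(3)]
correct_dX = spmm_transpose(indptr, indices, values, G, n_cols=3)
wrong_dX = spmm(indptr, indices, values, G)  # forward kernel reused by mistake

print("Y          =", Y)
print("correct dX =", correct_dX)
print("wrong dX   =", wrong_dX)
```

On this asymmetric graph the two gradients differ, so a benchmark that times only the forward kernel, or a backward pass that silently reuses it, would measure something other than a correct training step.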