GPU-based Butterfly Counting

Yifei Xia, Feng Zhang, Qingyu Xu, Mingde Zhang, Zhiming Yao, Lv Lu, Xiaoyong Du, Dong Deng, Bingsheng He, Siqi Ma
DOI: https://doi.org/10.1007/s00778-024-00861-0
2024-01-01
Abstract: Butterfly counting is an important and costly operation on large bipartite graphs. GPUs are popular parallel heterogeneous devices and can bring significant performance improvements to data science applications. Unfortunately, no existing work enables efficient butterfly counting on GPUs. To fill this gap, we propose a GPU-based butterfly counting method, called G-BFC. G-BFC addresses three main technical challenges. First, butterfly counting involves massive serial operations, which lead to severe synchronization overheads and performance degradation. We unlock the serial region and use GPU shared memory to handle it efficiently. Second, butterfly counting on GPUs suffers from workload imbalance. We develop a novel adaptive strategy to balance the workload among threads. Third, parallel butterfly counting must traverse a huge number of two-hop paths, also called wedges, in bipartite graphs. We develop a novel preprocessing strategy that effectively reduces the number of wedges to be traversed. Experiments show that G-BFC brings significant performance benefits. On eleven real datasets, G-BFC achieves a 19.8x performance speedup over the state-of-the-art solution.
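To make the role of wedges concrete, the following is a minimal sequential sketch of wedge-based butterfly counting. It is not G-BFC and does not reproduce the paper's shared-memory, load-balancing, or preprocessing strategies. It assumes the standard definition of a butterfly as a 2x2 biclique (two left vertices that share two right vertices), and the function name `count_butterflies` and the edge-list input format are hypothetical choices made for illustration.

```python
from collections import defaultdict


def count_butterflies(edges):
    """Count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (u, v) pairs, where u is a left vertex and
    v is a right vertex. Left vertex IDs are assumed to be comparable
    (for example, integers). This is an illustrative CPU baseline only.
    """
    left_adj = defaultdict(set)   # left vertex  -> set of right neighbors
    right_adj = defaultdict(set)  # right vertex -> set of left neighbors
    for u, v in edges:
        left_adj[u].add(v)
        right_adj[v].add(u)

    total = 0
    for u in left_adj:
        # wedge_count[w] = number of wedges u - v - w, i.e. the number of
        # right vertices shared by u and w.
        wedge_count = defaultdict(int)
        for v in left_adj[u]:
            for w in right_adj[v]:
                if w > u:  # visit each unordered pair {u, w} only once
                    wedge_count[w] += 1
        # Any two wedges with the same endpoints u and w form one butterfly.
        for c in wedge_count.values():
            total += c * (c - 1) // 2
    return total


if __name__ == "__main__":
    # Complete bipartite graph K_{2,2}: exactly one butterfly.
    k22 = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(count_butterflies(k22))
```

The inner loops enumerate every wedge that starts at `u`, which is why the total number of wedges dominates the running time and why reducing the wedges to be traversed matters.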