CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory

Tianyu Fu, Chiyue Wei, Zhenhua Zhu, Shang Yang, Zhongming Yu, Guohao Dai, Huazhong Yang, Yu Wang
DOI: https://doi.org/10.23919/DATE56975.2023.10136997
2023-01-01
Abstract: Triangle counting (TC) is one of the most fundamental graph analysis tools, with a wide range of applications. Modern triangle counting algorithms traverse the graph and intersect neighbor sets to find triangles. However, existing triangle counting approaches suffer from heavy off-chip memory access and set intersection overhead. Thus, we propose CLAP, the first content addressable memory (CAM) based triangle counting architecture with software and hardware co-optimizations. To reduce off-chip memory access and the number of set intersections, we propose the first force-based node index reorder method. It simultaneously optimizes both data locality and the amount of computation. Compared with random node indices, the reorder method reduces off-chip memory access and set intersections by 61% and 64%, respectively, while providing a 2.19x end-to-end speedup. To improve set intersection parallelism, we propose the first CAM-based triangle counting architecture under chip area constraints. We enable highly parallel set intersection by translating it into content search on CAM with full parallelism. Thus, the time complexity of the set intersection reduces from O(m + n) or O(n log m) to O(n). Extensive experiments on real-world graphs show that CLAP achieves 39x, 27x, and 78x speedup over state-of-the-art CPU, GPU, and processing-in-memory baselines, respectively. The software code is available at: https://github.com/thu-nics/CLAP-triangle-counting.
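
To make the set-intersection formulation in the abstract concrete, here is a minimal Python sketch of intersection-based triangle counting. It is an illustration under stated assumptions, not the CLAP implementation: the function names are hypothetical and do not come from the CLAP repository, edges are oriented from the lower to the higher node index (a common way to count each triangle once), and the lookup-based variant uses a Python set only as a software stand-in for a content search, issuing one probe per query element (n probes) in place of an O(m + n) merge.

```python
from collections import defaultdict


def merge_intersect_count(a, b):
    """Count common elements of two sorted lists with a two-pointer merge, O(m + n)."""
    i = j = count = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def lookup_intersect_count(stored, queries):
    """Assumed software stand-in for a content search: one membership probe per query."""
    table = set(stored)
    return sum(1 for q in queries if q in table)


def count_triangles(edges, intersect=merge_intersect_count):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Orient every edge from the lower to the higher node index so that each
    # triangle is found exactly once (assumption of this sketch).
    out = defaultdict(set)
    for u, v in edges:
        if u != v:
            out[min(u, v)].add(max(u, v))
    adj = {u: sorted(vs) for u, vs in out.items()}

    total = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            # Every common out-neighbor w of u and v closes a triangle (u, v, w).
            total += intersect(nbrs, adj.get(v, []))
    return total


# Hypothetical example input: the complete graph on four nodes.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Calling `count_triangles(k4_edges)` with either intersection function gives the same count; only the number of comparisons per intersection differs. Because the orientation depends on node indices, relabeling the nodes changes which intersections are performed, which is the degree of freedom a node index reorder method can exploit.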