Cover Edge-Based Novel Triangle Counting

David A. Bader, Fuhuan Li, Zhihui Du, Palina Pauliuchenka, Oliver Alvarado Rodriguez, Anant Gupta, Sai Sri Vastav Minnal, Valmik Nahata, Anya Ganeshan, Ahmet Gundogdu, Jason Lew
2024-03-05
Abstract: Listing and counting triangles in graphs is a key algorithmic kernel for network analyses, including community detection, clustering coefficients, k-trusses, and triangle centrality. In this paper, we propose the novel concept of a cover-edge set that can be used to find triangles more efficiently. Leveraging the breadth-first search (BFS) method, we can quickly generate a compact cover-edge set. Novel sequential and parallel triangle counting algorithms that employ cover-edge sets are presented. The novel sequential algorithm performs competitively with the fastest previous approaches on both real and synthetic graphs, such as those from the Graph500 Benchmark and the MIT/Amazon/IEEE Graph Challenge. We implement 22 sequential algorithms for performance evaluation and comparison. At the same time, we employ OpenMP to parallelize 11 sequential algorithms, presenting an in-depth analysis of their parallel performance. Furthermore, we develop a distributed parallel algorithm that can asymptotically reduce communication on massive graphs. In our estimate from massive-scale Graph500 graphs, our distributed parallel algorithm can reduce the communication on a scale 36 graph by 1156x and on a scale 42 graph by 2368x. Comprehensive experiments are conducted on the recently launched Intel Xeon 8480+ processor and shed light on how graph attributes, such as topology, diameter, and degree distribution, can affect the performance of these algorithms.
Data Structures and Algorithms
What problem does this paper attempt to address?
The paper addresses the problem of efficiently finding and counting triangles in large-scale graphs. Triangle listing and counting are key algorithmic kernels in network analysis, including community detection, clustering coefficients, k-trusses, and triangle centrality. Traditional methods are inefficient on large-scale sparse graphs, especially in parallel computing environments, where communication overhead becomes a major bottleneck. To this end, the paper proposes a new concept, the cover-edge set, which allows all triangles in a graph to be identified more efficiently.

The main contributions of the paper are:
1. Proposing a new triangle counting algorithm based on the cover-edge set (Cover-Edge Triangle Counting, CETC).
2. Developing several sequential and parallel CETC variants and conducting detailed performance evaluations and comparisons.
3. Providing open-source implementations of 22 sequential triangle counting algorithms and 11 OpenMP parallel algorithms.
4. Conducting a comprehensive experimental study of the proposed triangle counting algorithms on real and synthetic graphs, and comparing them with existing algorithms.

Through these contributions, the paper aims to make triangle counting more efficient by reducing computation time and communication overhead, particularly on large-scale graph datasets.
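To make the cover-edge idea concrete, here is a minimal sequential sketch in Python. It is not the authors' implementation. It assumes the cover-edge set is taken to be the "horizontal" edges of a BFS, meaning edges whose two endpoints sit at the same BFS level. That choice works because adjacent vertices differ by at most one level, so every triangle must contain at least one such edge. The function names `bfs_levels` and `count_triangles_cover_edge` are hypothetical, as is the edge-list input format.

```python
from collections import deque


def bfs_levels(adj):
    """Assign a BFS level to every vertex, restarting in each component."""
    level = {}
    for root in adj:
        if root in level:
            continue
        level[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
    return level


def count_triangles_cover_edge(edges):
    """Count triangles by intersecting neighborhoods only across horizontal edges.

    Assumption: the cover-edge set is the set of edges whose endpoints
    share a BFS level. `edges` is an iterable of undirected (u, v) pairs
    with comparable vertex labels.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot be part of a triangle
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    level = bfs_levels(adj)
    count = 0
    for u in adj:
        for v in adj[u]:
            # Visit each horizontal edge once, oriented so that u < v.
            if u < v and level[u] == level[v]:
                for w in adj[u] & adj[v]:
                    # If w is on another level, this is the triangle's only
                    # horizontal edge. If all three share a level, count the
                    # triangle only from its two smallest endpoints.
                    if level[w] != level[u] or w > v:
                        count += 1
    return count


# The complete graph on four vertices contains four triangles.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_triangles_cover_edge(k4))
```

In this sketch the neighborhood intersections run only over horizontal edges rather than over every edge, which is where the saving would come from when the BFS leaves few edges inside a level. A 4-cycle, for example, has no horizontal edges when BFS starts at any vertex, so no intersections are performed at all.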