Parallel Algorithms for Counting Triangles in Networks with Large Degrees

Shaikh Arifuzzaman,Maleq Khan,Madhav Marathe
DOI: https://doi.org/10.48550/arXiv.1406.5687
2014-06-22
Distributed, Parallel, and Cluster Computing
Abstract:Finding the number of triangles in a network is an important problem in the analysis of complex networks. The number of triangles also has important applications in data mining. Existing distributed memory parallel algorithms for counting triangles are either Map-Reduce based or message passing interface (MPI) based and work with overlapping partitions of the given network. These algorithms are designed for very sparse networks and do not work well when the degrees of the nodes are relatively larger. For networks with larger degrees, Map-Reduce based algorithm generates prohibitively large intermediate data, and in MPI based algorithms with overlapping partitions, each partition can grow as large as the original network, wiping out the benefit of partitioning the network. In this paper, we present two efficient MPI-based parallel algorithms for counting triangles in massive networks with large degrees. The first algorithm is a space-efficient algorithm for networks that do not fit in the main memory of a single compute node. This algorithm divides the network into non-overlapping partitions. The second algorithm is for the case where the main memory of each node is large enough to contain the entire network. We observe that for such a case, computation load can be balanced dynamically and present a dynamic load balancing scheme which improves the performance significantly. Both of our algorithms scale well to large networks and to a large number of processors.
What problem does this paper attempt to address?