Reducing Communication in Parallel Breadth-First Search on Distributed Memory Systems

Huiwei Lv,Guangming Tan,Mingyu Chen,Ninghui Sun
DOI: https://doi.org/10.1109/CSE.2014.243
2014-01-01
Abstract:Breadth-first search (BFS) is a key operation in data-intensive graph analysis applications. However, for distributed BFS algorithm on large distributed memory systems, data communication often limits the scalability of the algorithm as it costs significantly more than arithmetic computation. In this work, we try to reduce the communication cost in distributed BFS by sieving and compressing the messages. First, we propose a novel distributed directory to sieve the redundant data in collective communications. Then we leverage a bitmap compression algorithm to further reduce the size of messages in communication. Experiments on a 6,144-core Intel West mere based cluster show our algorithm achieve a BFS performance rate of 12.1 billion edge visits per second on an undirected graph of 8 billion vertices and 128 billion edges with scale-free distribution. Compared to the "replicated-csr" version BFS in Graph500, our algorithm reduces communication cost by 79.0% and gets a speedup of 2.2×.
What problem does this paper attempt to address?