AsyncStripe: I/O efficient asynchronous graph computing on a single server.

Shu-han Cheng, Guangyan Zhang, Jiwu Shu, Weimin Zheng
DOI: https://doi.org/10.1145/2968456.2968473
2016-01-01
Abstract: Large-scale graphs can be analyzed by lightweight systems on a single server, e.g., GraphChi, X-Stream, and GridGraph. Studies indicate that the performance of graph algorithms is affected by partitioning schemes, scheduling strategies, and execution models. Existing single-server graph processing systems often suffer from poor I/O locality, inefficient selective scheduling, or expensive synchronization costs. In this paper, we propose AsyncStripe, an I/O-efficient asynchronous graph system that aims to solve all three problems on a single server. First, AsyncStripe adopts a 2-dimensional asymmetric graph partitioning scheme to enable optimized locality and fine-grained selective scheduling. Second, AsyncStripe utilizes an efficient stripe-based data access strategy, obtaining high disk bandwidth and a smaller I/O amount. Third, AsyncStripe executes graph algorithms asynchronously with two kinds of consistency models, thereby reducing unnecessary intermediate I/O and accelerating convergence. We implement the AsyncStripe prototype by modifying the GridGraph system. Experimental results show that AsyncStripe performs better than state-of-the-art graph analysis systems. For traversal algorithms, e.g., BFS, AsyncStripe is up to 10.18 and 4.59 times faster than X-Stream and GridGraph, respectively. For sparse matrix multiplication algorithms, e.g., SpMV, AsyncStripe is up to 463.89% and 50.97% faster than X-Stream and GridGraph, respectively.
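The abstract does not spell out how the 2-dimensional asymmetric partitioning or the selective scheduling works, so the following is only a minimal sketch of the general idea, assuming a GridGraph-style edge grid: edges are bucketed into a p x q grid by source chunk (row) and destination chunk (column), with p and q allowed to differ, and a traversal skips every block whose row contains no active source vertex. The names `partition_edges` and `bfs_levels`, the defaults for `p` and `q`, and the in-memory lists standing in for on-disk blocks are all hypothetical. The BFS here runs synchronously, level by level; it does not model AsyncStripe's stripe-based access or its asynchronous consistency models.

```python
from collections import defaultdict


def partition_edges(edges, num_vertices, p, q):
    """Bucket each edge (src, dst) into block (i, j) of a p x q grid.

    Rows are chosen by source chunk, columns by destination chunk.
    p != q makes the grid asymmetric (an assumption about what the
    abstract means by "asymmetric").
    """
    src_chunk = -(-num_vertices // p)  # ceiling division
    dst_chunk = -(-num_vertices // q)
    blocks = defaultdict(list)
    for src, dst in edges:
        blocks[(src // src_chunk, dst // dst_chunk)].append((src, dst))
    return blocks, src_chunk


def bfs_levels(edges, num_vertices, root, p=4, q=2):
    """Level-synchronous BFS that only reads blocks in active rows."""
    blocks, src_chunk = partition_edges(edges, num_vertices, p, q)
    level = [-1] * num_vertices
    level[root] = 0
    active = {root}
    blocks_read = 0  # stand-in for the amount of disk I/O
    depth = 0
    while active:
        # Selective scheduling: rows with no active source are skipped.
        active_rows = {v // src_chunk for v in active}
        next_active = set()
        for (i, _j), block in sorted(blocks.items()):
            if i not in active_rows:
                continue
            blocks_read += 1
            for src, dst in block:
                if src in active and level[dst] == -1:
                    level[dst] = depth + 1
                    next_active.add(dst)
        active = next_active
        depth += 1
    return level, blocks_read


if __name__ == "__main__":
    toy_edges = [(0, 1), (1, 2), (2, 3), (0, 4), (5, 6)]
    levels, reads = bfs_levels(toy_edges, num_vertices=8, root=0)
    print(levels, reads)
```

In this toy setting, a finer row dimension (larger p) lets the scheduler skip more blocks per iteration, which is one plausible reading of why the partitioning is asymmetric; the paper itself should be consulted for the actual design.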