Squeezing out All the Value of Loaded Data: an Out-of-core Graph Processing System with Reduced Disk I/O

Zhiyuan Ai, Mingxing Zhang, Yongwei Wu, Xuehai Qian, Kang Chen, Weimin Zheng
2017-01-01
Abstract: The current primary concern of out-of-core graph processing systems is improving disk I/O locality, which leads to certain restrictions on their programming and execution models. Although these constraints improve locality, they also restrict expressiveness. As a result, only sub-optimal algorithms are supported for many kinds of applications. Compared with the optimal algorithms, these supported algorithms typically incur sequential, but much larger, amounts of disk I/O. In this paper, we explore a fundamentally different tradeoff: a smaller total amount of I/O rather than better locality. We show that out-of-core graph processing systems uniquely provide the opportunity to lift the restrictions of the programming and execution model (e.g., processing each loaded block at most once, the neighborhood constraint) in a feasible manner, which enables efficient algorithms that require drastically fewer iterations. To demonstrate these ideas, we build CLIP, a novel out-of-core graph processing system designed with the principle of "squeezing out all the value of loaded data". With its more expressive programming model and more flexible execution, CLIP enables more efficient algorithms that require a much smaller amount of total disk I/O. Our experiments show that the algorithms that can only be implemented in CLIP are much faster than the original disk-locality-optimized algorithms in many real-world cases (up to tens or even thousands of times faster).
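To make the "process each loaded block at most once" restriction concrete, the following is a minimal toy sketch in Python, not CLIP's actual API or implementation. It runs min-label propagation for connected components over edge blocks and counts block loads as a stand-in for disk reads. The names `relax_block`, `connected_components`, and the `reentry` flag are hypothetical and chosen only for illustration; the assumption is that re-processing a block that is already in memory costs no extra I/O.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def relax_block(labels: List[int], block: List[Edge]) -> bool:
    """One min-label propagation pass over an in-memory edge block.

    Returns True if any label changed during the pass.
    """
    changed = False
    for u, v in block:
        low = min(labels[u], labels[v])
        if labels[u] != low or labels[v] != low:
            labels[u] = labels[v] = low
            changed = True
    return changed


def connected_components(
    num_vertices: int, blocks: List[List[Edge]], reentry: bool
) -> Tuple[List[int], int]:
    """Label components; return (labels, number of simulated block loads)."""
    labels = list(range(num_vertices))
    block_loads = 0
    changed = True
    while changed:  # one outer iteration = one sweep over all blocks
        changed = False
        for block in blocks:
            block_loads += 1  # hypothetical stand-in for one disk read
            if reentry:
                # Re-process the loaded block until it stops changing.
                while relax_block(labels, block):
                    changed = True
            elif relax_block(labels, block):
                # Restricted model: each loaded block is processed once.
                changed = True
    return labels, block_loads


if __name__ == "__main__":
    # A path 0-1-...-7 with edges stored in an order that slows propagation.
    blocks = [[(3, 4), (2, 3), (1, 2), (0, 1)], [(6, 7), (5, 6), (4, 5)]]
    for mode in (False, True):
        labels, loads = connected_components(8, blocks, reentry=mode)
        print(f"reentry={mode}: block loads={loads}, labels={labels}")
```

On this adversarially ordered input, the re-entrant variant converges in fewer sweeps, and so fewer simulated loads, because each load is exploited until it yields no further updates. Real gains depend on the graph, the block layout, and the algorithm.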