Pimiento: A Vertex-Centric Graph-Processing Framework on a Single Machine

Jianqiang Huang, Wei Qin, Xiaoying Wang, Wenguang Chen
DOI: https://doi.org/10.1007/978-3-030-38961-1_5
2020-01-01
Abstract: Here, we describe a method for handling large graphs whose data size exceeds memory capacity while using minimal hardware resources. This method, called Pimiento, is a vertex-centric graph-processing framework on a single machine. It is a semi-external graph-computing system: all vertices are stored in memory, and all edges are stored externally in compressed sparse row (CSR) format. Pimiento uses a multi-core CPU, memory, and multi-threaded data preprocessing to optimize disk I/O and reduce random-access overhead when graph algorithms are executed. An on-the-fly update-accumulation mechanism was designed to reduce the time the graph algorithm spends accessing disks during execution. Our experiments compared this method with other graph-processing systems, including GraphChi, X-Stream, and FlashGraph, and showed that Pimiento achieved 7.5x, 4x, and 1.6x better performance, respectively, on large real-world and synthetic graphs in the same experimental environment.
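The semi-external layout described in the abstract can be illustrated with a minimal Python sketch. It is not Pimiento's implementation: the function names (build_csr, write_targets, push_once), the 32-bit-style on-disk edge encoding, and the simple "push value to out-neighbours" pass are all assumptions chosen for illustration. It only shows the general idea of keeping per-vertex data (CSR offsets, vertex values, accumulators) in memory while reading the CSR edge array sequentially from a file.

```python
import os
import tempfile
from array import array


def build_csr(num_vertices, edges):
    """Convert a directed edge list into CSR (offsets, targets)."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1            # count out-degree of each source
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]     # prefix sum gives row start positions
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()         # next free slot for each source vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def write_targets(path, targets):
    """Store the CSR edge array on disk (hypothetical raw unsigned-int format)."""
    with open(path, "wb") as f:
        array("I", targets).tofile(f)


def push_once(path, offsets, values):
    """One vertex-centric pass: every vertex adds its value to each out-neighbour.

    Offsets, values, and accumulators stay in memory; the edge array is read
    from disk in vertex order, so the file is scanned strictly sequentially.
    """
    num_vertices = len(offsets) - 1
    acc = [0.0] * num_vertices
    with open(path, "rb") as f:
        for v in range(num_vertices):
            degree = offsets[v + 1] - offsets[v]
            if degree == 0:
                continue                 # no edges stored for this vertex
            neighbours = array("I")
            neighbours.fromfile(f, degree)
            for dst in neighbours:
                acc[dst] += values[v]    # update accumulated in memory
    return acc


if __name__ == "__main__":
    offsets, targets = build_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_targets(path, targets)
        print(push_once(path, offsets, [1.0, 2.0, 3.0]))
    finally:
        os.remove(path)
```

Because a vertex's out-edges are contiguous in CSR, processing vertices in ID order turns edge access into a single sequential scan, which is the kind of access pattern a semi-external design tries to favour over random disk reads.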