Thinking Like a Vertex: a Survey of Vertex-Centric Frameworks for Distributed Graph Processing

Robert Ryan McCune,Tim Weninger,Gregory Madey
DOI: https://doi.org/10.48550/arXiv.1507.04405
2015-07-16
Abstract:The vertex-centric programming model is an established computational paradigm recently incorporated into distributed processing frameworks to address challenges in large-scale graph processing. Billion-node graphs that exceed the memory capacity of standard machines are not well-supported by popular Big Data tools like MapReduce, which are notoriously poor-performing for iterative graph algorithms such as PageRank. In response, a new type of framework challenges one to Think Like A Vertex (TLAV) and implements user-defined programs from the perspective of a vertex rather than a graph. Such an approach improves locality, demonstrates linear scalability, and provides a natural way to express and compute many iterative graph algorithms. These frameworks are simple to program and widely applicable, but, like an operating system, are composed of several intricate, interdependent components, of which a thorough understanding is necessary in order to elicit top performance at scale. To this end, the first comprehensive survey of TLAV frameworks is presented. In this survey, the vertex-centric approach to graph processing is overviewed, TLAV frameworks are deconstructed into four main components and respectively analyzed, and TLAV implementations are reviewed and categorized.
Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?