Efficient Two-Level Scheduling for Concurrent Graph Processing

Jin Zhao
DOI: https://doi.org/10.48550/arXiv.1806.00777
2018-06-03
Abstract:With the rapidly growing demand of graph processing in the real scene, they have to efficiently handle massive concurrent jobs. Although existing work enable to efficiently handle single graph processing job, there are plenty of memory access redundancy caused by ignoring the characteristic of data access correlations. Motivated such an observation, we proposed two-level scheduling strategy in this paper, which enables to enhance the efficiency of data access and to accelerate the convergence speed of concurrent jobs. Firstly, correlations-aware job scheduling allows concurrent jobs to process the same graph data in Cache, which fundamentally alleviates the challenge of CPU repeatedly accessing the same graph data in memory. Secondly, multiple priority-based data scheduling provides the support of prioritized iteration for concurrent jobs, which is based on the global priority generated by individual priority of each job. Simultaneously, we adopt block priority instead of fine-grained priority to schedule graph data to decrease the computation cost. In particular, two-level scheduling significantly advance over the state-of-the-art because it works in the interlayer between data and systems.
Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?