Execution Feature Extraction and Prediction for Large-Scale Graph Processing Applications

Fangyuan Li
DOI: https://doi.org/10.1109/cbd.2019.00025
2019-09-01
Abstract: With the rapid development of social networks, recommendation systems, bioinformatics networks, and web pages, huge amounts of data are modeled as graphs in order to mine the valuable information they contain. Processing graph data with hundreds of millions of vertices requires a cloud platform, whose computing power and mass storage capacity support the distributed processing of graph data. A current focus of attention is therefore how to analyze and extract the execution mode of a graph data processing application from its features, and then use it to guide efficient resource allocation in a cloud environment and accelerate large-scale graph data processing. However, due to logical differences between graph algorithms and the complexity of graph topology, the execution modes of large-scale graph data applications often exhibit great diversity and dynamism, which makes resource demand estimation more difficult. This paper therefore extends an existing open-source large-scale graph data processing system, extracts the execution features that can affect the resource demands of graph data applications, and then analyzes and predicts the execution mode of a given graph data application under the corresponding graph data structure. This makes it possible to dynamically estimate the resource demands of graph data applications and to support highly efficient cloud resource allocation.