Storage Aware Resource Allocation for Grid Data Streaming Pipelines
Wen Zhang, Junwei Cao, Yisheng Zhong, Lianchen Liu, Cheng Wu
DOI: https://doi.org/10.1109/nas.2008.24
2008-01-01
Abstract: Data streaming applications, usually composed of sequential or parallel tasks arranged as a data pipeline, bring new challenges to task scheduling and resource allocation in grid environments. Because data volumes are high and storage capacity is relatively limited, resource allocation and data streaming have to be storage aware. In this paper, a genetic algorithm (GA) is adopted for task scheduling of pipelines, based on on-line measurement and prediction with a gray model (GM). On-demand data streaming is introduced to avoid data overflow using repertory strategies. Experimental results show that balancing task executions with on-demand data streaming is required to improve overall performance, avoid system bottlenecks and backlogs of intermediate data, and increase the data throughput of pipelines as a whole.
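The abstract names gray-model prediction but does not say which variant is used. As an illustration only, the sketch below assumes the common GM(1,1) form and forecasts the next value of a short, positive measurement series (for example, recent per-task processing rates). The function name `gm11_forecast` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def gm11_forecast(series, steps=1):
    """Forecast the next `steps` values of a positive series with GM(1,1).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually applied to at least 4 samples")

    # Accumulated generating operation: x1[k] = series[0] + ... + series[k].
    x1 = []
    total = 0.0
    for value in series:
        total += value
        x1.append(total)

    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Ordinary least squares for y = slope * z + b, where slope = -a.
    m = n - 1
    sum_z = sum(z)
    sum_y = sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zi * yi for zi, yi in zip(z, y))
    denom = m * sum_zz - sum_z * sum_z
    if denom == 0:
        raise ValueError("degenerate series: cannot fit GM(1,1)")
    slope = (m * sum_zy - sum_z * sum_y) / denom
    b = (sum_y - slope * sum_z) / m
    a = -slope

    # A (near-)constant series gives a ~ 0; fall back to the fitted level.
    if abs(a) < 1e-12:
        return [b] * steps

    def x1_hat(k):
        # Time response of the whitened equation dx1/dt + a * x1 = b.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Undo the accumulation to get forecasts of the original series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Made-up sample: a rate growing by 20% per measurement interval.
rates = [10.0, 12.0, 14.4, 17.28, 20.736]
next_rate = gm11_forecast(rates, steps=1)[0]
```

For this made-up series the forecast lands close to the true continuation, 24.8832. A scheduler could feed such forecasts into the fitness evaluation of candidate allocations, although how the paper couples prediction and the GA is not stated in the abstract.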