Implement the Grid Workflow Scheduling for Data Intensive Applications with CSF4

Zhaohui Ding, Xiaohui Wei, Yifan Zhu, Yaoguang Yuan, Wilfred W. Li, Osamu Tatebe
DOI: https://doi.org/10.1109/eScience.2008.34
2008-01-01
Abstract: Grid computing technology integrates and shares large-scale distributed computation and data resources to facilitate scientific research. Recently, grid workflow support and large-scale distributed data management have become two main requirements of scientists and researchers in many fields, such as bioinformatics and high-energy physics. In this paper, we propose supporting grid workflows for data-intensive applications using CSF4 scheduling plug-ins. The grid workflow scheduling and data-aware scheduling policies are implemented in two scheduling plug-ins, the grid workflow plug-in and the grid data-aware plug-in, respectively. The two scheduling plug-ins work together smoothly. The data-aware plug-in automatically dispatches workflow tasks to the grid sites that are close to data replicas. Finally, experimental results are given to show the improvement in system performance and the optimization of scheduling.
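The interplay of the two policies described in the abstract can be illustrated with a minimal Python sketch. It is not the CSF4 plug-in API, and it is not the authors' implementation. All names here (`schedule`, `pick_site`, the dictionaries, and the site names) are hypothetical, and "close to data replicas" is approximated, as an assumption, by choosing the site that already holds the most replicas of a task's input files. The workflow part simply orders tasks with a topological sort so that no task is dispatched before its prerequisites.

```python
from collections import deque


def pick_site(files, replicas, default_site):
    """Data-aware step (assumed policy): choose the site holding the most
    replicas of the given input files; fall back to default_site."""
    counts = {}
    for f in files:
        for site in replicas.get(f, ()):
            counts[site] = counts.get(site, 0) + 1
    if not counts:
        return default_site
    # Iterate in sorted order so ties resolve to the alphabetically first site.
    return max(sorted(counts), key=counts.get)


def schedule(tasks, deps, inputs, replicas, default_site):
    """Workflow step: return [(task, site)] in dependency order.

    tasks    -- list of task names
    deps     -- dict: task -> set of prerequisite tasks
    inputs   -- dict: task -> list of logical input file names
    replicas -- dict: file name -> set of sites holding a replica
    """
    indegree = {t: len(deps.get(t, ())) for t in tasks}
    children = {t: [] for t in tasks}
    for t in tasks:
        for parent in deps.get(t, ()):
            children[parent].append(t)

    # Kahn's algorithm: a task becomes ready once all prerequisites are placed.
    ready = deque(t for t in tasks if indegree[t] == 0)
    plan = []
    while ready:
        t = ready.popleft()
        plan.append((t, pick_site(inputs.get(t, []), replicas, default_site)))
        for child in children[t]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(plan) != len(tasks):
        raise ValueError("workflow contains a dependency cycle")
    return plan
```

For example, with tasks `["align", "merge"]`, where `merge` depends on `align` and the only known replica of the input of `align` is at a hypothetical `siteB`, `align` is placed on `siteB` and `merge` falls back to the default site. The sketch does not register task outputs as new replicas, which a real data-aware scheduler would likely need to do.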