A fault-tolerant scheduling system for computational grids

Mohammed Amoon
DOI: https://doi.org/10.1016/j.compeleceng.2011.11.004
2012-03-01
Abstract: Fault-tolerant scheduling is an important issue for computational grid systems, as grids typically consist of strongly varying and geographically distributed resources. Most fault-tolerant scheduling systems select a resource to execute a given job based on the response time and the fault index. In this paper, a scheduling system is presented that instead selects resources using a new factor called the scheduling indicator. This factor comprises the response time and the failure rate of grid resources. Whenever a grid scheduler has jobs to schedule on grid resources, it uses the scheduling indicator to generate the scheduling decisions. The main scheduling strategy of the system is to select resources that have the lowest tendency to fail. Extensive simulation experiments are conducted to quantify the performance of the proposed system. The experiments show that the proposed system can considerably improve grid performance in terms of throughput, unavailability, turnaround time, and fail tendency.
engineering, electrical & electronic, computer science, interdisciplinary applications, hardware & architecture
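
The abstract says the scheduling indicator combines a resource's response time and failure rate, but it does not give the formula. The sketch below illustrates the general idea only, and it rests on assumptions: the `Resource` fields, the function names, and in particular the way the two quantities are combined (response time divided by the probability of success, i.e. the expected time if a failed job is simply retried) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Resource:
    """Hypothetical view of a grid resource as seen by the scheduler."""
    name: str
    response_time: float  # estimated time to complete the job, in seconds
    failure_rate: float   # observed fraction of jobs that failed, in [0, 1]


def scheduling_indicator(resource: Resource) -> float:
    """Assumed indicator: expected completion time when failures are retried.

    This is NOT the paper's formula. A lower value means a better candidate.
    """
    if resource.failure_rate >= 1.0:
        # A resource that always fails can never complete the job.
        return float("inf")
    return resource.response_time / (1.0 - resource.failure_rate)


def select_resource(resources: Iterable[Resource]) -> Resource:
    """Pick the resource with the lowest assumed scheduling indicator."""
    return min(resources, key=scheduling_indicator)


candidates = [
    Resource("fast-but-flaky", response_time=10.0, failure_rate=0.5),
    Resource("slower-but-reliable", response_time=15.0, failure_rate=0.1),
]
chosen = select_resource(candidates)
print(chosen.name)
```

Under this assumed indicator, the nominally faster resource loses because half of its jobs fail, which mirrors the abstract's point that response time alone is not a sufficient selection criterion.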