Online Distributed Scheduling on a Fault-prone Parallel System
Elli Zavou, Antonio Fernández Anta
DOI: https://doi.org/10.48550/arXiv.1603.05939
2016-03-19
Abstract: We consider a parallel system of $m$ identical machines prone to unpredictable crashes and restarts, which must cope with the continuous arrival of tasks to be executed. Tasks have different computational requirements (i.e., processing time or size). The flow of tasks, their sizes, and the crashes and restarts of the machines are assumed to be controlled by an adversary. We focus on online distributed algorithms for the efficient scheduling of the tasks. We use competitive analysis, taking as the efficiency metric the completed-load, i.e., the aggregated size of the completed tasks. We first present algorithms with optimal completed-load competitiveness when the number of different task sizes that can be injected by the adversary is bounded. (It is known that, if it is not bounded, competitiveness is not achievable.) We start with only two different task sizes, and then proceed to $k$ different ones, showing in both cases that the optimal completed-load competitiveness can be achieved. Then, we consider the possibility of having some form of resource augmentation, allowing the scheduling algorithm to run with a speedup $s \geq 1$. In this case, we show that the competitiveness of all work-conserving scheduling algorithms can be increased by using a large enough speedup.
Distributed, Parallel, and Cluster Computing
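
As a rough illustration of the completed-load metric named in the abstract, the sketch below sums the sizes of completed tasks and compares an online algorithm against a reference schedule on a single finite execution. The function names `completed_load` and `load_ratio`, and the sample task sizes, are hypothetical and not taken from the paper; the paper's formal competitiveness definition quantifies over adversarial arrival and crash patterns, which this sketch does not attempt to model.

```python
from typing import Iterable


def completed_load(completed_task_sizes: Iterable[float]) -> float:
    """Completed-load: the aggregated size of the completed tasks."""
    return sum(completed_task_sizes)


def load_ratio(alg_sizes: Iterable[float], ref_sizes: Iterable[float]) -> float:
    """Ratio of the algorithm's completed-load to a reference schedule's
    completed-load on ONE finite execution (hypothetical helper; this is
    not the paper's formal definition of competitiveness)."""
    ref = completed_load(ref_sizes)
    if ref == 0:
        # Nothing was completed by the reference, so there is nothing to lose.
        return 1.0
    return completed_load(alg_sizes) / ref


# Hypothetical execution with two task sizes (1 and 3):
# the online algorithm completes tasks of sizes 1, 1, 3,
# while the reference schedule completes 3, 3, 3, 1.
alg_done = [1, 1, 3]
ref_done = [3, 3, 3, 1]
ratio = load_ratio(alg_done, ref_done)
```

A ratio close to 1 on a given execution means the online algorithm completed nearly as much load as the reference; a competitiveness guarantee would require bounding this ratio over all executions the adversary can produce.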