Abstract: Scheduling of processes onto processors of a parallel machine has always been an important and challenging area of research. The issue becomes even more crucial and difficult as we gradually progress to the use of off-the-shelf workstations, operating systems, and high-bandwidth networks to build cost-effective clusters for demanding applications. Clusters are gaining acceptance not just in scientific applications that need supercomputing power, but also in domains such as databases, web service, and multimedia, which place diverse Quality-of-Service (QoS) demands on the underlying system. Further, these applications have diverse characteristics in terms of their computation, communication, and I/O requirements, making conventional parallel scheduling solutions, such as space sharing or gang scheduling, unattractive. At the same time, leaving it to the native operating system of each node to make decisions independently can lead to ineffective use of system resources whenever there is communication. Instead, an emerging class of dynamic coscheduling mechanisms, which attempt to take remedial actions to guide the system toward coscheduled execution without requiring explicit synchronization, offers a lot of promise for cluster scheduling. Using a detailed simulator, this paper evaluates the pros and cons of different dynamic coscheduling alternatives and compares them with traditional gang scheduling and with performing no coordinated scheduling at all. The impact of dynamic job arrivals, job characteristics, and different system parameters on these alternatives is evaluated in terms of several performance criteria. In addition, heuristics to enhance one of the alternatives even further are identified, classified, and evaluated. It is shown that these heuristics can significantly outperform the other alternatives over a spectrum of workload and system parameters and are thus a much better option for clusters than conventional gang scheduling.

Exploring Plan-Based Scheduling for Large-Scale Computing Systems

A novel hybrid differential evolution approach to scheduling of large-scale zero-wait batch processes with setup times

A Deadline And Budget Constrained Cost-Time Optimization Algorithm For Scheduling Dependent Tasks In Grid Computing

Scalable System Scheduling for HPC and Big Data

Job Scheduling in High Performance Computing

A Heuristic Search Algorithm Based on Hybrid-Tasks System Model for Scheduling Tasks of NC System

Optimization Service for Complex Process System in Grid Environment and Its Task Scheduling Strategy

Hybrid Workload Scheduling on HPC Systems

A Progressive Hedging-Based Solution Approach for Integrated Planning and Scheduling Problems under Demand Uncertainty

A HPC Co-Scheduler with Reinforcement Learning

Node-Based Job Scheduling for Large Scale Simulations of Short Running Jobs

Optimisation of job scheduling for supercomputers with burst buffers

Exploring the Relation Between Two Levels of Scheduling Using a Novel Simulation Approach

Mitigating Resource Contention on Multicore Systems Via Scheduling

Energy-efficient Task Scheduling on Heterogeneous Computing Systems by Linear Programming

Impact of Workload and System Parameters on Next Generation Cluster Scheduling Mechanisms

A scheduling algorithm for heterogeneous computing systems by edge cover queue

ASA -- The Adaptive Scheduling Algorithm

Performance and energy aware scheduling simulator for HPC: evaluating different resource selection methods

Task Scheduling Using Bayesian Optimization Algorithm for Heterogeneous Computing Environments

Improving Parallel Job Scheduling by Combining Gang Scheduling and Backfilling Techniques