Resolution Matters: Revisiting Prediction-Based Job Co-location in Public Clouds

Justin Kur, Jingshu Chen, Ji Xue, Jun Huang
DOI: https://doi.org/10.1109/ucc56403.2022.00029
2022-01-01
Abstract: Overall resource utilization in public cloud data centers remains very low. To increase the efficiency of these data centers, low-priority batch jobs are often co-located on the same machines as latency-sensitive jobs. Existing methodologies have used machine learning to predict the amount of resources that should be reserved for these jobs to maintain acceptable latency. However, these methodologies overlook the impact of measurement granularity on usage prediction and scheduling performance. When batch jobs have long durations, coarse-grained data can be used to make the prediction problem less challenging, but the resulting predictions may degrade scheduling performance. In this paper, we investigate the impact of measurement granularity on scheduler performance using extensive trace-driven simulation and job data generated from the Alibaba cluster trace.