CRQ-based fair scheduling on composable multicore architectures.
Tao Sun, Hong An, Tao Wang, Haibo Zhang, Xiufeng Sui
DOI: https://doi.org/10.1145/2304576.2304600
2012-01-01
Abstract: As different workloads require different processor resources for better execution efficiency, recent work has proposed composable chip multiprocessors (CCMPs), which can configure different numbers and types of processing cores at system runtime. However, such a composable architecture poses a significant new challenge for the system scheduler: how to ensure priority-based performance for each task (i.e., fairness) while exploiting the benefits of composability by dynamically changing the hardware configuration to match the parallelism requirements of the running tasks (i.e., resource allocation). Current multicore schedulers fail to address this problem, as they traditionally assume a fixed number and fixed types of cores. In this work, we introduce a centralized run queue (CRQ) and propose an efficiency-based algorithm to address the fair scheduling problem on CCMPs. First, instead of using distributed per-core run queues, this paper employs a CRQ to simplify scheduling and resource allocation decisions on a CCMP, and proposes a pipeline-like scheduling mechanism to hide the large overhead of making scheduling decisions on the centralized queue. Second, an efficiency-based dynamic priority (EDP) algorithm is proposed to maintain fair scheduling on a CCMP; it not only provides homogeneous tasks with performance proportional to their priorities, but also ensures that equal-priority heterogeneous tasks experience equivalent performance slowdowns when running simultaneously. To evaluate our design, experimental studies compare EDP on a CCMP with several state-of-the-art fair schedulers on symmetric and asymmetric CMPs. Our simulation results demonstrate that, while providing good fairness, EDP on a CCMP outperforms the best-performing fair scheduler on fixed symmetric and asymmetric CMPs by as much as 11.8% in user-oriented performance and by 12.5% in system throughput.
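The two fairness goals stated in the abstract (performance proportional to priority for homogeneous tasks, and equivalent slowdowns for equal-priority heterogeneous tasks) can be illustrated with a small sketch. This is not the paper's EDP algorithm or its evaluation metric. It only shows one common way to quantify the two properties, assuming slowdown is measured as shared run time divided by stand-alone run time. All names (`TaskRun`, `slowdown_spread`, `priority_normalized_progress`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One task's measurements (hypothetical record, not from the paper)."""
    name: str
    priority: float      # scheduler weight; larger means more important
    alone_time: float    # run time when the task has the machine to itself
    shared_time: float   # run time when co-scheduled with other tasks


def slowdown(task: TaskRun) -> float:
    """Assumed definition: shared run time relative to stand-alone run time."""
    return task.shared_time / task.alone_time


def slowdown_spread(tasks: list[TaskRun]) -> float:
    """Ratio of the largest to the smallest slowdown.

    A value of 1.0 means every task was slowed down equally, which is the
    property the abstract asks for among equal-priority tasks.
    """
    values = [slowdown(t) for t in tasks]
    return max(values) / min(values)


def priority_normalized_progress(tasks: list[TaskRun]) -> dict[str, float]:
    """Progress rate (1 / slowdown) divided by priority.

    If performance is proportional to priority, these values are all equal.
    """
    return {t.name: (1.0 / slowdown(t)) / t.priority for t in tasks}


# Equal-priority heterogeneous tasks: both are slowed down by 2x.
hetero = [
    TaskRun("serial_task", priority=1.0, alone_time=10.0, shared_time=20.0),
    TaskRun("parallel_task", priority=1.0, alone_time=4.0, shared_time=8.0),
]

# Homogeneous tasks with priorities 1 and 2: the second progresses twice as fast.
homo = [
    TaskRun("low", priority=1.0, alone_time=10.0, shared_time=30.0),
    TaskRun("high", priority=2.0, alone_time=10.0, shared_time=15.0),
]

print(slowdown_spread(hetero))
print(priority_normalized_progress(homo))
```

A spread close to 1.0 for the first group and near-equal normalized progress values for the second group would indicate that a scheduler meets both goals on this toy input.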