A Power Provision and Capping Architecture for Large Scale Systems

Yongpeng Liu, Hong Zhu, Kai Lu, Yongyan Liu
DOI: https://doi.org/10.1109/ipdpsw.2012.117
2012-01-01
Abstract: The rapid growth of large-scale computing systems poses a serious challenge for power management, in which power provision and capping are essential. In this paper, we propose a new power provision and capping architecture to control the power consumption of large-scale clusters. In this architecture, performance-sensitive computation units are distinguished from those that have less impact on system performance. A subset of units is monitored, and their operating states are controlled to keep the whole system's total power consumption under budget. Two policies are designed and implemented to select the target subset of nodes for power regulation. One policy is state-based: it chooses the nodes running the most power-consuming job. The other is change-based: it chooses the nodes running the job whose power consumption is increasing most rapidly among all jobs. Experiments were conducted on the Tianhe-1A supercomputer to evaluate the effectiveness of these power capping solutions. They demonstrate that the new architecture can ensure power usage safety with only a negligible performance decline of about 2%.
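The two selection policies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `JobSample` record, the per-job power readings, the function names, and the simple "sum of job power versus budget" trigger are all assumptions made for illustration; the abstract does not specify how power is sampled or how the budget check is performed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class JobSample:
    """Hypothetical per-job power reading (not from the paper)."""
    job_id: str
    nodes: List[str]      # nodes the job runs on
    prev_power_w: float   # job power at the previous sample, in watts
    curr_power_w: float   # job power at the current sample, in watts


def select_state_based(jobs: List[JobSample]) -> List[str]:
    """State-based policy: pick the nodes of the job drawing the most power now."""
    target = max(jobs, key=lambda j: j.curr_power_w)
    return target.nodes


def select_change_based(jobs: List[JobSample]) -> List[str]:
    """Change-based policy: pick the nodes of the job whose power rose the most."""
    target = max(jobs, key=lambda j: j.curr_power_w - j.prev_power_w)
    return target.nodes


def nodes_to_regulate(
    jobs: List[JobSample],
    budget_w: float,
    policy: Callable[[List[JobSample]], List[str]],
) -> List[str]:
    """Return nodes to regulate, or an empty list if total power is within budget.

    Assumption: total power is approximated as the sum of per-job readings.
    """
    total_w = sum(j.curr_power_w for j in jobs)
    if total_w <= budget_w:
        return []
    return policy(jobs)
```

With one job drawing 1000 W (up from 900 W) and another drawing 700 W (up from 300 W), the state-based policy would select the first job's nodes, while the change-based policy would select the second job's nodes because its draw rose by 400 W rather than 100 W.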