Warp-Aware Adaptive Energy Efficiency Calibration for Multi-GPU Systems

Zhuowei Wang, Xiaoyu Song, Lianglun Cheng, Hai Wan, Wuqing Zhao, Tao Wang
DOI: https://doi.org/10.1109/tcad.2022.3200528
IF: 2.9
2023-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: GPU accelerators are used on a massive scale in high-performance computing systems. The breakdown of Dennard scaling has led to power and thermal constraints that limit the performance of such systems, so both higher performance and better energy efficiency are in demand. This article presents a multilayer low-power optimization method for warp-level and task-level parallelism. We present a dynamic frequency regulation scheme for performance parameters that covers both load-balanced and load-imbalanced cases. The method monitors energy parameters at runtime and adaptively adjusts the voltage level to preserve performance while reducing energy. Experimental results show that multilayer low-power optimization with dynamic frequency regulation achieves a 40% reduction in energy consumption with only 1.6% performance degradation, reducing maximum energy consumption by 59%. Compared with single-layer energy optimization, it saves about a further 30% of energy consumption.