Dynamic Front-End Sharing in Graphics Processing Units

Tao Zhang, Xiaoyao Liang
DOI: https://doi.org/10.1109/iccd.2014.6974695
2014-01-01
Abstract: A modern GPU consumes several times the power of a multi-core CPU while delivering much higher processing throughput. Researchers have proposed various architectural innovations to improve its energy efficiency. We observe that different streaming multiprocessors (SMs) in a GPU tend to exhibit very similar behavior for many GPU workloads. If multiple SMs can be grouped together and work in a synchronous manner, it is possible to save energy by sharing the front-end of the SM pipeline, including the instruction fetch, decode, and schedule units. For efficient flow control and program correctness, the proposed architecture identifies unfavorable conditions and ungroups the SMs when necessary. However, sharing the pipeline front-end between multiple SMs brings architectural challenges. In this paper, we present the design, implementation, and evaluation of such an architecture. Detailed experimental results show that a 33.7% reduction in front-end energy and a 6.8% reduction in total GPU energy can be achieved.