A Comprehensive Study of Executing Ahead Mechanism for In-Order Microprocessors

WANG Xiaoyin, TONG Dong, DANG Xianglei, LU Junlin, CHENG Xu
DOI: https://doi.org/10.13209/j.0479-8023.2011.006
2011-01-01
Abstract: The authors explore the design space of in-order executing ahead processors, and conduct sensitivity analysis of the executing ahead mechanism to the cache hierarchy and memory latency. It is demonstrated that reusing the pre-executed results is highly effective in improving performance and reducing energy consumption. The results also show that propagating valid data values between stores and dependent loads with a small store cache increases performance significantly. An in-order executing ahead processor with a 32-entry store cache and a 128-entry FIFO for preserving and reusing results increases performance by 24.07% over the baseline processor, with an energy overhead of 4.93%. Furthermore, it is revealed that executing ahead is necessary for hiding memory access latencies even with a very large cache hierarchy. With increasing memory latency, the performance and energy-efficiency benefits provided by executing ahead are more significant.