CHAM: Improving Prefetch Efficiency Using a Composite Hierarchy-Aware Method
Cheng Qian,Libo Huang,Qi Yu,Zhiying Wang
DOI: https://doi.org/10.1142/s0218126618501141
2017-01-01
Journal of Circuits Systems and Computers
Abstract:Hardware prefetching has always been a crucial mechanism to improve processor performance. However, an efficient prefetch operation requires a guarantee of high prefetch accuracy; otherwise, it may degrade system performance. Prior studies propose an adaptive priority controlling method to make better use of prefetch accesses, which improves performance in two-level cache systems. However, this method does not perform well in a more complex memory hierarchy, such as a three-level cache system. Thus, it is still necessary to explore the efficiency of prefetch, in particular, in complex hierarchical memory systems. In this paper, we propose a composite hierarchy-aware method called CHAM, which works at the middle level cache (MLC). By using prefetch accuracy as an evaluation criterion, CHAM improves the efficiency of prefetch accesses based on (1) a dynamic adaptive prefetch control mechanism to schedule the priority and data transfer of prefetch accesses across the cache hierarchical levels in the runtime and (2) a prefetch efficiency-oriented hybrid cache replacement policy to select the most suitable policy. To demonstrate its effectiveness, we have performed extensive experiments on 28 benchmarks from SPEC CPU2006 and two benchmarks from BioBench. Compared with a similar adaptive method, CHAM improves the MLC demand hit rate by 9.2% and an improvement of 1.4% in system performance on average in a single-core system. On a 4-core system, CHAM improves the demand hit rate by 33.06% and improves system performance by 10.1% on average.