Evaluating the Memory System Performance of Software-Initiated Inter-core LLC Prepushing

Min Cai, Zhimin Gu
DOI: https://doi.org/10.1109/ispaw.2011.56
2011-01-01
Abstract: Data prefetching speculatively issues memory requests for data needed later by the main computation, and can therefore increase stress on the limited resources of chip multiprocessors. If not properly used, it can cause harmful effects such as cache pollution and wasted bandwidth. Accurate, fine-grained measurement of the related runtime metrics is therefore an important first step in reducing harmful prefetches and increasing memory-level parallelism (MLP) on chip multiprocessors. However, the required measurement is prohibitively difficult on real machines without introducing nontrivial performance overhead, which in turn leads to inaccurate results. In this paper, we use cycle-accurate full-system simulation to study the memory system performance of our previously proposed data prefetching technique with control of harmful prefetches on chip multiprocessors: software-initiated inter-core LLC prepushing. We modified the GEMS multiprocessor simulator to support trace-based measurement and offline analysis of MLP, DRAM bank-level parallelism (BLP), and their relationship with software-initiated inter-core LLC prepushing. Results show that prepushing achieves speedups of 1.628, 1.019 and 1.032 in mst, em3d and 429.mcf, respectively. Average L2 MLP changes by 26%, 0.3% and -1% in mst, em3d and 429.mcf, respectively.
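To make the offline MLP analysis mentioned in the abstract concrete, the following is a minimal sketch of how an average MLP figure could be computed from a miss trace. It assumes a common definition of MLP (the mean number of outstanding misses over the cycles in which at least one miss is outstanding) and a hypothetical trace format of `(issue_cycle, completion_cycle)` pairs; neither the function name `average_mlp` nor the trace layout is taken from the paper or from GEMS.

```python
from typing import List, Tuple


def average_mlp(misses: List[Tuple[int, int]]) -> float:
    """Average number of outstanding misses over cycles with >= 1 miss.

    `misses` is a hypothetical trace: each entry is a half-open interval
    [issue_cycle, completion_cycle) during which one miss is outstanding.
    """
    events = []
    for start, end in misses:
        if end <= start:
            raise ValueError("completion_cycle must be after issue_cycle")
        events.append((start, 1))   # a miss becomes outstanding
        events.append((end, -1))    # a miss completes
    # Sorting tuples places completions before issues at the same cycle,
    # which matches the half-open interval convention.
    events.sort()

    outstanding = 0   # misses in flight at the current sweep position
    busy_cycles = 0   # cycles with at least one outstanding miss
    miss_cycles = 0   # sum over busy cycles of the outstanding count
    prev_cycle = None
    for cycle, delta in events:
        if prev_cycle is not None and outstanding > 0:
            span = cycle - prev_cycle
            busy_cycles += span
            miss_cycles += outstanding * span
        outstanding += delta
        prev_cycle = cycle

    return miss_cycles / busy_cycles if busy_cycles else 0.0


if __name__ == "__main__":
    # Two misses that overlap for 50 of their 150 busy cycles.
    print(average_mlp([(0, 100), (50, 150)]))
```

Under this definition, fully serialized misses yield an MLP of 1.0, and any overlap between misses raises the value above 1.0. A different MLP definition or trace format would require adjusting the sketch accordingly.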