An Improved Scheme of Victim Replication in Tiled Chip Multiprocessors

Qianqian Wu, Zhenzhou Ji
DOI: https://doi.org/10.1109/ICCSD.2019.8842919
2019-01-01
Abstract: A last-level cache (LLC) in the shared configuration increases effective cache capacity by not allowing replication, but it incurs long on-chip access latency when the data resides on a remote tile. The previously proposed victim replication (VR) scheme replicates victims evicted from L1 into the local LLC slice in order to reduce the on-chip access latency of subsequent L1 misses. However, VR overlooks how replication is affected by locality at all cache levels and by the L1 victim re-reference interval, so it creates many useless replicas, which limits its performance improvement. In this paper, we propose a novel victim selective replication scheme based on L1 temporal locality, LLC reuse locality, and the L1 victim re-reference interval (VSR_TRV). We selectively replicate a victim that is detected as a reuse within a short L1 victim re-reference interval, or that is recognized as a first-time LLC access but has received only one L1 access, and we filter out the replication of other victims recognized as first-time LLC accesses. The experimental results show that VSR_TRV improves performance by 4.67% on average and by up to 11.22% over the previously proposed VR scheme. In addition, our proposal incurs only 1.48% storage overhead relative to the baseline system.
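The selection rule stated in the abstract can be read as a small decision function. The following is a minimal sketch of that reading, not the authors' implementation: the `VictimInfo` fields, the function name `should_replicate`, and the numeric `SHORT_INTERVAL_THRESHOLD` are all hypothetical, since the abstract does not say how reuse is detected, how the interval is measured, or what counts as "short". The sketch also assumes that a reused victim with a long re-reference interval is not replicated, which the abstract does not state explicitly.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the abstract does not define what a "short"
# L1 victim re-reference interval is, nor its unit.
SHORT_INTERVAL_THRESHOLD = 64


@dataclass
class VictimInfo:
    """Per-line state consulted when a line is evicted from L1 (assumed fields)."""
    llc_reused: bool       # True: detected as a reuse; False: first-time LLC access
    l1_access_count: int   # accesses the line received while resident in L1
    reref_interval: int    # estimated L1 victim re-reference interval


def should_replicate(v: VictimInfo, threshold: int = SHORT_INTERVAL_THRESHOLD) -> bool:
    """Decide whether an L1 victim is replicated into the local LLC slice."""
    if v.llc_reused:
        # Reuse case: replicate only when the re-reference interval is short.
        # Skipping the long-interval case is an assumption of this sketch.
        return v.reref_interval <= threshold
    # First-time LLC access: replicate only if the line saw exactly one
    # L1 access; all other first-time victims are filtered out.
    return v.l1_access_count == 1


# Usage: a reused line with a short interval is replicated,
# a first-time line with several L1 accesses is filtered out.
print(should_replicate(VictimInfo(True, 3, 10)))
print(should_replicate(VictimInfo(False, 4, 10)))
```

In hardware, this decision would be made from a few bits of per-line metadata rather than a Python object; the sketch only illustrates the branching order described in the abstract.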