SMART: on simultaneously marching racetracks to improve the performance of racetrack-based main memory

Xiangjun Peng, Ming-Chang Yang, Ho Ming Tsui, Chi Ngai Leung, Wang Kang
DOI: https://doi.org/10.1145/3489517.3530538
2022-01-01
Abstract: RaceTrack Memory (RTM) is a promising medium for modern Main Memory subsystems. However, the "shift-before-access" principle, which is inherent to RTM, adds considerable overhead to access latency. To gain more insight into how shift overheads can be mitigated, this work characterizes the access patterns exhibited by state-of-the-art RTM-based Main Memory and observes that they mismatch the granularity of shift commands (i.e., a group of RaceTracks called a Domain Block Cluster (DBC)). Based on this characterization, we propose a novel mechanism called SMART, which simultaneously and proactively marches all DBCs within a subarray, so that subsequent accesses to other DBCs can be served without additional shift commands. Evaluation results show that, averaged across 15 real-world workloads, SMART outperforms other state-of-the-art proposals of RTM-based Main Memory by at least 1.53X in terms of total execution time, on two different generations of RTM technologies.
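To make the idea of marching all DBCs together concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes one access port per DBC, a cost of one shift step per domain position moved, and that every DBC in the subarray starts aligned at domain 0. The function `count_shifts`, its parameters, and the sample trace are hypothetical names introduced only for illustration; the model ignores energy, timing, and controller details.

```python
from typing import List, Tuple

# Toy model (an assumption, not the paper's design): each DBC has a single
# access port, and moving the port alignment by one domain costs one shift step.
Access = Tuple[int, int]  # (dbc_index, domain_index)


def count_shifts(accesses: List[Access], num_dbcs: int, lockstep: bool) -> int:
    """Return the total number of shift steps needed to serve `accesses`.

    lockstep=False: only the accessed DBC is shifted (per-DBC shifting).
    lockstep=True:  every DBC in the subarray is marched together, so all
                    DBCs always share the same alignment.
    """
    positions = [0] * num_dbcs  # domain currently aligned with each DBC's port
    total = 0
    for dbc, domain in accesses:
        # Shift-before-access: align the requested domain with the port first.
        total += abs(domain - positions[dbc])
        if lockstep:
            positions = [domain] * num_dbcs  # all DBCs moved by the same shift
        else:
            positions[dbc] = domain  # only the accessed DBC moved
    return total


# Hypothetical trace: consecutive accesses touch different DBCs at
# nearby domain offsets.
trace = [(0, 5), (1, 5), (2, 5), (3, 6)]
per_dbc_steps = count_shifts(trace, num_dbcs=4, lockstep=False)
lockstep_steps = count_shifts(trace, num_dbcs=4, lockstep=True)
print(per_dbc_steps, lockstep_steps)
```

In this toy trace, per-DBC shifting pays the full distance again for each newly touched DBC, while lockstep marching pays it once and then only the small remaining offset. Real workloads with less aligned access patterns would benefit less under this simplified model.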