Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access

Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang
2024-04-17
Abstract: The growing memory demands of modern applications have driven the adoption of far memory technologies in data centers to provide cost-effective, high-capacity memory solutions. However, far memory presents new performance challenges because its access latencies are significantly longer and more variable than those of local DRAM. For applications to achieve acceptable performance on far memory, a high degree of memory-level parallelism (MLP) is needed to tolerate the long access latency. While modern out-of-order processors can exploit a certain degree of MLP, they are constrained by resource limitations and hardware complexity. The key obstacle is the synchronous memory access semantics of traditional load/store instructions, which occupy critical hardware resources for the full duration of each access. The longer latencies of far memory exacerbate this limitation.
Hardware Architecture