Toward CXL-Native Memory Tiering Via Device-Side Profiling

Zhe Zhu, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yan Xiong, Guangyu Sun
DOI: https://doi.org/10.48550/arxiv.2403.18702
2024-01-01
Abstract: The Compute Express Link (CXL) interconnect has provided the ability to integrate diverse memory types into servers via byte-addressable SerDes links. Harnessing the full potential of such heterogeneous memory systems requires efficient memory tiering. However, existing research in this domain has been constrained by low-resolution and high-overhead memory access profiling techniques. To address this critical challenge, we propose to enhance existing memory tiering systems with a novel NeoMem solution. NeoMem offloads memory profiling functions to device-side controllers, integrating a dedicated hardware unit called NeoProf. NeoProf readily tracks memory access and provides the operating system with crucial page hotness statistics and other useful system state information. On the OS kernel side, we introduce a revamped memory-tiering strategy, enabling accurate and timely hot page promotion based on NeoProf statistics. We implement NeoMem on a real CXL-enabled FPGA platform and Linux kernel v6.3. Comprehensive evaluations demonstrate that NeoMem achieves 32% to 67% geomean speedup over several existing memory tiering solutions.