Chenxi Wang, Yifan Qiao, Haoran Ma, Shi Liu, Yiying Zhang, Wenguang Chen, Ravi Netravali, Miryung Kim, Guoqing Harry Xu
Abstract: Remote memory techniques for datacenter applications have recently gained a great deal of popularity. Existing remote memory techniques focus on the efficiency of a single application setting only. However, when multiple applications co-run on a remote-memory system, significant interference could occur, resulting in unexpected slowdowns even if the same amounts of physical resources are granted to each application. This slowdown stems from massive sharing in applications' swap data paths. Canvas is a redesigned swap system that fully isolates swap paths for remote-memory applications. Canvas allows each application to possess its dedicated swap partition, swap cache, prefetcher, and RDMA bandwidth. Swap isolation lays a foundation for adaptive optimization techniques based on each application's own access patterns and needs. We develop three such techniques: (1) adaptive swap entry allocation, (2) semantics-aware prefetching, and (3) two-dimensional RDMA scheduling. A thorough evaluation with a set of widely deployed applications demonstrates that Canvas minimizes performance variation and dramatically reduces performance degradation.
Operating Systems; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture
What problem does this paper attempt to address?
This paper addresses the performance interference that arises when multiple applications co-run on a remote-memory system. Specifically:
1. **Severe lock contention**: When multiple applications share swap resources (such as swap partitions and RDMA), they must frequently allocate swap entries from shared structures. The resulting lock contention reduces throughput and prevents RDMA bandwidth from being fully utilized. For example, during windows of frequent remote access, an application may spend up to 70% of its time obtaining swap entries.
2. **Uncontrolled use of swap resources** (e.g., RDMA bandwidth): Shared RDMA bandwidth is often dominated by applications whose many threads perform frequent remote accesses simultaneously. For example, aggressively fetching or prefetching pages for one application can disproportionately shrink the bandwidth available to others. Moreover, even within a single application, competition between prefetching and demand swapping can lengthen fault handling or delay prefetches, so pages are not brought back in time.
3. **Reduced prefetching efficiency**: Current kernel prefetchers rely on low-level access patterns (such as sequential or strided), which suit applications that use arrays extensively. However, many cloud applications are written in high-level managed languages (such as Java or Python), and their accesses come from multiple threads or exhibit pointer-chasing behavior rather than sequential or strided patterns. When multiple applications co-run, a single shared prefetching strategy therefore struggles to work effectively. For example, co-running Spark with native applications reduces the prefetching contribution of Leap by 3.19×.
To address these problems, the paper proposes Canvas, a redesigned swap system that fully isolates swap paths by giving each application a dedicated swap partition, swap cache, prefetcher, and RDMA bandwidth. On top of this isolation, Canvas develops three adaptive optimization techniques: (1) adaptive swap entry allocation, (2) semantics-aware prefetching, and (3) two-dimensional RDMA scheduling. Each technique is tuned to an application's own access patterns and needs, which minimizes performance variation and significantly reduces performance degradation.
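The bandwidth problem in item 2 has two sides: contention across applications, and contention between demand swapping and prefetching inside one application. The toy sketch below shows one way a scheduler could handle both. It is not Canvas's implementation. It assumes that the first dimension is weighted fair sharing across applications and the second is priority of demand requests over prefetch requests within an application. The names `AppQueues`, `pick_next`, and `drain`, as well as the weights, are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AppQueues:
    """Per-application RDMA request queues (hypothetical structure)."""
    weight: int                                      # assumed bandwidth share
    demand: deque = field(default_factory=deque)     # page-fault (demand) requests
    prefetch: deque = field(default_factory=deque)   # prefetch requests
    served: int = 0                                  # requests dispatched so far


def pick_next(apps):
    """Pick the next request to dispatch, or None if all queues are empty.

    Dimension 1 (across applications): choose the backlogged application
    with the smallest served/weight ratio, a simple weighted-fair rule.
    Ties are broken by application name so the order is deterministic.
    Dimension 2 (within an application): demand goes before prefetch.
    """
    backlogged = {n: q for n, q in apps.items() if q.demand or q.prefetch}
    if not backlogged:
        return None
    name = min(backlogged, key=lambda n: (backlogged[n].served / backlogged[n].weight, n))
    q = backlogged[name]
    kind, queue = ("demand", q.demand) if q.demand else ("prefetch", q.prefetch)
    page = queue.popleft()
    q.served += 1
    return name, kind, page


def drain(apps):
    """Dispatch every queued request and return the dispatch order."""
    order = []
    while (req := pick_next(apps)) is not None:
        order.append(req)
    return order


if __name__ == "__main__":
    apps = {
        "spark": AppQueues(weight=1, demand=deque([1]), prefetch=deque([2, 3])),
        "native": AppQueues(weight=1, demand=deque([10, 11])),
    }
    for name, kind, page in drain(apps):
        print(name, kind, page)
```

With equal weights, the two applications alternate while both have work queued, and the prefetches of `spark` are dispatched only after its demand request. A real scheduler would also have to account for request sizes, timeliness of prefetches, and in-flight RDMA operations, none of which this sketch models.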