Abstract: Modern datacenter applications are increasingly built using a microservices architecture. These microservices communicate with each other through datacenter RPCs. RPC's pass-by-value semantics incur redundant data movement across the network, especially for data-intensive applications. Naively introducing a shared global address space to datacenter RPC does not work, because it would couple microservices and require them to handle data consistency, significantly complicating the development and deployment of applications. Fortunately, the modern datacenter is embracing disaggregated memory (DM). In a DM-enabled datacenter, the servers running the microservices can all be connected to one global disaggregated memory pool, so pass-by-value semantics can be replaced by pass-by-reference. However, prior work on DM requires complicated synchronization primitives to share data across physical machines, so naively adopting it in datacenter RPC would harm microservices' agility and modularity. To this end, we present DmRPC, a DM-aware datacenter RPC for data-intensive datacenter applications. First, DmRPC introduces a DM-aware shared global address space to provide pass-by-reference semantics to datacenter RPC, thus alleviating the redundant data movement issue. Second, DmRPC adopts a copy-on-write mechanism so that application logic does not have to handle data consistency, while still guaranteeing high performance. We have applied DmRPC to two different implementations of DM: one network-based (DmRPC-net) and the other CXL-based (DmRPC-CXL). Our evaluations on synthetic 7-tier microservices workloads show that DmRPC-net (or DmRPC-CXL) achieves 4.2× (or 8.3×) higher throughput and 1.1× (or 1.7×) lower average latency than the baseline. On DeathStarBench, a widely used microservice benchmark, DmRPC-net achieves 3.1× higher throughput and 2.5× lower average latency than the baseline.
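The combination of pass-by-reference and copy-on-write described in the abstract can be illustrated with a small in-process sketch. This is not DmRPC's API or implementation: `SharedPool`, `Ref`, `pass_by_reference`, and the `bytes_copied` counter are hypothetical names invented here, and a Python dictionary stands in for the disaggregated memory pool. The sketch only shows the idea that forwarding a reference moves no payload bytes, and that a writer gets a private copy so other holders never observe its changes.

```python
class SharedPool:
    """Toy stand-in for a global disaggregated memory pool (hypothetical)."""

    def __init__(self):
        self._pages = {}       # page id -> immutable bytes
        self._refcount = {}    # page id -> number of live references
        self._next_id = 0
        self.bytes_copied = 0  # payload bytes duplicated so far

    def _alloc(self, data):
        pid = self._next_id
        self._next_id += 1
        self._pages[pid] = data
        self._refcount[pid] = 1
        return pid

    def put(self, data):
        """Store a payload once and hand back a reference to it."""
        return Ref(self, self._alloc(bytes(data)))

    def share(self, pid):
        """Create another reference to the same page; no data is copied."""
        self._refcount[pid] += 1
        return Ref(self, pid)

    def read(self, pid):
        return self._pages[pid]

    def write(self, pid, offset, data):
        """Write through a reference, copying first if the page is shared."""
        if self._refcount[pid] > 1:
            # Copy-on-write: detach the writer onto a private page.
            self._refcount[pid] -= 1
            self.bytes_copied += len(self._pages[pid])
            pid = self._alloc(self._pages[pid])
        old = self._pages[pid]
        self._pages[pid] = old[:offset] + data + old[offset + len(data):]
        return pid  # page that now backs the writer's view


class Ref:
    """What an RPC argument would carry instead of the payload itself."""

    def __init__(self, pool, pid):
        self._pool = pool
        self._pid = pid

    def pass_by_reference(self):
        return self._pool.share(self._pid)

    def read(self):
        return self._pool.read(self._pid)

    def write(self, offset, data):
        self._pid = self._pool.write(self._pid, offset, data)


if __name__ == "__main__":
    pool = SharedPool()
    original = pool.put(b"x" * 4096)

    # Forward the reference through a 7-tier chain without touching the data.
    ref = original
    for _ in range(7):
        ref = ref.pass_by_reference()
    print("bytes copied while forwarding:", pool.bytes_copied)

    # The last tier modifies its view; earlier tiers keep the original bytes.
    ref.write(0, b"Y")
    print("original unchanged:", original.read()[:1] == b"x")
```

In this sketch the callee never needs locks or explicit synchronization: isolation comes from the private copy made on the first write, which is the property the abstract attributes to the copy-on-write design.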

DmRPC: Disaggregated Memory-aware Datacenter RPC for Data-intensive Applications
