Optimizing Distributed Protocols with Query Rewrites [Technical Report]

David Chu, Rithvik Panchapakesan, Shadaj Laddad, Lucky Katahanas, Chris Liu, Kaushik Shivakumar, Natacha Crooks, Joseph M. Hellerstein, Heidi Howard
DOI: https://doi.org/10.1145/3639257
2024-04-03
Abstract: Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale. New scalable distributed protocols are developed through careful analysis and rewrites, but this process is ad hoc and error-prone. This paper presents an approach for scaling any distributed protocol by applying rule-driven rewrites, borrowing from query optimization. Distributed protocol rewrites entail a new burden: reasoning about spatiotemporal correctness. We leverage order-insensitivity and data dependency analysis to systematically identify correct coordination-free scaling opportunities. We apply this analysis to create preconditions and mechanisms for coordination-free decoupling and partitioning, two fundamental vertical and horizontal scaling techniques. Manual rule-driven applications of decoupling and partitioning improve the throughput of 2PC by $5\times$ and Paxos by $3\times$, and match state-of-the-art throughput in recent work. These results point the way toward automated optimizers for distributed protocols based on correct-by-construction rewrite rules.
Distributed, Parallel, and Cluster Computing; Databases