Modeling the Raft Distributed Consensus Protocol in LNT

Hugues Evrard
DOI: https://doi.org/10.4204/EPTCS.316.2
2020-04-28
Abstract: Consensus protocols are crucial for reliable distributed systems as they let them cope with network and server failures. For decades, most consensus protocols have been designed as variations of the seminal Paxos, yet in 2014 Raft was presented as a new, "understandable" protocol, meant to be easier to implement than the notoriously subtle Paxos family. Raft has since been used in various industrial projects, e.g. HashiCorp's Consul or etcd (used by Google's Kubernetes). The correctness of Raft is established via a manual proof, based on a TLA+ specification of the protocol. This paper reports our experience in modeling Raft in the LNT process algebra. We found a couple of issues with the original TLA+ specification of Raft, which has been corrected since. More generally, this exercise offers a great opportunity to discuss how to best use the features of the LNT formal language and the associated CADP verification toolbox to model distributed protocols, including network and server failures.
Software Engineering; Distributed, Parallel, and Cluster Computing