Abstract: Fault-tolerant replicated database systems consume less energy than the compute-intensive proof-of-work blockchain. Thus, they are promising technologies for the building blocks that assemble global financial infrastructure. To facilitate global scaling, clustered replication protocols are essential in orchestrating nodes into clusters based on proximity. However, the existing approaches often assume a homogeneous and fixed model in which the number of nodes across clusters is the same and fixed, and often limited to a fail-stop fault model. This paper presents heterogeneous and reconfigurable clustered replication for the general environment with arbitrary failures. In particular, we present AVA, a fault-tolerant reconfigurable geo-replication that allows dynamic membership: replicas are allowed to join and leave clusters. We formally state and prove the safety and liveness properties of the protocol. Furthermore, our replication protocol is consensus-agnostic, meaning each cluster can utilize any local replication mechanism. In our comprehensive evaluation, we instantiate our replication with both HotStuff and BFT-SMaRt. Experiments on geo-distributed deployments on Google Cloud demonstrate that members of clusters can be reconfigured without considerably affecting transaction processing, and that heterogeneity of clusters may significantly improve throughput.

What problem does this paper attempt to address?

The paper addresses the limited scalability and the missing dynamic membership management of existing Byzantine fault-tolerant replication protocols. Specifically:

1. **Existing clustered replication protocols are usually homogeneous and fixed**: every cluster has the same, fixed number of nodes, which limits the flexibility and adaptability of the system.
2. **Existing systems are often limited to a specific fault model (such as the fail-stop model)** and cannot handle arbitrary failures.
3. **Lack of support for heterogeneous environments**: the number of active nodes may differ across regions, but existing systems do not support this heterogeneity well.
4. **Lack of dynamic membership management**: existing clustered replication protocols usually do not allow nodes to join or leave a cluster dynamically, which limits the decentralization and flexibility of the system.

To solve these problems, the paper proposes AVA (Adaptive and Versatile Architecture), a fault-tolerant and reconfigurable geo-replication protocol that allows nodes to join and leave clusters dynamically and supports clustered replication in heterogeneous environments. The main features of AVA include:

- **Support for heterogeneous clusters**: different clusters can have different numbers of nodes, which better matches the distribution of resources across regions.
- **Dynamic membership management**: nodes can join and leave clusters dynamically, which increases the flexibility and decentralization of the system.
- **Safety and liveness guarantees**: the safety and liveness properties of the protocol are formally stated and proved, covering operation in the presence of arbitrary failures.
- **Consensus-agnostic design**: each cluster can use any local replication mechanism (the evaluation instantiates the protocol with HotStuff and BFT-SMaRt), which increases the generality and flexibility of the system.

Through these improvements, AVA aims to provide an efficient, flexible and secure solution for global distributed systems, especially suitable for building global financial infrastructure.
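To make the idea of heterogeneous clusters with dynamic membership concrete, the following is a minimal sketch, not the AVA protocol itself. The `Cluster` class, its method names, and the replica names are hypothetical. The fault threshold uses the standard Byzantine bound n ≥ 3f + 1, which is an assumption here; the summary above does not state the thresholds AVA actually uses.

```python
from dataclasses import dataclass, field


def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (standard BFT bound; an assumption,
    not a figure taken from the paper)."""
    return max((n - 1) // 3, 0)


@dataclass
class Cluster:
    """Hypothetical model of one regional cluster whose membership can change."""
    region: str
    members: set = field(default_factory=set)

    def join(self, replica: str) -> None:
        # A replica joins the cluster; the fault threshold is re-derived from the new size.
        self.members.add(replica)

    def leave(self, replica: str) -> None:
        # A replica leaves the cluster; removing an unknown replica is a no-op.
        self.members.discard(replica)

    @property
    def size(self) -> int:
        return len(self.members)

    @property
    def fault_threshold(self) -> int:
        return max_byzantine_faults(self.size)

    @property
    def quorum(self) -> int:
        # Smallest quorum size such that any two quorums intersect in at least
        # f + 1 replicas: ceil((n + f + 1) / 2).
        return (self.size + self.fault_threshold) // 2 + 1


# Heterogeneous deployment: clusters of different sizes in different regions.
asia = Cluster("asia", {"a1", "a2", "a3", "a4"})
europe = Cluster("europe", {f"e{i}" for i in range(7)})

# Reconfiguration: three replicas join the smaller cluster, one leaves the larger.
for name in ("a5", "a6", "a7"):
    asia.join(name)
europe.leave("e0")
```

In this sketch each cluster derives its own threshold and quorum size from its current membership, so clusters of different sizes can coexist and change over time. How AVA coordinates such changes across clusters safely is the subject of the paper and is not modeled here.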

AVA: Fault-tolerant Reconfigurable Geo-Replication on Heterogeneous Clusters
