ScaleSFL: A Sharding Solution for Blockchain-Based Federated Learning

Evan Madill, Ben Nguyen, Carson K. Leung, Sara Rouhani
DOI: https://doi.org/10.1145/3494106.3528680
2022-04-04
Abstract: Blockchain-based federated learning has gained significant interest over the last few years with the increasing concern for data privacy, advances in machine learning, and blockchain innovation. However, gaps in security and scalability hinder the development of real-world applications. In this study, we propose ScaleSFL, which is a scalable blockchain-based sharding solution for federated learning. ScaleSFL supports interoperability by separating the off-chain federated learning component in order to verify model updates instead of controlling the entire federated learning flow. We implemented ScaleSFL as a proof-of-concept prototype system using Hyperledger Fabric to demonstrate the feasibility of the solution. We present a performance evaluation of results collected through Hyperledger Caliper benchmarking tools conducted on model creation. Our evaluation results show that sharding can improve validation performance linearly while remaining efficient and secure.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the security and scalability problems of blockchain-based federated learning (BFL). Specifically:

1. **Security issues**: Although traditional federated learning frameworks help protect data privacy, two main security risks remain:
   - **Data leakage risk**: Model updates may indirectly leak the local data used to train them. Statistical analysis and data mining techniques can expose this information, and users may be de-anonymized.
   - **Single point of failure**: Centralized model aggregation relies on a trusted central server to aggregate model updates. That server is a single point of failure, which is an obvious weakness in a decentralized environment.
2. **Scalability issues**: As the number of participating nodes grows, the computation and communication overhead of verifying and aggregating model updates in traditional federated learning increases significantly, and system performance declines. This bottleneck is most pronounced in large-scale distributed systems.

To solve these problems, the authors propose a sharding solution named ScaleSFL. Its main features are:

- **Sharding mechanism**: The network is divided into multiple shards. Each shard independently verifies and partially aggregates model updates, which reduces the overall computation and communication overhead.
- **Committee consensus**: Each shard elects a committee that is responsible for verifying model updates and generating new blocks. Committee members ensure the validity and security of model updates through a local consensus mechanism.
- **Flexible security strategy**: ScaleSFL supports pluggable poisoning mitigation and detection strategies to handle false or harmful model updates submitted by malicious clients.

Through these designs, ScaleSFL aims to improve the security and scalability of blockchain-based federated learning systems, enabling them to operate more effectively in practical applications.
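
The sharding, committee, and pluggable-validation ideas summarized above can be illustrated with a minimal Python sketch. It is not the paper's Hyperledger Fabric implementation. The function names, the majority-vote rule, the L2-norm check, and the sample-weighted averaging are all assumptions chosen for illustration. Each shard filters its updates through a committee vote and aggregates the accepted ones, and a final step then combines the shard-level models.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# An update is (model weights, number of local training samples).
Update = Tuple[List[float], int]
# A committee member is modeled as a function that endorses or rejects weights.
Validator = Callable[[List[float]], bool]


def norm_bound_validator(max_norm: float) -> Validator:
    """Hypothetical pluggable check: endorse only updates with a small L2 norm."""
    def check(weights: List[float]) -> bool:
        return sum(w * w for w in weights) ** 0.5 <= max_norm
    return check


def committee_accepts(weights: List[float], committee: Sequence[Validator]) -> bool:
    """Assumed rule: accept when a strict majority of members endorse the update."""
    votes = sum(1 for member in committee if member(weights))
    return votes * 2 > len(committee)


def weighted_average(updates: List[Update]) -> Update:
    """Sample-weighted average of updates; returns (weights, total samples)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


def aggregate_shard(updates: List[Update],
                    committee: Sequence[Validator]) -> Optional[Update]:
    """Verify each update inside the shard, then partially aggregate the rest."""
    accepted = [u for u in updates if committee_accepts(u[0], committee)]
    return weighted_average(accepted) if accepted else None


def aggregate_global(shards: Dict[str, List[Update]],
                     committee: Sequence[Validator]) -> List[float]:
    """Combine the shard-level models into one global model."""
    shard_models = [m for m in (aggregate_shard(u, committee)
                                for u in shards.values()) if m is not None]
    if not shard_models:
        raise ValueError("no shard produced an accepted model")
    return weighted_average(shard_models)[0]


# Three committee members with different (hypothetical) norm bounds.
committee = [norm_bound_validator(10.0),
             norm_bound_validator(10.0),
             norm_bound_validator(200.0)]
shards = {
    "shard-a": [([1.0, 1.0], 10), ([3.0, 3.0], 30)],
    # The second update is endorsed by only one of three members, so it is dropped.
    "shard-b": [([2.0, 2.0], 20), ([100.0, 100.0], 5)],
}
global_model = aggregate_global(shards, committee)
print(global_model)
```

Because each shard reports its sample count along with its partial model, the two-level average in this sketch matches a flat sample-weighted average over all accepted updates, while each committee only has to verify the updates in its own shard.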