Federated Medical Learning Framework Based on Blockchain and Homomorphic Encryption

Xiaohui Yang, Chongbo Xing
DOI: https://doi.org/10.1155/2024/8138644
2024-01-07
Wireless Communications and Mobile Computing
Abstract: Federated learning-based privacy-preserving sharing of medical data can promote intelligent development in the medical industry. However, federated learning has security and privacy deficiencies of its own: it still suffers from a single point of failure and from privacy leakage through intermediate parameters. To address these problems, this paper proposes a privacy protection framework for medical data based on blockchain and cross-silo federated learning. Cross-silo federated learning establishes a collaborative training platform for multiple medical institutions and enhances the privacy of medical data, while blockchain and smart contracts decentralize federated learning, strengthening trust between mutually distrustful medical institutions and removing the single point of failure. In addition, a secure aggregation scheme is designed using threshold homomorphic encryption to prevent privacy leakage during parameter transmission. Experimental and analytical results show that the scheme's accuracy is consistent with that of the original federated learning scheme, that it effectively handles single-point failure and inference attacks, and that it improves system robustness, making it suitable for medical scenarios with stringent security and accuracy requirements.
computer science, information systems; telecommunications; engineering, electrical & electronic
What problem does this paper attempt to address?
The paper addresses privacy and security problems in medical data sharing, specifically within the Federated Learning (FL) framework:

1. **Trust issue**: Medical institutions participating in federated learning do not trust one another, so a secure cooperation mechanism is needed to protect data security and privacy.
2. **Single-point-of-failure issue**: The traditional federated learning framework relies on a fixed central server as the aggregator, so a security attack on that server, or physical damage to it, can cause training to fail.
3. **Inference attack**: The central server can infer a participant's original data by analyzing local model updates, thereby revealing patients' private information.

To solve these problems, the paper proposes a federated medical learning framework based on blockchain and homomorphic encryption. The main contributions are as follows:

- **Propose a medical data privacy protection framework based on blockchain and federated learning**, which provides a secure and trustworthy data-sharing platform for medical data and makes the sharing process tamper-proof and auditable.
- **Design a secure aggregation scheme based on threshold homomorphic encryption** to ensure the secure aggregation of model parameters and prevent the leakage of local data privacy during transmission.
- **Design smart contracts for secure upload and aggregation node selection**, which solve the single-point-of-failure problem in the federated learning process through the dual guarantees of blockchain and smart contracts, and use the IPFS file system to reduce the storage pressure on the blockchain.
- **Test and evaluate the proposed framework**, showing that it improves the privacy and security of medical data sharing while maintaining accuracy comparable to traditional federated learning schemes.

### Formula Summary

The key formulas in the paper are as follows:

1. **Model parameter update formula**:
   \[ \omega_i^t=\omega_i^{t-1}-\eta_i\nabla F_i(\omega_i^{t-1}) \]
   \[ LM(i)=\frac{n_i}{n}\omega_i^t \]
   where \(\omega_i^{t-1}\) represents the initial model parameters or the global model parameters decrypted in the previous round, \(\eta_i\) is the learning rate of the \(i\)-th participant, and \(\nabla F_i(\omega_i^{t-1})\) is the gradient of the \(i\)-th participant. From formula 2 onward, \(n\) denotes the modulus of the encryption scheme rather than the \(n\) in \(LM(i)\).
2. **Encrypted local model parameters** (written in the standard Paillier form, with \(x_i\) the random value chosen by the \(i\)-th participant):
   \[ ELM(i)=g^{LM(i)}\,x_i^{n}\bmod n^2 \]
3. **Aggregated global model parameters**:
   \[ EGM=\prod_{i=1}^{M}ELM(i)=g^{\sum_{i=1}^{M}LM(i)}\left(\prod_{i=1}^{M}x_i\right)^{n}\bmod n^2 \]
4. **Partially decrypted model parameters**:
   \[ PGM(i)=EGM^{SK_{CDO_i}}\bmod n^2 \]
5. **Final decryption result** (combining the partial decryptions of a qualifying subset \(S\) of participants; written in the standard threshold Paillier form):
   \[ DGM=\sum_{i=1}^{M}LM(i)=L\left(\prod_{j\in S}PGM(j)^{2\mu_{S,j}}\bmod n^2\right)\times\left(4(M!)^2\theta\right)^{-1}\bmod n \]
   where \(L(x)=\frac{x-1}{n}\) and \(\mu_{S,j}=M!\times\prod_{j'\in S\setminus\{j\}}\frac{j'}{j'-j}\in\mathbb{Z}\).

Together, these mechanisms address the privacy and security problems in medical data sharing described above.
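The additive property behind the aggregation formulas (multiplying ciphertexts adds the underlying plaintexts) can be illustrated with a minimal Python sketch. This is not the paper's implementation: it assumes plain single-key Paillier rather than the threshold variant, the common simplification \(g=n+1\), toy primes that are far too small for real use, and a hypothetical fixed-point factor `SCALE` for turning real-valued parameters into integers. All function names are illustrative.

```python
import math
import random

# Toy key material: tiny primes for illustration only (assumption, insecure).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1                                         # simplifying choice g = n + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
SCALE = 1000                                      # hypothetical fixed-point scale


def L(x):
    """L(x) = (x - 1) / n, exact for x congruent to 1 modulo n."""
    return (x - 1) // N


MU = pow(L(pow(G, LAM, N2)), -1, N)               # single-key decryption constant


def local_update(w_prev, grad, lr, n_i, n_total):
    """One gradient step, weighted by n_i / n and encoded as an integer."""
    w = w_prev - lr * grad
    return round(n_i / n_total * w * SCALE)


def encrypt(m, rng):
    """ELM = g^m * x^n mod n^2 with a random x coprime to n."""
    while True:
        x = rng.randrange(1, N)
        if math.gcd(x, N) == 1:
            break
    return (pow(G, m % N, N2) * pow(x, N, N2)) % N2


def aggregate(ciphertexts):
    """EGM = product of all ciphertexts mod n^2 (adds the plaintexts)."""
    egm = 1
    for c in ciphertexts:
        egm = (egm * c) % N2
    return egm


def decrypt(c):
    """Single-key decryption; maps the result back to a signed integer."""
    m = (L(pow(c, LAM, N2)) * MU) % N
    return m - N if m > N // 2 else m


if __name__ == "__main__":
    rng = random.Random(0)
    # (gradient, local sample count) per participant; values are made up.
    participants = [(0.2, 100), (-0.1, 300), (0.4, 600)]
    plain = [local_update(0.5, g, 0.1, n_i, 1000) for g, n_i in participants]
    encrypted = [encrypt(m, rng) for m in plain]
    global_scaled = decrypt(aggregate(encrypted))
    print(global_scaled / SCALE, sum(plain) / SCALE)
```

The aggregator in this sketch only ever multiplies ciphertexts, so it never sees an individual `LM(i)`. In the threshold setting summarized above, the single `decrypt` call would be replaced by partial decryptions from several participants that are then combined.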