Certifiably Byzantine-Robust Federated Conformal Prediction

Mintong Kang, Zhen Lin, Jimeng Sun, Cao Xiao, Bo Li
2024-06-04
Abstract: Conformal prediction has shown impressive capacity in constructing statistically rigorous prediction sets for machine learning models with exchangeable data samples. The siloed datasets, coupled with the escalating privacy concerns related to local data sharing, have inspired recent innovations extending conformal prediction into federated environments with distributed data samples. However, this framework for distributed uncertainty quantification is susceptible to Byzantine failures. A minor subset of malicious clients can significantly compromise the practicality of coverage guarantees. To address this vulnerability, we introduce a novel framework Rob-FCP, which executes robust federated conformal prediction, effectively countering malicious clients capable of reporting arbitrary statistics with the conformal calibration process. We theoretically provide the conformal coverage bound of Rob-FCP in the Byzantine setting and show that the coverage of Rob-FCP is asymptotically close to the desired coverage level. We also propose a malicious client number estimator to tackle a more challenging setting where the number of malicious clients is unknown to the defender and theoretically show its effectiveness. We empirically demonstrate the robustness of Rob-FCP against diverse proportions of malicious clients under a variety of Byzantine attacks on five standard benchmark and real-world healthcare datasets.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the impact of malicious clients on coverage guarantees in Federated Conformal Prediction (FCP) under Byzantine failures. Specifically, the coverage guarantees of standard FCP methods degrade significantly even when only a small number of clients are malicious. The paper proposes a new framework, Robust Federated Conformal Prediction (Rob-FCP), which aims to resist the influence of malicious clients and keep coverage close to the desired level even in their presence.

### Background and Problem

- **Background**: Conformal prediction is a statistical method for constructing prediction sets that contain the true outcome with a user-specified probability. Because data is often siloed and privacy concerns limit data sharing, federated learning has become an effective solution: it allows multiple clients to collaboratively train models without sharing raw data.
- **Problem**: In a federated environment, malicious or negligent clients may submit incorrect statistics, causing the coverage guarantees to fail. For example, hospital data may be contaminated through human negligence or intentional tampering, which affects the training and testing of the global model.

### Solution

- **Rob-FCP Framework**: This framework improves the robustness of prediction coverage through the following steps:
  1. **Detecting Malicious Clients**: Compute the non-conformity scores on each client's data and summarize them as a vector. Identify malicious clients from the distances between these vectors.
  2. **Excluding Malicious Clients**: Exclude the statistics of detected malicious clients when calculating empirical quantiles.
  3. **Performing Federated Conformal Prediction**: Use the quantile computed from the remaining clients to perform distributed conformal prediction.
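The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the function names are hypothetical, the vector representation is assumed to be a normalized histogram of scores in [0, 1], the number of malicious clients is assumed known, and the server pools raw scores for simplicity (a real federated deployment would exchange summary statistics instead). A client is flagged when its vector is, on average, far from its nearest neighbours.

```python
import numpy as np

def score_vector(scores, bins=10):
    """Summarize one client's non-conformity scores (assumed in [0, 1])
    as a normalized histogram. The histogram choice is an assumption."""
    hist, _ = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def detect_malicious(vectors, num_malicious):
    """Step 1: flag the clients whose vectors are farthest, on average,
    from their nearest neighbours (hypothetical detection rule)."""
    vectors = np.asarray(vectors, dtype=float)
    k = len(vectors)
    if not 0 <= num_malicious < k / 2:
        raise ValueError("expected a minority of malicious clients")
    if num_malicious == 0:
        return set()
    # Pairwise Euclidean distances between client vectors.
    dists = np.linalg.norm(vectors[:, None, :] - vectors[None, :, :], axis=-1)
    # Average over as many neighbours as there are other benign clients,
    # so colluding attackers cannot hide by staying close to each other.
    n_near = k - num_malicious - 1
    nearest = np.sort(dists, axis=1)[:, 1:1 + n_near]  # column 0 is self
    outlier_score = nearest.mean(axis=1)
    flagged = np.argsort(outlier_score)[-num_malicious:]
    return set(int(i) for i in flagged)

def conformal_quantile(scores, alpha):
    """Empirical quantile at rank ceil((n + 1) * (1 - alpha))."""
    s = np.sort(np.asarray(scores, dtype=float))
    n = len(s)
    rank = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    return float(s[rank - 1])

def robust_calibration(client_scores, num_malicious, alpha=0.1):
    """Steps 2 and 3: drop flagged clients, then calibrate on the rest."""
    vectors = [score_vector(s) for s in client_scores]
    flagged = detect_malicious(vectors, num_malicious)
    kept = np.concatenate(
        [s for i, s in enumerate(client_scores) if i not in flagged]
    )
    return conformal_quantile(kept, alpha), flagged

# Synthetic demo: 8 benign clients and 2 attackers reporting inflated scores.
rng = np.random.default_rng(0)
benign = [rng.beta(2, 5, size=200) for _ in range(8)]
malicious = [np.full(200, 0.999) for _ in range(2)]
clients = benign + malicious

q_robust, flagged = robust_calibration(clients, num_malicious=2)
q_naive = conformal_quantile(np.concatenate(clients), alpha=0.1)
print(sorted(flagged), q_robust, q_naive)
```

In this synthetic setup the attackers push the unfiltered quantile toward 1, which would make every prediction set nearly as large as the full label space. Filtering before calibration keeps the threshold at the level implied by the benign scores.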
### Theoretical Analysis

- **Coverage Guarantee**: The paper provides a theoretical coverage bound for Rob-FCP in the Byzantine setting. When malicious clients are outnumbered by benign clients and the benign clients hold sufficiently many samples, the coverage of Rob-FCP approaches the desired coverage level.
- **Estimation of Malicious Client Count**: A malicious client number estimator is proposed for the harder setting in which the number of malicious clients is unknown to the defender.

### Experimental Validation

- **Experimental Results**: The paper runs experiments on standard benchmark datasets and real-world healthcare datasets to validate the robustness and effectiveness of Rob-FCP under various proportions of malicious clients and different Byzantine attacks. The results show that Rob-FCP maintains high prediction coverage and efficiency even in the presence of malicious clients.

### Main Contributions

1. Proposes the first robust federated conformal prediction framework (Rob-FCP) for the Byzantine setting.
2. Designs an effective method for detecting malicious clients and an estimator for the number of malicious clients.
3. Provides theoretical coverage guarantees and analyzes the accuracy of the malicious client number estimator.
4. Conducts experiments on multiple datasets to demonstrate the robustness and effectiveness of Rob-FCP.

Through these contributions, the paper provides a theoretical and practical foundation for improving the reliability and robustness of model predictions in distributed, privacy-preserving federated learning environments.
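To make the coverage property from the theoretical analysis concrete, the sketch below measures empirical coverage on synthetic scores. It is an illustration under assumptions, not an experiment from the paper: scores are drawn from a Beta distribution, a test point counts as covered when its score is at or below the calibrated threshold, and the attack is a hypothetical one in which attackers inject zero-valued scores to pull the threshold down. The helper names are made up for this example.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Empirical quantile at rank ceil((n + 1) * (1 - alpha))."""
    s = np.sort(np.asarray(cal_scores, dtype=float))
    n = len(s)
    rank = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    return float(s[rank - 1])

def empirical_coverage(test_scores, threshold):
    """Fraction of test points whose true-label score is within the threshold."""
    return float(np.mean(np.asarray(test_scores) <= threshold))

rng = np.random.default_rng(0)
alpha = 0.1  # desired coverage level is 1 - alpha = 0.9

# Benign calibration and test scores come from the same distribution,
# so they are exchangeable (synthetic data, chosen for illustration).
cal = rng.beta(2, 5, size=2000)
test = rng.beta(2, 5, size=5000)

# Clean calibration: coverage should land close to 0.9.
clean_cov = empirical_coverage(test, conformal_threshold(cal, alpha))

# Hypothetical attack: a minority of reported scores are zeros,
# which lowers the threshold and shrinks the prediction sets.
poisoned = np.concatenate([cal, np.zeros(1300)])
poisoned_cov = empirical_coverage(test, conformal_threshold(poisoned, alpha))

print(f"clean coverage:    {clean_cov:.3f}")
print(f"poisoned coverage: {poisoned_cov:.3f}")
```

With clean calibration data the measured coverage sits near the target, while the contaminated calibration set produces a visibly lower value. That gap is what a robust calibration procedure is meant to close.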