Federated Bayesian Network Ensembles

Florian van Daalen, Lianne Ippel, Andre Dekker, Inigo Bermejo
DOI: https://doi.org/10.1109/FMEC59375.2023.10306230
2024-02-19
Abstract: Federated learning allows us to run machine learning algorithms on decentralized data when data sharing is not permitted due to privacy concerns. Ensemble-based learning works by training multiple (weak) classifiers whose output is aggregated. Federated ensembles are ensembles applied to a federated setting, where each classifier in the ensemble is trained on one data location. In this article, we explore the use of federated ensembles of Bayesian networks (FBNE) in a range of experiments and compare their performance with locally trained models and models trained with VertiBayes, a federated learning algorithm to train Bayesian networks from decentralized data. Our results show that FBNE outperforms local models and provides a significant increase in training speed compared with VertiBayes while maintaining a similar performance in most settings, among other advantages. We show that FBNE is a potentially useful tool within the federated learning toolbox, especially when local populations are heavily biased, or there is a strong imbalance in population size across parties. We discuss the advantages and disadvantages of this approach in terms of time complexity, model accuracy, privacy protection, and model interpretability.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper explores the application of Federated Bayesian Network Ensembles (FBNE) in a range of experiments and compares their performance with locally trained models and with Bayesian networks trained using the VertiBayes algorithm. Specifically, the paper addresses the following issues:

1. **Handling Non-IID Data**: Traditional federated learning methods assume that data across different participants is independent and identically distributed (IID). However, in reality, data from different hospitals or institutions may be biased. FBNE captures these local biases by combining multiple locally trained Bayesian networks.
2. **Improving Training Speed**: FBNE trains significantly faster than VertiBayes while maintaining similar performance in most settings. This is particularly useful for applications that require rapid modeling.
3. **Privacy Protection**: FBNE offers certain advantages in terms of privacy protection because it disperses information across multiple networks, making it difficult to infer relationships between all attributes from a single network.
4. **Applicability Assessment**: The paper evaluates the performance of FBNE on different datasets through a series of experiments and discusses its advantages and disadvantages relative to VertiBayes. The results show that in certain specific cases, FBNE outperforms VertiBayes.

Overall, the paper aims to explore the potential of FBNE as an effective federated learning tool, especially in scenarios where local populations are heavily biased or population sizes differ significantly across participants.
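
The federated-ensemble idea described above (one classifier trained per data location, with outputs aggregated) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes binary features and labels, uses a naive Bayes classifier as a stand-in for a locally trained Bayesian network, assumes every party holds the same columns, and aggregates by a population-size-weighted average of predicted probabilities. The function names `fit_local`, `predict_proba`, and `ensemble_proba`, and the toy data, are hypothetical.

```python
def fit_local(rows, labels, alpha=1.0):
    """Train a binary naive Bayes model on one party's local data.

    Assumption: naive Bayes stands in for a locally learned Bayesian network.
    Laplace smoothing (alpha) avoids zero probabilities on small samples.
    """
    n_features = len(rows[0])
    model = {"n": len(rows), "prior": {}, "cond": {}}
    for c in (0, 1):
        subset = [r for r, y in zip(rows, labels) if y == c]
        model["prior"][c] = (len(subset) + alpha) / (len(rows) + 2 * alpha)
        # cond[c][j] estimates P(feature j = 1 | class = c)
        model["cond"][c] = [
            (sum(r[j] for r in subset) + alpha) / (len(subset) + 2 * alpha)
            for j in range(n_features)
        ]
    return model


def predict_proba(model, row):
    """Return P(class = 1 | row) under a single local model."""
    joint = {}
    for c in (0, 1):
        p = model["prior"][c]
        for j, x in enumerate(row):
            theta = model["cond"][c][j]
            p *= theta if x == 1 else 1.0 - theta
        joint[c] = p
    return joint[1] / (joint[0] + joint[1])


def ensemble_proba(models, row, weighted=True):
    """Aggregate local predictions; only the models leave each party, not the data.

    Assumption: weighting by local sample size is one possible aggregation rule.
    """
    weights = [m["n"] if weighted else 1.0 for m in models]
    total = sum(weights)
    return sum(w * predict_proba(m, row) for w, m in zip(weights, models)) / total


# Toy data for two hypothetical parties of unequal size.
party_a = ([(1, 1), (1, 0), (0, 0), (0, 1)], [1, 1, 0, 0])
party_b = ([(1, 1), (0, 0)], [1, 0])

model_a = fit_local(*party_a)
model_b = fit_local(*party_b)
models = [model_a, model_b]

p_pos = ensemble_proba(models, (1, 1))
p_neg = ensemble_proba(models, (0, 0))
print(f"P(y=1 | x=(1,1)) = {p_pos:.3f}")
print(f"P(y=1 | x=(0,0)) = {p_neg:.3f}")
```

Each local model is trained independently, which is consistent with the training-speed advantage the summary attributes to FBNE. Passing `weighted=False` gives a plain average, which may behave differently when population sizes are strongly imbalanced.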