Abstract: Random forests are a cornerstone of machine learning, valued for their robustness and versatility. Despite these strengths, their conventional centralized training is ill-suited to modern data, which is often distributed, sensitive, and subject to privacy constraints. Federated learning (FL) offers a compelling solution, enabling a model to be trained across a group of clients while each client's data remains private. However, adapting tree-based methods such as random forests to federated settings introduces significant challenges, particularly when data are non-identically distributed (non-IID) across clients, a common scenario in real-world applications. This paper presents a federated random forest approach that employs a novel ensemble construction method aimed at improving performance under non-IID data. Instead of growing trees independently at each client, our approach grows each decision tree in the ensemble iteratively and collectively across clients. To preserve the privacy of each client's data, we confine the information stored in a leaf node to the majority class label of the client's local samples that reach that node. This limited disclosure keeps each client's underlying data distribution confidential, thereby enhancing the privacy of the federated learning process. Furthermore, our collaborative ensemble construction strategy allows the ensemble to better reflect the heterogeneity of the data across clients, improving its performance on non-IID data, as our experimental results confirm.
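The leaf-level disclosure described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that each client reports only the majority class label of its local samples reaching a given leaf, and that the leaf label is then chosen by a plain vote over those reports; the voting rule, the tie-breaking rule, and all function and client names are hypothetical.

```python
from collections import Counter

def local_majority_label(labels):
    """Return the most common label in `labels`, or None if it is empty.

    Ties are broken by picking the smallest label (an assumed rule).
    """
    if not labels:
        return None
    counts = Counter(labels)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)

def aggregate_leaf_label(client_votes):
    """Combine one majority label per client by plain voting (assumed rule).

    Clients with no samples at the leaf report None and are ignored.
    """
    votes = [v for v in client_votes if v is not None]
    return local_majority_label(votes)

# Hypothetical labels of the local samples that reach one leaf at each client.
client_leaf_labels = {
    "client_a": [0, 0, 1],
    "client_b": [1, 1, 1, 0],
    "client_c": [],          # no local samples reach this leaf
    "client_d": [1, 0, 1],
}

# Each client discloses a single label, never its class counts.
votes = [local_majority_label(y) for y in client_leaf_labels.values()]
leaf_label = aggregate_leaf_label(votes)
print(votes, leaf_label)
```

In this sketch the server sees one label per client per leaf rather than per-class sample counts, which is the sense in which the disclosure is limited. How the actual method combines client labels and selects splits is not specified in the abstract.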

Lightweight Privacy-Preserving Federated Incremental Decision Trees

Privacy Preserving Vertical Federated Learning for Tree-based Models

Federated Extra-Trees with Privacy Preserving

PriVDT: An Efficient Two-Party Cryptographic Framework for Vertical Decision Trees

Fed-EINI: An Efficient and Interpretable Inference Framework for Decision Tree Ensembles in Federated Learning

Federated Boosted Decision Trees with Differential Privacy

Decision Tree-Based Federated Learning: A Survey

FDPBoost: Federated differential privacy gradient boosting decision trees

FederBoost: Private Federated Learning for GBDT

Performance-Enhanced Federated Learning with Differential Privacy for Internet of Things

Efficient Encrypted Inference on Ensembles of Decision Trees

Eliminating Label Leakage in Tree-Based Vertical Federated Learning

Effective and Efficient Federated Tree Learning on Hybrid Data

Efficient and Privacy-Preserving Tree-Based Inference via Additive Homomorphic Encryption

A collaborative ensemble construction method for federated random forest

SGBoost: An Efficient and Privacy-Preserving Vertical Federated Tree Boosting Framework

OpBoost

FedRec: Privacy-Preserving News Recommendation with Federated Learning

Efficient-FedRec: Efficient Federated Learning Framework for Privacy-Preserving News Recommendation

An Efficient and Robust System for Vertically Federated Random Forest

Federated Transfer Learning with Differential Privacy