Abstract: Privacy protection has attracted increasing attention, and privacy concerns often prevent flexible data utilization. In most industries, data are distributed across multiple organizations and cannot be pooled because of these concerns. Federated learning (FL), which enables cross-organizational machine learning by communicating statistical information rather than raw data, is a state-of-the-art technology for solving this problem. However, for gradient boosting decision trees (GBDT) in FL, balancing communication efficiency and security while maintaining sufficient accuracy remains an unresolved problem. In this paper, we propose an FL scheme for GBDT, efficient FL for GBDT (eFL-Boost), which minimizes accuracy loss, communication costs, and information leakage. The proposed scheme focuses on the appropriate allocation of local computation (performed individually by each organization) and global computation (performed cooperatively by all organizations) when updating a model. Determining tree structures globally is known to incur high communication costs, whereas computing leaf weights globally does not, and leaf weights are expected to contribute relatively more to accuracy. Thus, in eFL-Boost, a tree structure is determined locally at one of the organizations, and leaf weights are calculated globally by aggregating the local gradients of all organizations. Specifically, eFL-Boost requires only three communications per update, and only statistical information with low privacy risk is disclosed to other organizations. In a performance evaluation on public data sets (using ROC AUC, log loss, and F1-score as metrics), eFL-Boost outperformed existing schemes that incur low communication costs and achieved performance comparable to that of a scheme that offers no privacy protection.
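The split between local tree construction and global leaf-weight computation can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard second-order GBDT leaf weight w_j = -G_j / (H_j + lambda) with log loss, a depth-1 tree, and plain (unencrypted) summation. The function names, the toy data, and the `reg_lambda` parameter are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_index(x, stump):
    """Route one sample through a depth-1 tree given as (feature, threshold)."""
    feature, threshold = stump
    return 0 if x[feature] <= threshold else 1

def local_leaf_stats(X, y, margins, stump, n_leaves=2):
    """Per-leaf sums of log-loss gradients and Hessians on ONE organization's data.

    Only these per-leaf sums leave the organization, never the raw rows.
    """
    G = [0.0] * n_leaves
    H = [0.0] * n_leaves
    for x, label, margin in zip(X, y, margins):
        p = sigmoid(margin)
        j = leaf_index(x, stump)
        G[j] += p - label          # first-order gradient of log loss
        H[j] += p * (1.0 - p)      # second-order gradient of log loss
    return G, H

def global_leaf_weights(all_stats, reg_lambda=1.0):
    """Aggregate the per-organization sums and compute one weight per leaf."""
    n_leaves = len(all_stats[0][0])
    G = [sum(stats[0][j] for stats in all_stats) for j in range(n_leaves)]
    H = [sum(stats[1][j] for stats in all_stats) for j in range(n_leaves)]
    return [-g / (h + reg_lambda) for g, h in zip(G, H)]

# Assumption: one organization built this stump from its own data only.
stump = (0, 0.5)

# Hypothetical toy data held by two organizations (same feature space).
org_a = ([[0.2], [0.8]], [0, 1])
org_b = ([[0.1], [0.9], [0.7]], [0, 1, 1])

all_stats = []
for X, y in (org_a, org_b):
    margins = [0.0] * len(y)   # current model output before this update
    all_stats.append(local_leaf_stats(X, y, margins, stump))

weights = global_leaf_weights(all_stats)

# Reference: the same weights computed on pooled data (no federation).
pooled_X = org_a[0] + org_b[0]
pooled_y = org_a[1] + org_b[1]
pooled = global_leaf_weights(
    [local_leaf_stats(pooled_X, pooled_y, [0.0] * len(pooled_y), stump)]
)
```

Because the per-leaf sums are additive, the aggregated weights match those computed on pooled data for a fixed tree structure. Under these assumptions, any accuracy difference from a centralized model would come from the locally chosen tree structure rather than from the leaf weights.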

eFL-Boost: Efficient Federated Learning for Gradient Boosting Decision Trees
