Abstract: User-generated data is crucial to predictive modeling in many applications. With a web, mobile, or wearable interface, a data owner can continuously record data generated by distributed users and build predictive models from that data to improve its operations, services, and revenue. Because users' data is large and constantly evolving, a data owner may rely on public cloud service providers (Cloud) for scalable storage and computation. However, exposing sensitive user-generated data and advanced analytic models to Cloud raises privacy concerns. We present a confidential learning framework, SecureBoost, for data owners that want to learn predictive models from aggregated user-generated data while offloading the storage and computational burden to Cloud, without having to worry about protecting the sensitive data. SecureBoost allows users to submit encrypted or randomly masked data directly to the designated Cloud. Our framework uses random linear classifiers (RLCs) as the base classifiers in the boosting framework, which dramatically simplifies the design of the proposed confidential protocols while still preserving model quality. A Cryptographic Service Provider (CSP) assists Cloud's processing, reducing the complexity of the protocol constructions. We present two constructions of SecureBoost, HE+GC and SecSh+GC, which combine homomorphic encryption, garbled circuits, and random masking to achieve both security and efficiency. For a boosted model, Cloud learns only the RLCs and the CSP learns only the weights of the RLCs; the data owner then collects the two parts to obtain the complete model. We conduct extensive experiments to understand the quality of RLC-based boosting and the cost distribution of the constructions. Our results show that SecureBoost can efficiently learn high-quality boosting models from protected user-generated data.
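
To make the idea of boosting over random linear classifiers concrete, the following is a minimal plaintext sketch, not the confidential protocol itself. It assumes an AdaBoost-style weight update, labels in {-1, +1}, and a "best of several random candidates" selection rule per round; the abstract does not specify any of these, and all function names (`draw_rlc`, `boost_with_rlcs`, `ensemble_predict`) are hypothetical. The model is returned as two separate lists to mirror the split described above, where one party holds the RLCs and the other holds their weights.

```python
import math
import random


def draw_rlc(dim, rng):
    """Draw a random hyperplane (w, b); no training is involved."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    b = rng.uniform(-1.0, 1.0)
    return w, b


def rlc_predict(clf, x):
    """Label a point +1 or -1 by the side of the hyperplane it falls on."""
    w, b = clf
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def weighted_error(clf, X, y, sample_w):
    """Total weight of the examples that clf misclassifies."""
    return sum(sw for sw, xi, yi in zip(sample_w, X, y)
               if rlc_predict(clf, xi) != yi)


def boost_with_rlcs(X, y, rounds=40, candidates=20, seed=0):
    """AdaBoost-style loop (an assumption) with random base classifiers."""
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    sample_w = [1.0 / n] * n          # start with uniform example weights
    classifiers, alphas = [], []      # the two halves of the final model
    for _ in range(rounds):
        best, best_err = None, float("inf")
        for _ in range(candidates):
            clf = draw_rlc(dim, rng)
            err = weighted_error(clf, X, y, sample_w)
            if err > 0.5:
                # A hyperplane worse than chance becomes useful when flipped.
                clf = ([-wi for wi in clf[0]], -clf[1])
                err = weighted_error(clf, X, y, sample_w)
            if err < best_err:
                best, best_err = clf, err
        # Clamp the error so the logarithm below stays finite.
        err = min(max(best_err, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Increase the weight of misclassified examples, then renormalize.
        sample_w = [sw * math.exp(-alpha * yi * rlc_predict(best, xi))
                    for sw, xi, yi in zip(sample_w, X, y)]
        total = sum(sample_w)
        sample_w = [sw / total for sw in sample_w]
        classifiers.append(best)
        alphas.append(alpha)
    return classifiers, alphas


def ensemble_predict(classifiers, alphas, x):
    """Weighted vote of all RLCs; needs both halves of the model."""
    score = sum(a * rlc_predict(c, x) for c, a in zip(classifiers, alphas))
    return 1 if score >= 0 else -1
```

Because each base classifier is only drawn at random and scored, a round needs nothing beyond linear scores, sign comparisons, and a weighted error sum. This is one plausible reading of why RLCs simplify the confidential protocols, though the sketch does not model encryption, masking, or the CSP.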

SecureBoost: A Lossless Federated Learning Framework

A Hybrid-Domain Framework for Secure Gradient Tree Boosting

SGBoost: An Efficient and Privacy-Preserving Vertical Federated Tree Boosting Framework

SecureBoost+: Large Scale and High-Performance Vertical Federated Gradient Boosting Decision Tree

OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization

FDPBoost: Federated differential privacy gradient boosting decision trees

SecureBoost Hyperparameter Tuning via Multi-Objective Federated Learning

eFL-Boost: Efficient Federated Learning for Gradient Boosting Decision Trees

FederBoost: Private Federated Learning for GBDT

A Secure Federated Transfer Learning Framework

Gradient-less Federated Gradient Boosting Trees with Learnable Learning Rates

Federated Extra-Trees with Privacy Preserving

VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning

Privacy-Preserving Boosting in the Local Setting

Confidential Boosting with Random Linear Classifiers for Outsourced User-Generated Data

Hyperparameter Optimization for SecureBoost via Constrained Multi-Objective Federated Learning

FedGBF: An efficient vertical federated learning framework via gradient boosting and bagging

An Efficient and Robust System for Vertically Federated Random Forest

Privet: A Privacy-Preserving Vertical Federated Learning Service for Gradient Boosted Decision Tables