Abstract: Decision tree (DT) is a widely used machine learning model due to its versatility, speed, and interpretability. However, for privacy-sensitive applications, outsourcing DT training and inference to cloud platforms raises concerns about data privacy. Researchers have developed privacy-preserving approaches for DT training and inference using cryptographic primitives, such as Secure Multi-Party Computation (MPC). While these approaches have shown progress, they still suffer from heavy computation and communication overheads. A few recent works employ Graphics Processing Units (GPUs) to improve the performance of MPC-protected deep learning. This raises a natural question: \textit{can MPC-protected DT training and inference be accelerated by GPUs?} We present GTree, the first scheme that uses GPUs to accelerate MPC-protected secure DT training and inference. GTree is built across 3 parties who securely and jointly perform each step of DT training and inference on GPUs. Each MPC protocol in GTree is designed to be GPU-friendly. The performance evaluation shows that GTree achieves ${\thicksim}11{\times}$ and ${\thicksim}21{\times}$ improvements in training on the SPECT and Adult datasets, respectively, compared to the most efficient prior CPU-based work. For inference, GTree is most efficient when the DT has fewer than 10 levels: it is $126\times$ faster than the most efficient prior work when inferring $10^4$ instances with a tree of 7 levels. GTree also achieves a stronger security guarantee than prior solutions: it leaks only the tree depth and the size of the data samples, whereas prior solutions also leak the tree structure. With \textit{oblivious array access}, the access pattern on the GPU is also protected.
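
To make the idea of oblivious array access among 3 parties more concrete, the following is a minimal sketch of one generic way such a read can be realized: the index is encoded as a one-hot vector, the vector is additively secret-shared, and each party takes a dot product over the whole array. This is not GTree's actual protocol. The ring size `MOD`, the function names, and the simplification that the array itself is public are all assumptions made for illustration only.

```python
import secrets

MOD = 2 ** 32  # assumed ring size for this sketch, not taken from the paper


def share(x, parties=3):
    """Split x into additive shares that sum to x modulo MOD."""
    parts = [secrets.randbelow(MOD) for _ in range(parties - 1)]
    parts.append((x - sum(parts)) % MOD)
    return parts


def reconstruct(parts):
    """Recombine additive shares into the original value."""
    return sum(parts) % MOD


def share_one_hot(index, length, parties=3):
    """Secret-share a one-hot encoding of index; returns one vector per party."""
    per_entry = [share(1 if i == index else 0, parties) for i in range(length)]
    # Transpose so that vectors[p][i] is party p's share of entry i.
    return [[per_entry[i][p] for i in range(length)] for p in range(parties)]


def local_dot(array, one_hot_share):
    """Local work of one party: every slot is touched, whatever the index is."""
    return sum(a * s for a, s in zip(array, one_hot_share)) % MOD


def oblivious_read(array, index, parties=3):
    """Read array[index] without any party's memory accesses depending on index."""
    vectors = share_one_hot(index, len(array), parties)
    partials = [local_dot(array, v) for v in vectors]
    return reconstruct(partials)


if __name__ == "__main__":
    data = [5, 17, 42, 8]  # hypothetical public array (values must be < MOD)
    print(oblivious_read(data, 2))
```

Because each party computes the same full-length dot product regardless of the index, the sequence of memory accesses reveals nothing about which element was read, and a dot product maps naturally onto GPU hardware. A scheme in which the array is also secret-shared would additionally need a secure multiplication protocol, which this sketch omits.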

Efficient Gradient Boosted Decision Tree Training on GPUs

HarpGBDT: Optimizing Gradient Boosting Decision Tree for Parallel Efficiency

GPU-acceleration for Large-scale Tree Boosting

Benchmarking and Optimization of Gradient Boosting Decision Tree Algorithms

XGBoost: Scalable GPU Accelerated Learning

Challenges and Opportunities of Building Fast GBDT Systems

Gradient Boosting With Piece-Wise Linear Regression Trees

Quantized Training of Gradient Boosting Decision Trees

Poster: gbdt-rs: Fast and Trustworthy Gradient Boosting Decision Tree

Parallel L-BFGS-B Algorithm on GPU

Parallel Training GBRT Based on KMeans Histogram Approximation for Big Data

Out-of-Core GPU Gradient Boosting

Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units

DimBoost

TencentBoost: A Gradient Boosting Tree System with Parameter Server

An experimental evaluation of large scale GBDT systems

SecureBoost+: Large Scale and High-Performance Vertical Federated Gradient Boosting Decision Tree

GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference

Unbiased Gradient Boosting Decision Tree with Unbiased Feature Importance

MT-GBM: A Multi-Task Gradient Boosting Machine with Shared Decision Trees

FederBoost: Private Federated Learning for GBDT