Abstract: Gradient-boosting decision forests, as used by algorithms such as XGBoost or AdaBoost, offer higher accuracy and lower training times for large datasets than decision trees. Protocols for private inference over decision trees can be used to preserve the privacy of the input data as well as the privacy of the trees. However, naively extending private inference over decision trees to private inference over decision forests by replicating the protocols leads to impractical running times. In this paper, we explore extending the private decision inference protocol using homomorphic encryption by Mahdavi et al. (CCS 2023) to decision forests. We present several optimizations that identify and then remove (approximate) duplication between the trees in a forest and hence achieve significant improvements in communication and computation cost over the naive approach. To the best of our knowledge, we present the first private inference protocol for highly scalable gradient-boosting decision forests. Our optimizations extend beyond Mahdavi et al.'s protocol to various private inference protocols for gradient-boosting decision trees. Our protocol's inference time is faster than the baseline of running the protocol by Mahdavi et al. in parallel by up to 28.1x, and faster than Zama's Concrete ML XGBoost by up to 122.25x.

### What problems does this paper attempt to solve?

This paper addresses the efficiency and performance problems that arise when performing private inference over gradient-boosting decision forests (GBDTs). Specifically, it targets two main problems:

1. **Inefficiency when existing protocols are extended to GBDTs**:
   - A gradient-boosting decision forest consists of multiple decision trees whose results must be combined into a final prediction. Existing private inference protocols are mainly designed for a single decision tree. Directly extending them to GBDTs leads to impractical running times because computation and communication costs increase significantly.
   - When homomorphic encryption (HE) is used for private inference, combining the results of multiple decision trees increases the multiplicative depth of the circuit. This requires larger parameters and substantially raises resource requirements.
2. **Optimizing the inference process of multiple trees**:
   - Even without combining the results of multiple trees, simply running private inference over each decision tree independently is too expensive for currently available computing resources. Optimizations are therefore needed to reduce the cost of evaluating many decision trees on the same input.

To solve these problems, the paper proposes a new protocol, SilentWood. SilentWood achieves significant performance improvements through the following methods:

- **Blind Code Conversion Protocol**: Converts indicator bits between different encodings during inference, so that homomorphically encrypted data can be aggregated correctly.
- **Identifying and removing duplicates between trees in the forest**: Discovering and removing (approximately) duplicated parts shared by trees in the forest reduces the overall computation and communication costs.
- **Optimizing the reuse of input data**: Identifying and removing duplicates in the input to the decision forest significantly reduces communication volume and also lowers computation cost.

With these optimizations, SilentWood's inference time is up to 28.1 times faster than the baseline (running the protocol by Mahdavi et al. once per tree in parallel) and up to 122.25 times faster than Zama's Concrete ML XGBoost.

### Summary

The main contribution of this paper is an efficient and scalable method for private inference over GBDTs. It addresses the scalability and efficiency shortcomings of existing methods, making privacy-preserving machine-learning inference more practical.
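The duplicate-removal idea can be illustrated in plaintext. The following Python sketch is not the SilentWood protocol: it uses no encryption, handles only exact duplicates (the paper also targets approximate ones), and every name in it (`Leaf`, `Node`, `forest_predict`, and so on) is hypothetical. It assumes a forest whose prediction is the sum of the leaf values reached in each tree. It shows that trees in a forest often test the same (feature, threshold) pair, so each distinct comparison needs to be evaluated only once per input.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    value: float


@dataclass(frozen=True)
class Node:
    feature: int      # index into the input vector
    threshold: float  # go left when x[feature] < threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def collect_comparisons(tree, out):
    """Add every (feature, threshold) pair used by the tree to the set `out`."""
    if isinstance(tree, Node):
        out.add((tree.feature, tree.threshold))
        collect_comparisons(tree.left, out)
        collect_comparisons(tree.right, out)
    return out


def count_nodes(tree):
    """Count the internal (comparison) nodes of a tree."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


def evaluate(tree, bits):
    """Walk one tree using precomputed comparison results."""
    while isinstance(tree, Node):
        tree = tree.left if bits[(tree.feature, tree.threshold)] else tree.right
    return tree.value


def forest_predict(forest, x):
    """Return (score, distinct comparisons, total comparison nodes)."""
    unique = set()
    for tree in forest:
        collect_comparisons(tree, unique)
    # Each distinct comparison is evaluated once and shared by all trees.
    bits = {(f, th): x[f] < th for (f, th) in unique}
    score = sum(evaluate(tree, bits) for tree in forest)
    total = sum(count_nodes(tree) for tree in forest)
    return score, len(unique), total


# Toy forest (made-up values): three trees reuse the same two comparisons.
forest = [
    Node(0, 5.0, Leaf(0.5), Node(1, 2.0, Leaf(-0.25), Leaf(1.0))),
    Node(0, 5.0, Node(1, 2.0, Leaf(0.25), Leaf(-0.5)), Leaf(0.75)),
    Node(1, 2.0, Leaf(0.125), Leaf(-0.125)),
]
x = [3.0, 4.0]
score, n_unique, n_total = forest_predict(forest, x)
print(score, n_unique, n_total)
```

In this toy forest the five comparison nodes collapse to two distinct comparisons. In an encrypted setting each comparison is costly, which is the intuition behind removing duplication across trees; how SilentWood does this under homomorphic encryption is described in the paper itself.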

SilentWood: Private Inference Over Gradient-Boosting Decision Forests

SoK: Modular and Efficient Private Decision Tree Evaluation

A Hybrid-Domain Framework for Secure Gradient Tree Boosting

Securely and Efficiently Outsourcing Decision Tree Inference

Differentially Private Greedy Decision Forest

FDPBoost: Federated differential privacy gradient boosting decision trees

Federated Boosted Decision Trees with Differential Privacy

OnePath: Efficient and Privacy-Preserving Decision Tree Inference in the Cloud

Efficient and Privacy-Preserving Tree-Based Inference via Additive Homomorphic Encryption

eFL-Boost: Efficient Federated Learning for Gradient Boosting Decision Trees

Efficient Encrypted Inference on Ensembles of Decision Trees

Poster: gbdt-rs: Fast and Trustworthy Gradient Boosting Decision Tree

Privet: A Privacy-Preserving Vertical Federated Learning Service for Gradient Boosted Decision Tables

Level Up: Private Non-Interactive Decision Tree Evaluation using Levelled Homomorphic Encryption

PriVDT: An Efficient Two-Party Cryptographic Framework for Vertical Decision Trees

Secure Collaborative Training and Inference for XGBoost

A(DP)$^2$SGD: Asynchronous Decentralized Parallel Stochastic Gradient Descent with Differential Privacy

SecureBoost+: Large Scale and High-Performance Vertical Federated Gradient Boosting Decision Tree

Vectorized Secure Evaluation of Decision Forests

FederBoost: Private Federated Learning for GBDT

GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference