Abstract: As machine learning (ML) permeates fields like healthcare, facial recognition, and blockchain, the need to protect sensitive data intensifies. Fully Homomorphic Encryption (FHE) allows inference on encrypted data, preserving the privacy of both the data and the ML model. However, it slows down inference by up to five orders of magnitude compared with the non-secure version; the root cause is the replacement of non-polynomial operators (ReLU and MaxPooling) with high-degree Polynomial Approximated Functions (PAFs). We propose SmartPAF, a framework that replaces non-polynomial operators with low-degree PAFs and then recovers the accuracy of the PAF-approximated model through four techniques: (1) Coefficient Tuning (CT) -- adjust PAF coefficients based on the input distributions before training, (2) Progressive Approximation (PA) -- progressively replace one non-polynomial operator at a time, followed by fine-tuning, (3) Alternate Training (AT) -- alternate the training between PAFs and other linear operators in a decoupled manner, and (4) Dynamic Scale (DS) / Static Scale (SS) -- dynamically scale the PAF input value to within (-1, 1) in training, and fix the scale as the running max value in FHE deployment. The synergistic effect of CT, PA, AT, and DS/SS enables SmartPAF to enhance the accuracy of various models approximated by PAFs of various low degrees on multiple datasets. For ResNet-18 on ImageNet-1k, the Pareto frontier identified by SmartPAF in the latency-accuracy tradeoff space achieves 1.42x ~ 13.64x accuracy improvement and 6.79x ~ 14.9x speedup over prior works. Further, SmartPAF enables a 14-degree PAF (f_1^2 ∘ g_1^2) to achieve a 7.81x speedup compared to the 27-degree PAF obtained by minimax approximation, with the same 69.4% post-replacement accuracy. Our code is available at


### What problem does this paper attempt to solve?

This paper addresses the performance bottleneck of machine learning (ML) inference under fully homomorphic encryption (FHE). When FHE is used to encrypt sensitive data and run ML model inference on it, non-polynomial operators (such as ReLU and MaxPooling) cause significant latency, making inference up to five orders of magnitude slower than the non-secure version. To improve inference speed while maintaining high accuracy, the authors propose a new framework called SMART-PAF, which replaces these non-polynomial operators with low-degree polynomial approximated functions (PAFs).

#### Main challenges

1. **Processing of non-polynomial operators**:
   - FHE cannot evaluate non-polynomial operators directly, so they must be replaced with polynomial approximated functions.
   - High-degree PAFs improve accuracy but introduce many multiplications, which increases latency.
   - Low-degree PAFs reduce latency but may reduce accuracy.
2. **Limitations of existing methods**:
   - Hybrid schemes offload non-polynomial operators to other secure schemes, which incurs excessive communication overhead.
   - Existing PAF approximation methods have difficulty converging with high-degree PAFs, and their low-degree PAFs perform poorly on complex tasks.

#### Solutions

To overcome these challenges, the authors propose the SMART-PAF framework, which includes four key techniques:

1. **Coefficient Tuning (CT)**:
   - Adjust the PAF coefficients according to the input distribution to reduce approximation error and improve model accuracy.
2. **Progressive Approximation (PA)**:
   - Replace non-polynomial operators one at a time and fine-tune after each replacement so that training converges.
3. **Alternate Training (AT)**:
   - Train the PAF coefficients and the parameters of the other linear layers separately to avoid mutual interference during training and to improve convergence speed and accuracy.
4. **Dynamic Scaling / Static Scaling (DS / SS)**:
   - Dynamically scale the PAF input values into the range (-1, 1) during training to improve approximation accuracy; fix the scaling factor during deployment to meet the requirements of FHE.

Through these techniques, SMART-PAF achieves higher accuracy and faster inference across multiple datasets and models. For example, on ResNet-18, a 14-degree PAF trained with SMART-PAF reaches the same 69.4% post-replacement accuracy as the 27-degree PAF while running 7.81 times faster.

### Summary

The main goal of this paper is to remove the performance bottleneck caused by non-polynomial operators in the FHE environment. The proposed SMART-PAF framework achieves fast and accurate private inference.
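To make the idea of a PAF-based ReLU with input scaling more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the sign function is approximated by repeatedly composing the degree-3 polynomial f_1(x) = (3x - x^3)/2, which is only one possible low-degree choice, and ReLU is rebuilt as x(1 + sign(x))/2. The names `paf_sign`, `paf_relu`, and `ScaledPafRelu`, the `depth` parameter, and the use of a per-batch maximum as the dynamic scale are all hypothetical.

```python
def f1(x):
    """Degree-3 polynomial that pushes values in [-1, 1] toward -1 or +1."""
    return (3.0 * x - x ** 3) / 2.0


def paf_sign(x, depth=4):
    """Approximate sign(x) for x in [-1, 1] by composing f1 `depth` times."""
    for _ in range(depth):
        x = f1(x)
    return x


def paf_relu(x, scale, depth=4):
    """ReLU(x) = x * (1 + sign(x)) / 2, with sign replaced by a polynomial.

    `scale` maps the input into [-1, 1], where the approximation is valid.
    """
    return x * (1.0 + paf_sign(x / scale, depth)) / 2.0


class ScaledPafRelu:
    """Hypothetical layer: dynamic scale in training, static scale at deployment."""

    def __init__(self, depth=4):
        self.depth = depth
        self.running_max = 0.0  # largest |input| seen during training

    def train_forward(self, xs):
        # Dynamic scale (assumption): normalize by this batch's largest magnitude.
        batch_max = max(abs(v) for v in xs) or 1.0
        self.running_max = max(self.running_max, batch_max)
        return [paf_relu(v, batch_max, self.depth) for v in xs]

    def deploy_forward(self, xs):
        # Static scale: a fixed constant, since a data-dependent max cannot be
        # computed cheaply on encrypted inputs.
        scale = self.running_max or 1.0
        return [paf_relu(v, scale, self.depth) for v in xs]


if __name__ == "__main__":
    layer = ScaledPafRelu(depth=4)
    print(layer.train_forward([-2.0, 1.0, 4.0]))
    print(layer.deploy_forward([2.0, -3.0]))
```

In this sketch, inputs whose scaled magnitude is close to zero are approximated less accurately than inputs near the ends of the range. That is the kind of accuracy gap that tuning coefficients to the observed input distribution and fine-tuning after replacement are meant to close.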

Accurate Low-Degree Polynomial Approximation of Non-polynomial Operators for Fast Private Inference in Homomorphic Encryption

Projected Federated Averaging with Heterogeneous Differential Privacy.

Optimized Layerwise Approximation for Efficient Private Inference on Fully Homomorphic Encryption

CHEETAH: An Ultra-Fast, Approximation-Free, and Privacy-Preserved Neural Network Framework based on Joint Obscure Linear and Nonlinear Computations

Batch-oriented Element-wise Approximate Activation for Privacy-Preserving Neural Networks

MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference

Privacy-Preserving Machine Learning With Fully Homomorphic Encryption for Deep Neural Network

Optimized Privacy-Preserving CNN Inference With Fully Homomorphic Encryption

When approximate design for fast homomorphic computation provides differential privacy guarantees

A Simple Solution for Homomorphic Evaluation on Large Intervals

SHAPER: A General Architecture for Privacy-Preserving Primitives in Secure Machine Learning.

Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption

Towards Fast and Scalable Private Inference

Neural Networks with (Low-Precision) Polynomial Approximations: New Insights and Techniques for Accuracy Improvement

Chemokine Signatures in the Skin Disorders of Lyme Borreliosis in Europe: Predominance of CXCL9 and CXCL10 in Erythema Migrans and Acrodermatitis and CXCL13 in Lymphocytoma

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference

Blind Evaluation Framework for Fully Homomorphic Encryption and Privacy-Preserving Machine Learning

Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference

Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption

Privacy Preserving Inference for Deep Neural Networks: Optimizing Homomorphic Encryption for Efficient and Secure Classification

SHE: A Fast and Accurate Deep Neural Network for Encrypted Data