DCT-CryptoNets: Scaling Private Inference in the Frequency Domain

Arjun Roy, Kaushik Roy
2024-08-28
Abstract: The convergence of fully homomorphic encryption (FHE) and machine learning offers unprecedented opportunities for private inference of sensitive data. FHE enables computation directly on encrypted data, safeguarding the entire machine learning pipeline, including data and model confidentiality. However, existing FHE-based implementations for deep neural networks face significant challenges in computational cost, latency, and scalability, limiting their practical deployment. This paper introduces DCT-CryptoNets, a novel approach that leverages frequency-domain learning to tackle these issues. Our method operates directly in the frequency domain, utilizing the discrete cosine transform (DCT) commonly employed in JPEG compression. This approach is inherently compatible with remote computing services, where images are usually transmitted and stored in compressed formats. DCT-CryptoNets reduces the computational burden of homomorphic operations by focusing on perceptually relevant low-frequency components. This is demonstrated by substantial latency reduction of up to 5.3$\times$ compared to prior work on image classification tasks, including a novel demonstration of ImageNet inference within 2.5 hours, down from 12.5 hours compared to prior work on equivalent compute resources. Moreover, DCT-CryptoNets improves the reliability of encrypted accuracy by reducing variability (e.g., from $\pm$2.5\% to $\pm$1.0\% on ImageNet). This study demonstrates a promising avenue for achieving efficient and practical privacy-preserving deep learning on high-resolution images seen in real-world applications.
Cryptography and Security, Computer Vision and Pattern Recognition, Machine Learning
### What problem does this paper attempt to address?

This paper addresses the high computational cost, long latency, and poor scalability of fully homomorphic encryption (FHE) when applied to deep neural networks (DNNs). Specifically, existing FHE-based deep learning inference methods incur significant computational overhead and latency on large-scale datasets such as ImageNet, which limits their practical deployment.

#### Main problems include:

1. **High computational cost**: Traditional FHE schemes rely on polynomial approximations to implement non-linear activation functions (such as ReLU). These approximations introduce errors that accumulate from layer to layer, and the computational cost grows exponentially as network depth increases.
2. **Large latency**: Existing methods have excessively long inference times on large-scale image datasets. For example, the inference time of ResNet-18 on ImageNet is 2.5 days.
3. **Poor scalability**: Existing FHE schemes are difficult to scale to larger networks and higher-resolution images, because both introduce more convolution and non-linear activation operations, further increasing latency and computational burden.

### Solution:

To solve the above problems, this paper proposes **DCT-CryptoNets**, a new method based on the discrete cosine transform (DCT) that reduces the computational burden by learning in the frequency domain. Specific improvements include:

1. **Frequency-domain learning**: Use the DCT to convert the image into a frequency-domain representation and focus on the low-frequency components, avoiding computation on perceptually irrelevant high-frequency information.
2. **Reduce non-linear activation operations**: Reducing the ReLU operations in the early layers reduces noise accumulation in homomorphic encryption, which in turn reduces the need for homomorphic bootstrapping.
3. **Improve inference efficiency**: DCT-CryptoNets reduces ImageNet inference time from 12.5 hours to 2.5 hours, while improving the reliability of encrypted accuracy by reducing its variability (from ±2.5% to ±1.0%).

### Summary:

DCT-CryptoNets significantly improves the efficiency and scalability of FHE-based deep learning inference by optimizing the neural network structure in the frequency domain and reducing unnecessary computational operations, especially when dealing with high-resolution images and large-scale datasets.
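
The frequency-domain step described above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a single-channel image, a JPEG-style 8×8 block size, and a simple `u + v` ranking as a stand-in for zig-zag ordering. The function names (`dct_matrix`, `block_dct`, `keep_low_frequencies`) are hypothetical and exist only for this example. No encryption is involved; the sketch only shows how a block-wise DCT trades spatial resolution for frequency channels, and how keeping only low frequencies shrinks the input a network would have to process.

```python
import numpy as np

BLOCK = 8  # JPEG-style block size (an assumption for this sketch)


def dct_matrix(n: int = BLOCK) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # the DC row uses a different scale
    return m


def block_dct(image: np.ndarray, n: int = BLOCK) -> np.ndarray:
    """Apply a 2-D DCT to each non-overlapping n x n block.

    image: (H, W) array with H and W divisible by n.
    Returns an (H // n, W // n, n, n) array of coefficients.
    """
    h, w = image.shape
    d = dct_matrix(n)
    # Split the image into a grid of n x n blocks.
    blocks = image.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    # 2-D DCT of each block X is D @ X @ D.T (broadcast over the grid).
    return d @ blocks @ d.T


def keep_low_frequencies(coeffs: np.ndarray, keep: int) -> np.ndarray:
    """Flatten each block into channels and keep the `keep` lowest frequencies.

    Frequencies are ranked by u + v, a simple stand-in for zig-zag order.
    Returns an (H // n, W // n, keep) array.
    """
    n = coeffs.shape[-1]
    u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    order = np.argsort((u + v).ravel(), kind="stable")[:keep]
    channels = coeffs.reshape(coeffs.shape[0], coeffs.shape[1], n * n)
    return channels[..., order]


rng = np.random.default_rng(0)
image = rng.random((32, 32))           # toy grayscale image
coeffs = block_dct(image)
low = keep_low_frequencies(coeffs, keep=16)
print(coeffs.shape, low.shape)         # (4, 4, 8, 8) (4, 4, 16)
```

In this toy setting a 32×32 input becomes a 4×4 grid with 16 channels, so any layer that follows (including its activations) operates on far fewer spatial positions than it would on the raw pixels. How many channels to keep, and how color images are handled, are design choices that this sketch does not attempt to reproduce from the paper.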