Fast and Accurate Homomorphic Softmax Evaluation

Wonhee Cho, Guillaume Hanrot, Taeseong Kim, Minje Park, Damien Stehlé
DOI: https://doi.org/10.1145/3658644.3670369
2024-10-15
Abstract: Homomorphic encryption is one of the main solutions for building secure and privacy-preserving Machine Learning as a Service. This motivates the development of homomorphic algorithms for the main building blocks of AI, typically for the components of the various types of neural network architectures. Among those components, we focus on the Softmax function, defined by $\mathrm{SM}(\mathbf{x}) = \left(\exp(x_i) / \sum_{j=1}^n \exp(x_j) \right)_{1\le i\le n}$. This function is deemed to be one of the most difficult to evaluate homomorphically, because of its multivariate nature and of the very large range of values for $\exp(x_i)$. The available homomorphic algorithms remain restricted, especially in large dimensions, while important applications such as Large Language Models (LLM) require computing Softmax over large dimensional vectors. In terms of multiplicative depth of the computation (a suitable measure of cost for homomorphic algorithms), our algorithm achieves $O(\log n)$ complexity for a fixed range of inputs, where $n$ is the Softmax dimension. Our algorithm is especially adapted to the situation where we must compute many Softmax at the same time, for instance, in the LLM situation. In that case, assuming that all Softmax calls are packed into $m$ ciphertexts, the asymptotic amortized multiplicative depth cost per ciphertext is, again over a fixed range, $O(1 + m/N)$ for $N$ the homomorphic ring degree. The main ingredient of our algorithms is a normalize-and-square strategy, which interlaces the exponential computation over a large range and normalization, decomposing both into stabler and cheaper smaller steps. Our experiments show, in practice, good accuracy and a gain of a factor 2.5 to 8 compared to state-of-the-art solutions.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the problem of computing the Softmax function efficiently and accurately under Homomorphic Encryption (HE). Softmax is used in neural networks to convert outputs into probability distributions, but it involves exponentials and a normalization step, both of which are hard to evaluate under HE, especially when the input vector has a large dimension.

#### Main challenges:

1. **Numerical stability**: The exponentials in Softmax can overflow or underflow, especially in fixed-point arithmetic.
2. **Difficulty in normalization**: Under HE, normalization (summing all components and inverting the sum) is very expensive unless the range of the normalization factor is well controlled.
3. **High-dimensional computation**: Applications such as large language models (LLM) require Softmax over high-dimensional vectors, which existing HE algorithms handle poorly.

#### Solutions:

The authors propose a "normalize-and-square" strategy that addresses these challenges as follows:

- **Step-by-step squaring and normalization**: Decompose the Softmax computation into several smaller and more stable steps, avoiding both a direct evaluation of the exponential over a large range and a single costly normalization.
- **Iterative algorithm**: Use the identity \( \text{Softmax}(x/2^{j - 1})_i=\frac{\text{Softmax}(x/2^{j})_i^2}{\sum_{t = 1}^n\text{Softmax}(x/2^{j})_t^2} \). Starting from Softmax of the scaled-down input \( x/2^k \), each squaring followed by a renormalization doubles the effective input, so \( k \) iterations yield \( \text{Softmax}(x) \).
- **Parallel computation**: Use the SIMD capability of the CKKS homomorphic encryption scheme to compute many Softmax functions simultaneously across multiple ciphertexts, improving amortized efficiency.
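The squaring step relies on a simple fact: squaring every entry of Softmax(y) and renormalizing gives Softmax(2y), because the squared denominators cancel. The following is a minimal plaintext sketch of that recurrence in NumPy, not the authors' homomorphic implementation. The function names, the parameter `k`, and the test vector are illustrative assumptions; a real CKKS evaluation would replace `np.exp` and the division with polynomial approximations on encrypted data.

```python
import numpy as np


def softmax_reference(x):
    """Standard max-shifted Softmax, used only as a plaintext baseline."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def softmax_normalize_and_square(x, k=6):
    """Plaintext sketch: compute Softmax(x / 2^k), then square and renormalize k times.

    Assumption: k is chosen so that x / 2^k lies in a small range where the
    exponential is easy to approximate.
    """
    x = np.asarray(x, dtype=np.float64)
    # Step 0: Softmax of the scaled-down input; exp stays close to 1 here.
    y = np.exp(x / 2.0**k)
    y = y / y.sum()
    # Each pass maps Softmax(x / 2^j) to Softmax(x / 2^(j-1)).
    for _ in range(k):
        y = y * y
        y = y / y.sum()
    return y


x = np.array([-20.0, -3.5, 0.0, 4.0, 12.0])
approx = softmax_normalize_and_square(x, k=6)
exact = softmax_reference(x)
max_abs_error = np.max(np.abs(approx - exact))
```

In this sketch every intermediate vector is itself a probability vector, so its entries stay in [0, 1] at each step. That bounded range is what makes the per-step normalization tractable compared with inverting one sum of very large exponentials.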
#### Experimental results:

- **Accuracy**: Experiments show that the algorithm achieves about 16 bits of accuracy in the worst case and about 20 bits on average.
- **Efficiency**: Compared with the best existing solutions, the algorithm is 2.5 to 8 times faster in practice, especially when many Softmax functions are computed in parallel.
- **Scalability**: The algorithm can handle 8192 Softmax computations of dimension 256 on a single-threaded CPU, and can handle the Softmax computations of large language models (such as LLaMa) on a GPU.

In conclusion, the paper improves both the efficiency and the accuracy of Softmax evaluation under homomorphic encryption, which supports Privacy-Preserving Machine Learning (PPML).