TPU as Cryptographic Accelerator

Rabimba Karanjai, Sangwon Shin, Wujie Xiong, Xinxin Fan, Lin Chen, Tianwei Zhang, Taeweon Suh, Weidong Shi, Veronika Kuchta, Francesco Sica, Lei Xu
DOI: https://doi.org/10.1145/3696843.3696844
2024-10-03
Abstract: Cryptographic schemes like Fully Homomorphic Encryption (FHE) and Zero-Knowledge Proofs (ZKPs), while offering powerful privacy-preserving capabilities, are often hindered by their computational complexity. Polynomial multiplication, a core operation in these schemes, is a major performance bottleneck. While algorithmic advancements and specialized hardware like GPUs and FPGAs have shown promise in accelerating these computations, the recent surge in AI accelerators (TPUs/NPUs) presents a new opportunity. This paper explores the potential of leveraging TPUs/NPUs to accelerate polynomial multiplication, thereby enhancing the performance of FHE and ZKP schemes. We present techniques to adapt polynomial multiplication to these AI-centric architectures and provide a preliminary evaluation of their effectiveness. We also discuss current limitations and outline future directions for further performance improvements, paving the way for wider adoption of advanced cryptographic tools.
Cryptography and Security
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is: **How to use tensor processing units (TPUs) to accelerate polynomial multiplication in cryptographic schemes such as fully homomorphic encryption (FHE) and zero-knowledge proofs (ZKPs), thereby improving the performance of these schemes**. Specifically:

1. **Problem Background**:
   - Cryptographic schemes such as FHE and ZKPs provide strong privacy protection, but their computational complexity is high. Polynomial multiplication in particular is the main performance bottleneck in these schemes.
   - GPUs and FPGAs have shown some potential for accelerating these computations, and the AI accelerators (TPUs/NPUs) that have emerged in recent years offer a new opportunity to accelerate polynomial multiplication.
2. **Research Objectives**:
   - Explore how to use TPUs/NPUs to accelerate polynomial multiplication in order to improve the performance of FHE and ZKP schemes.
   - Propose techniques for adapting polynomial multiplication to AI-centric architectures and give a preliminary evaluation of their effectiveness.
   - Discuss current limitations and propose directions for future improvement, to raise performance further and promote wider adoption of advanced cryptographic tools.
3. **Specific Challenges**:
   - **Large-coefficient Processing**: Polynomials used in cryptography usually have large coefficients, while TPUs are designed for machine-learning tasks and support only lower-precision data types. The paper therefore proposes using the residue number system (RNS) to represent and compute with large coefficients.
   - **High-degree Polynomial Processing**: The matrix multiplication hardware of a TPU has size limits, so it is difficult to handle directly the large matrices that correspond to high-degree polynomials. The paper therefore adopts a divide-and-conquer method that decomposes a high-degree polynomial into multiple low-degree polynomials.
4. **Solutions**:
   - **RNS Representation and Computation**: Decompose each large coefficient into multiple small residues through RNS, so that each residue instance can be computed independently and in parallel without modifying the original framework.
   - **Divide-and-conquer Method**: Decompose high-degree polynomials into multiple low-degree polynomials and process them recursively until they fit within the size limits of the TPU hardware.
5. **Contributions**:
   - Design key techniques that enable TPUs to be used for polynomial multiplication in various cryptographic schemes.
   - Develop a prototype to demonstrate the feasibility of the design.
   - Discuss potential improvements to TPU hardware and the corresponding algorithms to further accelerate polynomial multiplication.

Through these methods, the paper aims to make full use of the matrix multiplication capabilities of TPUs and to significantly improve the performance of cryptographic schemes such as FHE and ZKPs.
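To make the RNS idea in the summary above concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the moduli, the function names (`toeplitz`, `polymul_mod`, `crt`, `polymul_rns`), and the use of a Toeplitz matrix-vector product as a stand-in for the accelerator's matrix unit are all assumptions made for illustration. Each small modulus gives an independent low-precision product, and the Chinese Remainder Theorem recombines the residues into the large coefficients.

```python
from math import prod

# Hypothetical 8-bit prime moduli (pairwise coprime); chosen only for illustration.
MODULI = [251, 241, 239, 233]

def toeplitz(a, m):
    """Matrix T with T[i][j] = a[i - j], so that T @ b gives the product a * b."""
    n = len(a)
    return [[a[i - j] if 0 <= i - j < n else 0 for j in range(m)]
            for i in range(n + m - 1)]

def polymul_mod(a, b, q):
    """Multiply polynomials modulo q via one matrix-vector product on small residues."""
    a_q = [x % q for x in a]
    b_q = [x % q for x in b]
    T = toeplitz(a_q, len(b_q))
    return [sum(t * x for t, x in zip(row, b_q)) % q for row in T]

def crt(residues, moduli):
    """Recombine residues into the unique value in [0, product of moduli)."""
    M = prod(moduli)
    total = 0
    for r, q in zip(residues, moduli):
        Mq = M // q
        total += r * Mq * pow(Mq, -1, q)  # modular inverse, Python 3.8+
    return total % M

def polymul_rns(a, b, moduli=MODULI):
    """Run one independent product per modulus, then recombine each coefficient."""
    per_modulus = [polymul_mod(a, b, q) for q in moduli]
    return [crt(rs, moduli) for rs in zip(*per_modulus)]

if __name__ == "__main__":
    print(polymul_rns([1000, 2000, 3000], [4000, 5000]))
```

The recombined result is exact only when every true product coefficient is non-negative and smaller than the product of the moduli, so a real parameter choice would have to size the moduli set against the coefficient bound of the scheme.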
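The divide-and-conquer step described in the summary can be sketched in the same spirit. This is an assumed, simplified variant and not necessarily the paper's decomposition: it pads both operands to a power-of-two length, splits each polynomial into a low and a high half, and recurses on the four sub-products until an operand fits a hypothetical tile limit `MAX_TILE`, at which point a schoolbook product stands in for the hardware matrix multiply.

```python
MAX_TILE = 4  # hypothetical limit on operand length that the hardware can take directly

def schoolbook(a, b):
    """Base-case product; stands in for one hardware-sized multiplication."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c

def _add_into(dst, src, offset):
    """Accumulate src into dst starting at index offset."""
    for i, v in enumerate(src):
        dst[offset + i] += v

def _rec(a, b, max_tile):
    """Recursive product of two equal-length operands (length is a power of two)."""
    n = len(a)
    if n <= max_tile:
        return schoolbook(a, b)
    h = n // 2
    c = [0] * (2 * n - 1)
    # With a = a_lo + x^h * a_hi and b = b_lo + x^h * b_hi:
    # a * b = a_lo*b_lo + x^h * (a_lo*b_hi + a_hi*b_lo) + x^(2h) * a_hi*b_hi
    _add_into(c, _rec(a[:h], b[:h], max_tile), 0)
    _add_into(c, _rec(a[:h], b[h:], max_tile), h)
    _add_into(c, _rec(a[h:], b[:h], max_tile), h)
    _add_into(c, _rec(a[h:], b[h:], max_tile), 2 * h)
    return c

def polymul_dc(a, b, max_tile=MAX_TILE):
    """Pad non-empty operands to a power-of-two length, recurse, trim the padding."""
    out_len = len(a) + len(b) - 1
    n = 1
    while n < max(len(a), len(b)):
        n *= 2
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return _rec(a, b, max_tile)[:out_len]

if __name__ == "__main__":
    print(polymul_dc(list(range(1, 11)), list(range(10, 0, -1))))
```

The four sub-products at each level are independent of one another, which is what makes this kind of decomposition a natural fit for batched execution; whether the paper batches them this way is not stated in the summary.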