ProgramGalois: A Programmable Generator of Radix-4 Discrete Galois Transformation Architecture for Lattice-based Cryptography

Guangyan Li,Zewen Ye,Donglong Chen,Wangchen Dai,Gaoyu Mao,Kejie Huang,Ray C. C. Cheung
DOI: https://doi.org/10.1145/3689437
IF: 2.837
2024-08-27
ACM Transactions on Reconfigurable Technology and Systems
Abstract:Lattice-based cryptography (LBC) has been established as a prominent research field, with particular attention on post-quantum cryptography (PQC) and fully homomorphic encryption (FHE). As the implementing bottleneck of PQC and FHE, number theoretic transform (NTT) has been extensively studied. However, current works struggled with scalability, hindering their adaptation to various parameters, such as bit-width and polynomial length. In this paper, we proposed a novel Discrete Galois Transformation (DGT) algorithm utilizing the radix-4 variant to achieve a higher level of parallelism to the existing NTT. Furthermore, to implement the efficient radix-4 DGT adapting more LBCs, we proposed a set of scalable building blocks, including a modified Barrett modular multiplier accepting arbitrary modulus with only one integer multiplier, a radix-4 DGT butterfly unit, and a stream permutation network. The proposed modules are implemented on the Xilinx Virtex-7 and U250 FPGA to evaluate resource utilization and performance. Lastly, a design space exploration framework is proposed to generate optimized radix-4 DGT hardware constrained by polynomial and platform parameters. The sensitivity analysis showcases the generated hardware's performance and scalability. The implementation results on the Xilinx Virtex-7 and U250 FPGA show significant performance improvements over the state-of-the-art works, which reached at least 35%, 192%, and 68% area-time product improvements in terms of LUTs, BRAMs, and DSPs, respectively.
computer science, hardware & architecture
What problem does this paper attempt to address?