Computing-in-Memory for Performance and Energy-Efficient Homomorphic Encryption
Dayane Reis, Jonathan Takeshita, Taeho Jung, Michael Niemier, Xiaobo Sharon Hu
DOI: https://doi.org/10.1109/tvlsi.2020.3017595
2020-11-01
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Abstract: Homomorphic encryption (HE) allows direct computations on encrypted data. Despite numerous research efforts, the practicality of HE schemes remains to be demonstrated. In this regard, the enormous size of ciphertexts involved in HE computations degrades computational efficiency. Near-memory processing (NMP) and computing-in-memory (CiM), paradigms where computation is done within the memory boundaries, represent architectural solutions for reducing latency and energy associated with data transfers in data-intensive applications, such as HE. This article introduces CiM-HE, a CiM architecture that can support operations for the Brakerski/Fan–Vercauteren (B/FV) scheme, a somewhat HE scheme for general computation. CiM-HE hardware consists of customized peripherals, such as sense amplifiers, adders, bit shifters, and sequencing circuits. The peripherals are based on CMOS technology and could support computations with memory cells of different technologies. Circuit-level simulations are used to evaluate our CiM-HE framework assuming a 6T-SRAM memory. We compare our CiM-HE implementation against: 1) two optimized CPU HE implementations and 2) a field-programmable gate array (FPGA)-based HE accelerator implementation. Compared with a CPU solution, CiM-HE obtains speedups between $4.6\times$ and $9.1\times$ and energy savings between $266.4\times$ and $532.8\times$ for homomorphic multiplications (the most expensive HE operation). Also, a set of four end-to-end tasks, i.e., mean, variance, linear regression, and inference, are up to $1.1\times$, $7.7\times$, $7.1\times$, and $7.5\times$ faster (and $301.1\times$, $404.6\times$, $532.3\times$, and $532.8\times$ more energy efficient). Compared with CPU-based HE in previous work, CiM-HE obtains $14.3\times$ speedup and $>2600\times$ energy savings. Finally, our design offers $2.2\times$ speedup with $88.1\times$ energy savings compared with a state-of-the-art FPGA-based accelerator.
Engineering, Electrical & Electronic; Computer Science, Hardware & Architecture