A Scalable 2T-1FeFET Based Content Addressable Memory Design for Energy Efficient Data Search

Jiahao Cai, Hamza E. Barkam, Mohsen Imani, Kai Ni, Grace Li Zhang, Bing Li, Ulf Schlichtmann, Cheng Zhuo, Xunzhao Yin
DOI: https://doi.org/10.1109/tcad.2024.3493000
IF: 2.9
2024-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Content Addressable Memory (CAM) is widely used in advanced machine learning models and data-intensive applications for associative search tasks, thanks to its highly parallel pattern-matching capability. Most state-of-the-art CAM designs primarily aim to reduce the CAM cell area by utilizing nonvolatile memories (NVMs). However, there has been limited research on optimizing the design and energy efficiency of NVM-based CAMs for practical deployment in edge devices and AI hardware. This paper introduces a general, compact, and energy-efficient CAM design scheme that minimizes design overhead by using only one NVM device per cell. Our proposed CAM design realizes both binary CAM (BCAM) and multi-bit CAM (MCAM) by leveraging the binary and multi-level storage properties of NVM devices without additional cell overhead. Additionally, we propose an adaptive matchline (ML) precharge and discharge scheme to further optimize search energy by significantly reducing the ML voltage swing. Ferroelectric field-effect transistors (FeFETs) serve as representative NVMs in our proposed design, and we present a 2T-1FeFET CAM array incorporating a sense amplifier that implements the proposed ML scheme. Evaluation results show that our proposed 2T-1FeFET BCAM design achieves energy efficiency improvements of 6.64×/4.74×/9.14×/3.02× compared to CMOS/ReRAM/STT-MRAM/2FeFET BCAM arrays, while the 2T-1FeFET MCAM design achieves 8.25×/5.68×/56.35× better energy efficiency compared to ReRAM/3T-1FeFET/1FeFET-1R MCAM arrays. Benchmarking results demonstrate that our BCAM/MCAM approach provides 3.2×/3.7× and 2.0×/2.2× energy-delay product improvements over the 2T-2R and 2FeFET CAMs in accelerating query processing applications.
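
The associative search that the abstract refers to can be illustrated with a small behavioral sketch. This is only a software model of what a CAM lookup returns, not the paper's circuit or its matchline scheme: the function name `cam_search`, the example words, and the choice of 2-bit cells for the MCAM case are assumptions made for illustration. A hardware CAM compares the query against all rows at once, whereas this loop does so one row at a time.

```python
from typing import List, Sequence


def cam_search(rows: Sequence[Sequence[int]], query: Sequence[int]) -> List[int]:
    """Return the indices of stored rows in which every cell equals the query cell.

    Hypothetical helper: it models exact-match CAM behavior only. Each row
    stands for one matchline, which reports a match when no cell mismatches.
    """
    matches = []
    for idx, row in enumerate(rows):
        if len(row) != len(query):
            raise ValueError("query width must equal the stored word width")
        # The row matches only if all cells agree with the query.
        if all(stored == searched for stored, searched in zip(row, query)):
            matches.append(idx)
    return matches


# BCAM case: each cell stores one bit.
bcam_rows = [[1, 0, 1, 1], [0, 0, 1, 1], [1, 0, 1, 1]]
bcam_hits = cam_search(bcam_rows, [1, 0, 1, 1])

# MCAM case: each cell stores a multi-level value (assumed here: 2 bits, levels 0..3).
mcam_rows = [[3, 0, 2], [3, 1, 2], [0, 0, 2]]
mcam_hits = cam_search(mcam_rows, [3, 1, 2])

print(bcam_hits, mcam_hits)
```

The same comparison routine serves both cases because only the range of values per cell changes, which mirrors the abstract's point that one cell structure can realize both BCAM and MCAM.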