Abstract:Edge computing is a promising solution for handling high-dimensional, multispectral analog data from sensors and IoT devices for applications such as autonomous drones. However, edge devices' limited storage and computing resources make it challenging to perform complex predictive modeling at the edge. Compute-in-memory (CiM) has emerged as a principal paradigm to minimize energy for deep learning-based inference at the edge. Nevertheless, integrating storage and processing complicates memory cells and/or memory peripherals, essentially trading off area efficiency for energy efficiency. This paper proposes a novel solution to improve area efficiency in deep learning inference tasks. The proposed method employs two key strategies. Firstly, a Frequency domain learning approach uses binarized Walsh-Hadamard Transforms, reducing the necessary parameters for DNN (by 87% in MobileNetV2) and enabling compute-in-SRAM, which better utilizes parallelism during inference. Secondly, a memory-immersed collaborative digitization method is described among CiM arrays to reduce the area overheads of conventional ADCs. This facilitates more CiM arrays in limited footprint designs, leading to better parallelism and reduced external memory accesses. Different networking configurations are explored, where Flash, SA, and their hybrid digitization steps can be implemented using the memory-immersed scheme. The results are demonstrated using a 65 nm CMOS test chip, exhibiting significant area and energy savings compared to a 40 nm-node 5-bit SAR ADC and 5-bit Flash ADC. By processing analog data more efficiently, it is possible to selectively retain valuable data from sensors and alleviate the challenges posed by the analog data deluge.

Design of Computing-in-Memory (CIM) with Vertical Split-Gate Flash Memory for Deep Neural Network (DNN) Inference Accelerator

A Robust 8-Bit Non-Volatile Computing-in-Memory Core for Low-Power Parallel MAC Operations.

Neural Network Acceleration and Voice Recognition with a Flash-based In-Memory Computing SoC

An Approach of 3D NAND Flash Based Nonvolatile Computing-In-Memory (nvCIM) Accelerator for Deep Neural Networks (DNNs) with Calibration and Read Disturb Analysis

Benchmark of the Compute-in-Memory-Based DNN Accelerator With Area Constraint

Simulation of a Fully Digital Computing-in-Memory for Non-Volatile Memory for Artificial Intelligence Edge Applications

Flash Memory Array for Efficient Implementation of Deep Neural Networks

On Designing Efficient and Reliable Nonvolatile Memory-Based Computing-In-Memory Accelerators

A Heterogeneous Microprocessor for Intermittent AI Inference Using Nonvolatile-SRAM-based Compute-In-Memory

Containing Analog Data Deluge at Edge through Frequency-Domain Compression in Collaborative Compute-in-Memory Networks

An Emerging NVM CIM Accelerator with Shared-Path Transpose Read and Bit-Interleaving Weight Storage for Efficient On-Chip Training in Edge Devices

A Digital SRAM Computing-in-Memory Design Utilizing Activation Unstructured Sparsity for High-Efficient DNN Inference

CLEAR: a Full-Stack Chip-in-loop Emulator for Analog RRAM Based Computing-in-memory System

Energy-efficient SNN Architecture using 3nm FinFET Multiport SRAM-based CIM with Online Learning

A Reconfigurable Computing-in-Memory Accelerator with Dynamic Group-Based Dataflow and Dual-Input Macro Designs

A Non-Volatile Computing-In-Memory Framework with Margin Enhancement Based CSA and Offset Reduction Based ADC.

A 2.75-to-75.9tops/w Computing-in-Memory NN Processor Supporting Set-Associate Block-Wise Zero Skipping and Ping-Pong CIM with Simultaneous Computation and Weight Updating.

A 28-Nm 36 Kb SRAM CIM Engine with 0.173 $\mu $m$^{2}$ 4T1T Cell and Self-Load-0 Weight Update for AI Inference and Training Applications

An Edram Based Computing-in-Memory Macro with Full-Valid-Storage and Channel-Wise-Parallelism for Depthwise Neural Network

TensorCIM: Digital Computing-in-Memory Tensor Processor with Multichip-Module-Based Architecture for Beyond-NN Acceleration

Cambricon-M: A Fibonacci-Coded Charge-Domain SRAM-Based CIM Accelerator for DNN Inference