Abstract: Deep convolutional neural networks (DCNNs) have achieved state-of-the-art performance in classification, natural language processing (NLP), and regression tasks. However, a large gap remains between DCNNs and the human brain in terms of computational efficiency. Inspired by neural synaptic plasticity and stochastic computing (SC), we propose neural synaptic plasticity-inspired computing (NSPC), which simulates the human brain's neural network activity for inference tasks using simple logic gates. In NSPC, the multiply-accumulate (MAC) operation is realized through wire connectivity, which requires only bundles of wires and small-width adders. In this way, NSPC imitates the structure of neural synaptic plasticity from the perspective of circuit wire connections. Furthermore, following the principle of NSPC, we use a data mapping method to convert convolution operations into matrix multiplications. Based on the NSPC methodology, a fully pipelined, low-latency architecture is designed. The proposed NSPC accelerator exhibits high hardware efficiency while maintaining a comparable level of network accuracy. The NSPC-based DCNN accelerator (NSPC-CNN) processes a DCNN at $1.5625$ M images/s with a power dissipation of $15.42~\mathrm{W}$ and an area of $36.4~\mathrm{mm}^{2}$. The NSPC-based deep neural network (DNN) accelerator (NSPC-DNN), which implements a DNN with three fully connected layers, consumes only $6.6~\mathrm{mm}^{2}$ of area and $2.93~\mathrm{W}$ of power, and achieves a throughput of $400$ M images/s. Compared with conventional fixed-point implementations, the NSPC-CNN achieves $2.77\times$ the area efficiency and $2.25\times$ the power efficiency; the proposed NSPC-DNN exhibits $2.31\times$ the area efficiency and $2.09\times$ the power efficiency.
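The abstract mentions a data mapping that converts convolution operations into matrix multiplications but does not give its details. As a rough illustration only, the sketch below uses the generic im2col-style unfolding, which is an assumption and not necessarily the mapping used in NSPC. The function names `im2col`, `conv_as_matmul`, and `conv_direct` are hypothetical, and the example assumes a single-channel input, stride 1, and no padding.

```python
def im2col(image, k):
    """Unfold every k x k window of a 2-D image (list of rows) into one flat row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append([image[i + di][j + dj] for di in range(k) for dj in range(k)])
    return rows


def conv_as_matmul(image, kernel):
    """Stride-1, unpadded convolution (cross-correlation form) as a matrix-vector product."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, k)  # shape: (number of windows) x (k * k)
    out_w = len(image[0]) - k + 1
    # Each output pixel is one dot product, i.e. one MAC chain.
    flat_out = [sum(p * q for p, q in zip(patch, flat_kernel)) for patch in patches]
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(len(flat_out) // out_w)]


def conv_direct(image, kernel):
    """Reference sliding-window implementation, used only to check the mapping."""
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(out_w)]
            for i in range(out_h)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

# Both routes should agree on every output pixel.
assert conv_as_matmul(image, kernel) == conv_direct(image, kernel)
```

Once the windows are unfolded, every output pixel reduces to a dot product between a patch row and the flattened kernel, which is the form a MAC array (or, in NSPC, the wire-and-adder structure described above) would consume.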

An Energy-Efficient Mixed-Bitwidth Systolic Accelerator for NAS-Optimized Deep Neural Networks

An Energy-Efficient Bit-Split-and-Combination Systolic Accelerator for NAS-Based Multi-Precision Convolution Neural Networks

A Precision-Scalable Energy-Efficient Bit-Split-and-Combination Vector Systolic Accelerator for NAS-Optimized DNNs on Edge

A High Performance Multi-Bit-Width Booth Vector Systolic Accelerator for NAS Optimized Deep Learning Neural Networks

A Convolutional Neural Network Accelerator Architecture with Fine-Granular Mixed Precision Configurability.

A fine-grained mixed precision DNN accelerator using a two-stage big-little core RISC-V MCU.

An Energy-Efficient Accelerator for Hybrid Bit-Width DNNs

A Low-Power Sparse Convolutional Neural Network Accelerator with Pre-Encoding Radix-4 Booth Multiplier

Low-Complexity Precision-Scalable Multiply-Accumulate Unit Architectures for Deep Neural Network Accelerators

Bit-Offsetter: A Bit-serial DNN Accelerator with Weight-offset MAC for Bit-wise Sparsity Exploitation

An Energy-Efficient Mixed-Bit CNN Accelerator With Column Parallel Readout for ReRAM-Based In-Memory Computing

An Energy-Efficient Mixed-Bit ReRAM-based Computing-in-Memory CNN Accelerator with Fully Parallel Readout.

Addressing the issue of processing element under-utilization in general-purpose systolic deep learning accelerators

EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration

An Energy-Efficient Mixed-Signal Parallel Multiply-Accumulate (MAC) Engine Based on Stochastic Computing

An Energy-Efficient Time-Domain Binary Neural Network Accelerator with Error-Detection in 28nm CMOS

Neural Synaptic Plasticity-Inspired Computing: A High Computing Efficient Deep Convolutional Neural Network Accelerator

Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration.

Hybrid Stochastic-Binary Computing for Low-Latency and High-Precision Inference of CNNs

Exploiting Bit Sparsity in Both Activation and Weight in Neural Networks Accelerators

Leveraging Bit-Serial Architectures for Hardware-Oriented Deep Learning Accelerators with Column-Buffering Dataflow