Abstract: Deep convolutional neural networks (DCNNs) have achieved state-of-the-art performance in classification, natural language processing (NLP), and regression tasks. However, a large gap in computational efficiency remains between DCNNs and the human brain. Inspired by neural synaptic plasticity and stochastic computing (SC), we propose neural synaptic plasticity-inspired computing (NSPC), which simulates the human brain's neural network activity for inference tasks using simple logic gates. In NSPC, multiplication and accumulation (MAC) is realized through wire connectivity, which requires only bundles of wires and small-width adders. In this way, NSPC imitates the structure of neural synaptic plasticity from the perspective of circuit wire connections. Furthermore, following the NSPC principle, we use a data mapping method to convert convolution operations into matrix multiplications. Based on the NSPC methodology, a fully pipelined, low-latency architecture is designed. The proposed NSPC accelerator exhibits high hardware efficiency while maintaining a comparable level of network accuracy. The NSPC-based DCNN accelerator (NSPC-CNN) processes a DCNN at $1.5625M$ images/s with a power dissipation of $15.42~W$ and an area of $36.4~mm^{2}$. The NSPC-based deep neural network (DNN) accelerator (NSPC-DNN), which implements a DNN with three fully connected layers, consumes only $6.6~mm^{2}$ of area and $2.93~W$ of power, and achieves a throughput of $400M$ images/s. Compared with conventional fixed-point implementations, the NSPC-CNN achieves $2.77\times$ the area efficiency and $2.25\times$ the power efficiency; the proposed NSPC-DNN achieves $2.31\times$ the area efficiency and $2.09\times$ the power efficiency.
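The abstract does not spell out the data mapping that turns convolutions into matrix multiplications. As a rough illustration of the general idea only, the sketch below uses the widely known im2col-style unrolling, which is an assumption here and not necessarily the mapping used in NSPC. The function names `im2col` and `conv2d_as_matmul`, the stride of 1, and the absence of padding are all hypothetical choices made for the example.

```python
# Illustrative sketch only: a generic im2col-style mapping, assumed here
# for explanation; the paper's actual NSPC data mapping may differ.

def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one row.

    Assumes stride 1 and no padding (hypothetical choices for this sketch).
    """
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            # Flatten the patch whose top-left corner is (i, j).
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_as_matmul(image, kernel):
    """Compute a 2-D convolution (cross-correlation form, as is usual in
    CNNs) as a product of the patch matrix with the flattened kernel."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, kh, kw)
    # Each output pixel is one dot product, i.e. one row of the
    # patch-matrix times kernel-vector multiplication.
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in patches]
    out_w = len(image[0]) - kw + 1
    # Reshape the flat result back into a 2-D feature map.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    kernel = [[1, 0],
              [0, 1]]
    print(conv2d_as_matmul(image, kernel))
```

After this unrolling, every output value is a single dot product, so a convolution layer can reuse the same MAC datapath as a fully connected layer. This is the usual motivation for such a mapping in accelerator designs.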
