Abstract: In the past decade, advances in Artificial Neural Networks (ANNs) have allowed them to perform extremely well on a wide range of tasks. In fact, they have reached human parity on tasks such as image recognition. Unfortunately, the accuracy of these ANNs comes at the expense of a large number of cache and/or memory accesses and compute operations. Spiking Neural Networks (SNNs), a type of neuromorphic, or brain-inspired, network, have recently gained significant interest as power-efficient alternatives to ANNs because they are sparse, accessing very few weights, and typically use only addition operations instead of the more power-intensive multiply-and-accumulate (MAC) operations. The vast majority of neuromorphic hardware designs support rate-encoded SNNs, where the information is encoded in spike rates. Rate encoding can be seen as an inefficient encoding scheme because it involves the transmission of a large number of spikes. A more efficient encoding scheme, Time-To-First-Spike (TTFS) encoding, encodes information in the relative time of arrival of spikes. While TTFS-encoded SNNs are more efficient than rate-encoded SNNs, they have, up to now, performed poorly in terms of accuracy compared to previous methods. Hence, in this work, we aim to overcome the limitations of TTFS-encoded neuromorphic systems. To accomplish this, we propose: (1) a novel optimization algorithm for TTFS-encoded SNNs converted from ANNs and (2) a novel hardware accelerator for TTFS-encoded SNNs, with a scalable and low-power design. Overall, our work in TTFS encoding and training improves the accuracy of SNNs to achieve state-of-the-art results on MNIST MLPs, while reducing power consumption by 1.46$\times$ over the state-of-the-art neuromorphic hardware.
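
To make the contrast between the two encodings concrete, the following minimal sketch encodes a normalized activation in both ways and compares the spike counts. Both schemes are illustrative assumptions rather than the paper's exact encoders: `rate_encode` uses a simple deterministic accumulator, `ttfs_encode` maps larger values to earlier spike times, and the names `rate_encode`, `ttfs_encode`, and `num_steps` are hypothetical.

```python
def rate_encode(x, num_steps):
    """Rate code (illustrative): spike count grows with x.

    x is assumed to lie in [0, 1]. An accumulator adds x at every time
    step and emits a spike each time it reaches 1.0.
    """
    spikes = []
    acc = 0.0
    for step in range(num_steps):
        acc += x
        if acc >= 1.0:
            spikes.append(step)
            acc -= 1.0
    return spikes


def ttfs_encode(x, num_steps):
    """TTFS code (illustrative): at most one spike, earlier for larger x.

    x is assumed to lie in [0, 1]. A value of 0 produces no spike at all.
    """
    if x <= 0.0:
        return []
    return [int((1.0 - x) * (num_steps - 1))]


if __name__ == "__main__":
    for value in (0.5, 0.75):
        rate = rate_encode(value, num_steps=8)
        ttfs = ttfs_encode(value, num_steps=8)
        print(f"x={value}: rate code uses {len(rate)} spikes, "
              f"TTFS code uses {len(ttfs)} spike at step {ttfs[0]}")
```

Under these assumptions the rate code needs a number of spikes proportional to the value, whereas the TTFS code always transmits at most one spike and carries the value in its timing.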

What problem does this paper attempt to address?

The paper asks how to improve both the energy efficiency and the accuracy of Spiking Neural Networks (SNNs) that use Time-to-First-Spike (TTFS) encoding, so that they reach accuracy comparable to that of Artificial Neural Networks (ANNs). Specifically:

1. **Trade-off between the efficiency and accuracy of TTFS encoding**: TTFS encoding is more energy-efficient than traditional rate encoding because it needs only one spike to transmit information, but its classification accuracy is usually lower. The authors therefore seek a method that improves the accuracy of TTFS-encoded SNNs while preserving this efficiency advantage.
2. **Design of hardware accelerators**: Existing hardware accelerators (such as IBM's TrueNorth) perform well on rate-based SNNs but fail to fully exploit sparsity when running time-based SNNs. The authors therefore propose a new accelerator design that better supports TTFS-encoded SNNs and significantly reduces power consumption.

To solve these problems, the authors make two main contributions:

1. **Optimization algorithm**: A new training algorithm converts pre-trained ANNs into TTFS-encoded SNNs and fine-tunes the network weights to reduce the errors that accumulate during conversion. This brings TTFS-encoded SNNs close to ANN accuracy (a difference of only 0.2%), making them suitable for tasks traditionally handled by ANNs.
2. **Hardware accelerator**: A new accelerator named You Only Spike Once (YOSO) is optimized specifically for TTFS-encoded SNNs. By exploiting the sparsity of SNNs, it significantly reduces the number of memory accesses and thereby improves energy efficiency.

Through these improvements, the authors show that TTFS-encoded SNNs achieve state-of-the-art results for Multi-Layer Perceptrons (MLPs) on the MNIST dataset, while consuming 1.46 times less power than the existing state-of-the-art neuromorphic hardware.

### Formula summary

In SNNs with TTFS encoding, the dynamics of the membrane potential can be expressed as:

\[ \frac{dV_i^{\text{mem}}(t)}{dt}=\sum_{j\in\Gamma_i^-}w_{ij}\,H(t - t_j)+b_i \]

where:

- $V_i^{\text{mem}}$ is the membrane potential of neuron $i$,
- $w_{ij}$ is the weight of the synaptic connection from $j$ to $i$,
- $t_j$ is the time of the first spike of the presynaptic neuron $j$,
- $H$ is the Heaviside step function, so each input contributes a constant current $w_{ij}$ once it has spiked,
- $\Gamma_i^-$ is the set of all presynaptic neurons that generate spikes before $t_i$,
- $b_i$ is the bias of neuron $i$.

To determine the first spike time $t_i$ of each neuron $i$, integrate the dynamics from $0$ to $t_i$ (with the potential starting at zero) and set the membrane potential equal to the threshold $\theta$:

\[ \theta=\sum_{j\in\Gamma_i^-}w_{ij}(t_i - t_j)+b_i t_i \]

After rearranging the terms, the first spike time $t_i$ can be expressed as:

\[ t_i=\frac{1}{\mu_i}\left(\theta+\sum_{j\in\Gamma_i^-}w_{ij}t_j\right) \]

where:

\[ \mu_i=\sum_{j\in\Gamma_i^-}w_{ij}+b_i \]

Furthermore, the instantaneous firing rate $r_i$
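
The closed-form expression for $t_i$ depends on the causal set $\Gamma_i^-$, which itself depends on $t_i$. One common way to resolve this, sketched below, is to grow the causal set in order of input spike time and accept the first candidate $t_i$ that falls inside the interval where that set is valid. This is a minimal illustration under stated assumptions (non-negative input spike times, $\theta > 0$, membrane potential starting at zero), not the paper's implementation; the function name `first_spike_time` and its parameters are hypothetical.

```python
import math


def first_spike_time(weights, spike_times, bias=0.0, theta=1.0):
    """Return the first spike time t_i, or math.inf if the neuron never fires.

    Evaluates t_i = (theta + sum_j w_ij * t_j) / mu_i with
    mu_i = sum_j w_ij + b_i over the causal set of inputs that
    spiked before t_i.
    """
    # Visit presynaptic spikes in order of arrival.
    order = sorted(range(len(spike_times)), key=lambda j: spike_times[j])
    w_sum = 0.0    # running sum of w_ij over the current causal set
    wt_sum = 0.0   # running sum of w_ij * t_j over the current causal set
    t_prev = 0.0   # start of the interval in which the causal set is valid

    for k in range(len(order) + 1):
        # The current causal set is valid until the next input arrives.
        t_next = spike_times[order[k]] if k < len(order) else math.inf
        mu = w_sum + bias
        if mu > 0.0:  # the potential only rises towards theta if mu_i > 0
            t = (theta + wt_sum) / mu
            if t_prev <= t <= t_next:
                return t
        if k < len(order):
            # Add the next presynaptic spike to the causal set.
            j = order[k]
            w_sum += weights[j]
            wt_sum += weights[j] * spike_times[j]
            t_prev = t_next
    return math.inf


if __name__ == "__main__":
    # Two inputs with unit weights spiking at t=1 and t=2, no bias.
    print(first_spike_time([1.0, 1.0], [1.0, 2.0]))
```

In the usage example, only the first input is in the causal set when the threshold is reached, so $\mu_i = 1$ and $t_i = (1 + 1)/1 = 2$.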

You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference to ANN-Level Accuracy

A TTFS-based energy and utilization efficient neuromorphic CNN accelerator

PT-Spike: A Precise-Time-Dependent Single Spike Neuromorphic Architecture with Efficient Supervised Learning

Spike Trains Encoding and Threshold Rescaling Method for Deep Spiking Neural Networks

Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training

A Time-to-first-spike Coding and Conversion Aware Training for Energy-Efficient Deep Spiking Neural Network Processor Design

Reconsidering the energy efficiency of spiking neural networks

Boosting Throughput and Efficiency of Hardware Spiking Neural Accelerators using Time Compression Supporting Multiple Spike Codes

30.2 A 22nm 0.26nW/Synapse Spike-Driven Spiking Neural Network Processing Unit Using Time-Step-First Dataflow and Sparsity-Adaptive In-Memory Computing

MF-DSNN: An Energy-efficient High-performance Multiplication-free Deep Spiking Neural Network Accelerator

Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural networks: from Algorithms to Technology

FSpiNN: An Optimization Framework for Memory- and Energy-Efficient Spiking Neural Networks

An Energy Efficient Residual Spiking Neural Network Accelerator with Ternary Spikes

To Spike or Not To Spike: A Digital Hardware Perspective on Deep Learning Acceleration

A 28-nm 0.34-pJ/SOP Spike-Based Neuromorphic Processor for Efficient Artificial Neural Network Implementations

NBSSN: A Neuromorphic Binary Single-Spike Neural Network for Efficient Edge Intelligence

Benchmarking Artificial Neural Network Architectures for High-Performance Spiking Neural Networks

SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks

Exploiting Temporal-Unrolled Parallelism for Energy-Efficient SNN Acceleration

Enabling Efficient On-Edge Spiking Neural Network Acceleration with Highly Flexible FPGA Architectures

A 0.67-to-5.4 TSOPs/W Spiking Neural Network Accelerator With 128/256 Reconfigurable Neurons and Asynchronous Fully Connected Synapses