Abstract: Event cameras are bio-inspired sensors that respond to local changes in light intensity and offer low latency, high energy efficiency, and high dynamic range. Meanwhile, Spiking Neural Networks (SNNs) have gained significant attention for their efficiency and fault tolerance. Combining the energy efficiency of event cameras with the spike-based processing of SNNs could enable ultra-low-power applications such as action recognition. However, existing approaches often convert asynchronous events into conventional frames, which adds data-mapping overhead and sacrifices sparsity, contradicting the design philosophy of both SNNs and event cameras. To address this challenge, we propose SpikePoint, a novel end-to-end point-based SNN architecture. SpikePoint processes sparse event cloud data directly, extracting both global and local features through a single-stage structure. Leveraging the surrogate training method, SpikePoint achieves high accuracy with few parameters and low power consumption, specifically by employing an identity-mapping feature extractor across diverse datasets. SpikePoint achieves state-of-the-art (SOTA) performance on four event-based action recognition datasets using only 16 timesteps, surpassing other SNN methods. Moreover, it achieves SOTA performance across all methods on three datasets while using approximately 0.3% of the parameters and 0.5% of the power consumption of artificial neural networks (ANNs). These results emphasize the significance of point clouds and pave the way for many ultra-low-power event-based data processing applications.

SpikingViT: a Multi-scale Spiking Vision Transformer Model for Event-based Object Detection

Spiking Transformers for Event-based Single Object Tracking

Event-based Action Recognition Using Motion Information and Spiking Neural Networks

Training Robust Spiking Neural Networks with ViewPoint Transform and SpatioTemporal Stretching

Hierarchical Spiking-Based Model for Efficient Image Classification with Enhanced Feature Extraction and Encoding

A dynamic vision sensor object recognition model based on trainable event-driven convolution and spiking attention mechanism

A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation

An Event-based Feature Representation Method for Event Stream Classification Using Deep Spiking Neural Networks

Spiking Neural Network Recognition Method Based on Dynamic Visual Motion Features

CSNN: an Augmented Spiking Based Framework with Perceptron-Inception

Deep CovDenseSNN: A Hierarchical Event-Driven Dynamic Framework with Spiking Neurons in Noisy Environment

Spiking Neural Networks with Dynamic Time Steps for Vision Transformers

Sparser spiking activity can be better: Feature Refine-and-Mask spiking neural network for event-based visual recognition

Spike-HAR++: an Energy-Efficient and Lightweight Parallel Spiking Transformer for Event-Based Human Action Recognition

Spiking Neural Network as Adaptive Event Stream Slicer

SFOD: Spiking Fusion Object Detector

DS2TA: Denoising Spiking Transformer with Attenuated Spatiotemporal Attention

SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition

Distractor-Aware Event-Based Tracking

An Event Coding Method Based on Frame Images with Dynamic Vision Sensor Modeling

Training Robust Spiking Neural Networks on Neuromorphic Data with Spatiotemporal Fragments