SFOD: Spiking Fusion Object Detector

Yimeng Fan, Wei Zhang, Changsong Liu, Mingyang Li, Wenrui Lu
2024-03-22
Abstract: Event cameras, characterized by high temporal resolution, high dynamic range, low power consumption, and high pixel bandwidth, offer unique capabilities for object detection in specialized contexts. Despite these advantages, the inherent sparsity and asynchrony of event data pose challenges to existing object detection algorithms. Spiking Neural Networks (SNNs), inspired by the way the human brain codes and processes information, offer a potential solution to these difficulties. However, their performance in object detection using event cameras is limited in current implementations. In this paper, we propose the Spiking Fusion Object Detector (SFOD), a simple and efficient approach to SNN-based object detection. Specifically, we design a Spiking Fusion Module, achieving the first-time fusion of feature maps from different scales in SNNs applied to event cameras. Additionally, through integrating our analysis and experiments conducted during the pretraining of the backbone network on the NCAR dataset, we delve deeply into the impact of spiking decoding strategies and loss functions on model performance. Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset. Experimental results on the GEN1 detection dataset demonstrate that the SFOD achieves a state-of-the-art mAP of 32.1%, outperforming existing SNN-based approaches. Our research not only underscores the potential of SNNs in object detection with event cameras but also propels the advancement of SNNs. Code is available at
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses object detection with event cameras. Event cameras capture brightness changes with high temporal resolution, high dynamic range, and low power consumption, but the sparsity and asynchrony of their output pose challenges to existing object detection algorithms. Spiking Neural Networks (SNNs), inspired by how the human brain codes and processes information, are well suited to sparse event data, yet current SNN-based detectors perform poorly on event-camera data and, in particular, lack multi-scale feature fusion. To close this gap, the paper proposes the Spiking Fusion Object Detector (SFOD), an SNN-based detector built around a Spiking Fusion Module that, for the first time in SNNs applied to event cameras, fuses feature maps from different scales to strengthen detection. The paper also examines how spiking decoding strategies and loss functions affect model performance, drawing on analysis and experiments from pretraining the backbone network on the NCAR dataset, and finds that combining Spiking Rate Decoding with a Mean Squared Error (MSE) loss gives the best classification performance. With this combination, the model reaches 93.7% classification accuracy on NCAR, outperforming other SNN-based methods. On the GEN1 detection dataset, SFOD achieves a mean average precision (mAP) of 32.1%, outperforming existing SNN-based approaches. The contributions of the paper are the first implementation of multi-scale feature fusion in SNNs for event cameras, a comprehensive analysis of spiking decoding strategies and loss functions in SNNs, and an efficient SNN-based object detection model, SFOD. The code has been open-sourced on GitHub.
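The pairing of Spiking Rate Decoding with an MSE loss can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that rate decoding means averaging each output neuron's binary spikes over the simulation time steps, and that the MSE target is a one-hot vector for the true class. The function names, the two-class setup, and the toy spike train are hypothetical and chosen only for illustration.

```python
from typing import List


def spike_rate_decode(spikes: List[List[int]]) -> List[float]:
    """Turn output spikes into per-class firing rates.

    spikes[t][c] is 1 if output neuron c fired at time step t, else 0.
    Assumption: rate decoding is the mean spike count over time steps.
    """
    num_steps = len(spikes)
    num_classes = len(spikes[0])
    return [sum(step[c] for step in spikes) / num_steps
            for c in range(num_classes)]


def mse_loss(rates: List[float], label: int) -> float:
    """Mean squared error between firing rates and a one-hot target."""
    target = [1.0 if c == label else 0.0 for c in range(len(rates))]
    return sum((r - t) ** 2 for r, t in zip(rates, target)) / len(rates)


# Hypothetical toy input: 4 time steps, 2 output neurons (one per class).
spikes = [
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 0],
]
label = 0  # assumed ground-truth class for this sample

rates = spike_rate_decode(spikes)       # firing rate of each output neuron
loss = mse_loss(rates, label)           # training signal under this sketch
predicted = rates.index(max(rates))     # class with the highest firing rate

print(rates, loss, predicted)
```

In this sketch the firing rates lie in [0, 1], so they can be compared directly against a one-hot target, and the predicted class is simply the output neuron that fired most often.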