DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios

Yang Li, Jianli Xiao
2024-09-02
Abstract: Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios. With the rapid development of deep learning technology, CNN-based YOLO real-time object detectors have gained significant attention. However, the local focus of CNNs results in performance bottlenecks. To further enhance detector performance, researchers have introduced Transformer-based self-attention mechanisms to leverage global receptive fields, but their quadratic complexity incurs substantial computational costs. Recently, Mamba, with its linear complexity, has made significant progress through global selective scanning. Inspired by Mamba's outstanding performance, we propose a novel object detector: DS MYOLO. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Additionally, we introduce an efficient channel attention convolution (ECAConv) that enhances cross-channel feature interaction while maintaining low computational complexity. Extensive experiments on the CCTSDB 2021 and VLD-45 driving-scenario datasets demonstrate that DS MYOLO exhibits significant potential and competitive advantage among similarly scaled YOLO series real-time object detectors.
Computer Vision and Pattern Recognition, Artificial Intelligence
### What problem does this paper attempt to address?

This paper addresses real-time object detection in driving scenarios. Specifically, the authors propose a new real-time object detector called DS MYOLO, with the primary goal of improving detection accuracy while preserving real-time performance.

#### The main contributions include:

1. **Simplified Selective Scan Fusion Block (SimVSS Block)**: Captures global feature information through simplified selective scanning and effectively integrates the deep features of the network.
2. **Efficient Channel Attention Convolution (ECAConv)**: Introduces an efficient channel attention convolution that enhances cross-channel feature interaction while maintaining low computational complexity.
3. **DS MYOLO Models of Different Scales**: Designs DS MYOLO real-time object detectors at several scales and validates them on the public datasets CCTSDB 2021 and VLD-45, where they show competitive advantages over similarly scaled YOLO series real-time detectors.

Through these improvements, DS MYOLO achieves a better balance between real-time performance and detection accuracy, particularly in driving-scenario tasks such as traffic sign detection and vehicle logo detection.
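The summary above does not describe how ECAConv is built, so the following is only a minimal sketch of the general efficient channel attention idea it is named after: pool each channel to a single value, run a small 1D convolution across the channel axis, and use the sigmoid of the result to rescale each channel. The function name `eca_channel_attention`, the fixed kernel weights, and the toy feature map are all assumptions for illustration. In a trained network the kernel would be learned, and how the authors combine this gate with a convolution layer is not specified here.

```python
import math


def eca_channel_attention(feature_map, kernel):
    """Rescale each channel by a gate computed from neighbouring channels.

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights shared by all channels
            (hypothetical fixed values; normally these are learned).
    Returns (scaled_feature_map, gates).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    num_channels = len(feature_map)
    pad = len(kernel) // 2

    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))

    # 2. 1D convolution along the channel axis with zero padding, so each
    #    channel interacts only with its nearest neighbours.
    gates = []
    for c in range(num_channels):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - pad
            if 0 <= idx < num_channels:
                acc += w * pooled[idx]
        # 3. Sigmoid maps the response to a gate in (0, 1).
        gates.append(1.0 / (1.0 + math.exp(-acc)))

    # 4. Multiply every activation in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in channel]
              for channel, g in zip(feature_map, gates)]
    return scaled, gates


# Toy input: 3 channels of size 2 x 2 (made-up values).
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
scaled, gates = eca_channel_attention(fmap, kernel=[0.5, 1.0, 0.5])
```

In this sketch the gating step uses only `len(kernel)` weights regardless of the number of channels, which is one way to read the "low computational complexity" claim. The cost of the real ECAConv depends on details not given in this summary.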