Real-time traffic sign detection network based on Swin Transformer

Wei Zhu, Yibin Ying, Yayu Zheng, Yikai Chen, Shucheng Huang
DOI: https://doi.org/10.21203/rs.3.rs-3299732/v1
2023-01-01
Abstract: In the field of autonomous driving, traffic sign detection remains a significant challenge, especially the real-time detection of medium and small targets, where the difficulty of detecting small objects lowers accuracy. To address these challenges, we propose a real-time traffic sign detection algorithm based on the Swin Transformer (RTSDST) that improves computational performance and accuracy for multi-scale target detection on SoCs installed onboard autonomous driving vehicles. Our approach includes a head specifically designed for detecting tiny objects, followed by Swin Transformer blocks that capture the spatial and channel dependencies of the feature maps, which improves the accuracy of detecting targets of varying sizes. To efficiently identify regions of interest in large-coverage images, we employ a Residual Convolutional Attention Module to generate sequential feature maps along the channel and spatial dimensions and weigh them against the original map. We evaluate the effectiveness of the proposed RTSDST on Tsinghua-Tencent 100K (TT100K), a realistic traffic sign detection dataset that includes medium and small traffic sign targets. The evaluation results show that RTSDST performs well in multi-scale scenes. We also evaluated our network on the VisDrone dataset for small target detection, where our method achieves state-of-the-art performance on small targets.
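The attention step described in the abstract (attention applied sequentially along the channel and then the spatial dimension, with the result combined with the original map) can be illustrated with a minimal sketch. The abstract does not specify the module's internals, so everything below is an assumption: the function names are hypothetical, the learned layers a real module would contain (for example a shared MLP and a convolution) are replaced by parameter-free pooling followed by a sigmoid, and the residual combination is assumed to be an element-wise addition. It is not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Return one weight per channel from its average- and max-pooled values.

    Assumption: a real module would pass these descriptors through learned
    layers; here they are simply summed and squashed with a sigmoid.
    """
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        avg = sum(values) / len(values)
        weights.append(sigmoid(avg + max(values)))
    return weights


def spatial_attention(fmap):
    """Return one weight per pixel from the mean and max across channels."""
    height, width = len(fmap[0]), len(fmap[0][0])
    mask = []
    for i in range(height):
        row = []
        for j in range(width):
            column = [channel[i][j] for channel in fmap]
            row.append(sigmoid(sum(column) / len(column) + max(column)))
        mask.append(row)
    return mask


def residual_attention(fmap):
    """Apply channel then spatial attention, then add the input back.

    fmap is a nested list indexed as [channel][row][col].
    """
    cw = channel_attention(fmap)
    # Channel step: scale every value in a channel by that channel's weight.
    scaled = [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, cw)]
    # Spatial step: computed on the channel-refined map (sequential order).
    mask = spatial_attention(scaled)
    # Residual step (assumed additive): original + attended features.
    return [
        [[x + s * m for x, s, m in zip(xrow, srow, mrow)]
         for xrow, srow, mrow in zip(xch, sch, mask)]
        for xch, sch in zip(fmap, scaled)
    ]
```

Because the attended features are added to the input rather than replacing it, the original signal is preserved even where the attention weights are small, which is the usual motivation for a residual connection around an attention block.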