FlexibleCP: A data augmentation strategy for traffic sign detection

Jingyi Shi, Huanle Rao, Qinyang Jing, Ziqiang Wen, Gangyong Jia
DOI: https://doi.org/10.1049/ipr2.13204
IF: 2.3
2024-09-20
IET Image Processing
Abstract: This study proposes a data augmentation method, the flexible cut and paste (FlexibleCP) strategy. It directs the model to focus on the targets by appropriately increasing their proportion in the training data, so that a more robust traffic sign detection model can be trained. FlexibleCP provides extensive customization for dataset augmentation, which improves the model's accuracy, robustness, and generalization ability in the presence of category and size imbalance. Experimental results demonstrate its effectiveness, making FlexibleCP an important addition to the field of traffic sign detection. In traffic sign detection, effective data augmentation can improve the model's detection capacity, enabling it to distinguish and locate traffic signs more precisely and thereby enhancing driving safety. However, because traffic signs are small and under-represented in the datasets, common data augmentation techniques are not well suited to traffic sign detection. To address this issue, a novel data augmentation strategy called flexible cut and paste (FlexibleCP) is proposed. The overall augmentation approach shifts from multi-image fusion to target cropping and pasting. Parameters that control the target pasting ratio and scaling ratio enrich the diversity of small-target data and their size variations. Additionally, target size and type filters enable targeted augmentation for different sizes and types of targets. The study evaluates the proposed strategy on two representative traffic sign detection datasets, CTSD and GTSDB. The experimental results show a significant improvement in both the detection and recognition performance of the model: on the CTSD dataset, models trained with FlexibleCP augmentation achieve 88.9% mAP0.5 and 64.5% mAP0.5:0.95, which are 3.5% and 2.5% better, respectively, than those trained with mosaic augmentation; on the GTSDB dataset, mAP0.5 and mAP0.5:0.95 reach 89.2% and 56.0%, improvements of 4.0% and 3.9% over mosaic.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology
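The cut-and-paste procedure summarized in the abstract (a pasting ratio, a scaling ratio, and size and type filters) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter names (`paste_ratio`, `scale_range`, `size_range`, `allowed_labels`), the interpretation of the pasting ratio as a fraction of the filtered donor crops, the nearest-neighbour resizing, and the rule that pasted targets must not overlap existing boxes are all assumptions made for illustration.

```python
import random
import numpy as np


def resize_nearest(patch, scale):
    """Nearest-neighbour resize of an H x W x C patch (stand-in for a real resize)."""
    h, w = patch.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return patch[rows][:, cols]


def overlaps(box, others):
    """True if box (x1, y1, x2, y2) intersects any box in others."""
    x1, y1, x2, y2 = box
    return any(x1 < ox2 and ox1 < x2 and y1 < oy2 and oy1 < y2
               for ox1, oy1, ox2, oy2 in others)


def flexible_cut_paste(image, boxes, labels, donors, paste_ratio=0.5,
                       scale_range=(0.8, 1.5), size_range=(0, 64),
                       allowed_labels=None, seed=None, max_tries=20):
    """Paste cropped targets into image; returns a new image, boxes, and labels.

    donors is a list of (patch, label) pairs cropped from other images.
    All parameter semantics here are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    out = image.copy()
    out_boxes, out_labels = list(boxes), list(labels)

    # Size filter (longest side of the crop, in pixels) and type filter (label).
    candidates = [
        (patch, label) for patch, label in donors
        if size_range[0] <= max(patch.shape[:2]) <= size_range[1]
        and (allowed_labels is None or label in allowed_labels)
    ]

    # Pasting ratio: fraction of the filtered candidates that get pasted.
    n_paste = min(len(candidates), int(paste_ratio * len(candidates)))
    for patch, label in rng.sample(candidates, n_paste):
        # Scaling ratio: random rescale to vary the target size.
        patch = resize_nearest(patch, rng.uniform(*scale_range))
        ph, pw = patch.shape[:2]
        img_h, img_w = out.shape[:2]
        if ph > img_h or pw > img_w:
            continue  # patch does not fit into the image
        for _ in range(max_tries):
            x = rng.randint(0, img_w - pw)
            y = rng.randint(0, img_h - ph)
            box = (x, y, x + pw, y + ph)
            if not overlaps(box, out_boxes):
                out[y:y + ph, x:x + pw] = patch
                out_boxes.append(box)
                out_labels.append(label)
                break
    return out, out_boxes, out_labels
```

In this sketch, restricting `size_range` or `allowed_labels` to under-represented sizes or classes is how the targeted augmentation described in the abstract would be expressed; a practical version would likely also blend the patch edges and use a proper interpolation routine.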