Real-Time Dynamic Scale-Aware Fusion Detection Network: Take Road Damage Detection as an example

Weichao Pan, Xu Wang, Wenqing Huan
2024-09-04
Abstract: Unmanned Aerial Vehicle (UAV)-based Road Damage Detection (RDD) is important for daily maintenance and safety in cities, especially because it significantly reduces labor costs. However, current UAV-based RDD research still faces many challenges. For example, damage with irregular size and direction, the masking of damage by the background, and the difficulty of distinguishing damage from the background all significantly affect the ability of UAVs to detect road damage in daily inspection. To solve these problems and improve the performance of UAVs in real-time road damage detection, we design and propose three corresponding modules: a feature extraction module that flexibly adapts to shape and background; a module that fuses multiscale perception and adapts to shape and background; and an efficient downsampling module. Based on these modules, we design a multi-scale, adaptive road damage detection model with the ability to automatically remove background interference, called the Dynamic Scale-Aware Fusion Detection Model (RT-DSAFDet). Experimental results on the UAV-PDD2023 public dataset show that RT-DSAFDet achieves a mAP50 of 54.2%, which is 11.1% higher than that of YOLOv10-m, an efficient variant of the latest real-time object detection model YOLOv10, while the number of parameters is reduced to 1.8M and FLOPs to 4.6G, decreases of 88% and 93%, respectively. Furthermore, results on the large general object detection public dataset MS COCO2017 also show the superiority of our model: its mAP50-95 is the same as that of YOLOv9-t, but with 0.5% higher mAP50, 10% fewer parameters, and 40% fewer FLOPs.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper aims to address the challenges faced by Unmanned Aerial Vehicles (UAVs) in Road Damage Detection (RDD), particularly in improving performance in real-time detection. The specific issues include:

1. **Irregular Damage Size and Orientation**: The size, shape, and orientation of road damage are irregular, making it difficult for existing UAV detection methods to accurately identify them.
2. **Background Interference**: Complex backgrounds can obscure or confuse road damage, affecting detection accuracy.
3. **Difficulty in Distinguishing Damage from Background**: In routine inspections, distinguishing damage areas from the background is a challenging task.

To solve these problems, the authors designed and proposed a new multi-scale adaptive road damage detection model, the Dynamic Scale-Aware Fusion Detection Model (RT-DSAFDet). This model has the following features:

- **Feature Extraction Module** (Flexible Attention, FA module): It can flexibly adapt to changes in damage shape and background, improving detection stability and accuracy in complex scenes.
- **Multi-Scale Perception and Adaptive Module** (Dynamic Scale-Aware Fusion, DSAF module): By fusing multi-scale features and adapting to damages of different shapes and backgrounds, it significantly enhances the model's multi-scale feature extraction and fusion capabilities.
- **Efficient Downsampling Module** (Spatial Downsampling, SD module): It greatly reduces the number of model parameters and computational complexity, improving computational efficiency and making the model more suitable for real-time detection needs.

Experimental results show that RT-DSAFDet achieved an mAP50 of 54.2% on the UAV-PDD2023 public dataset, which is 11.1% higher than YOLOv10-m, with the number of parameters reduced to 1.8M and FLOPs reduced to 4.6G, decreasing by 88% and 93% respectively. Additionally, it also demonstrated its superiority on the large-scale general object detection dataset MS COCO2017, with mAP50-95 comparable to YOLOv9-t, but mAP50 higher by 0.5%, with 10% fewer parameters and 40% fewer FLOPs.
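
The reported reductions can be sanity-checked with a little arithmetic. The sketch below is not from the paper: it only back-computes the baseline values implied by the quoted figures (1.8M parameters after an 88% reduction, 4.6G FLOPs after a 93% reduction). The function and variable names are hypothetical. The summary also does not say whether "11.1% higher" mAP50 means percentage points or a relative gain, so both readings are computed as assumptions.

```python
def implied_baseline(reduced_value: float, percent_reduction: float) -> float:
    """Baseline implied by a value that remains after a percentage reduction."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return reduced_value / (1 - percent_reduction / 100)


def baseline_if_percentage_points(score: float, gain_points: float) -> float:
    """Baseline score if the gain is an absolute difference in percentage points."""
    return score - gain_points


def baseline_if_relative(score: float, gain_percent: float) -> float:
    """Baseline score if the gain is relative to the baseline."""
    return score / (1 + gain_percent / 100)


# Figures quoted in the summary above.
params_baseline_m = implied_baseline(1.8, 88)   # in millions of parameters
flops_baseline_g = implied_baseline(4.6, 93)    # in GFLOPs

# Two possible readings of "11.1% higher" mAP50 (assumption: either may apply).
map50_baseline_points = baseline_if_percentage_points(54.2, 11.1)
map50_baseline_relative = baseline_if_relative(54.2, 11.1)

print(f"Implied baseline parameters: {params_baseline_m:.1f}M")
print(f"Implied baseline FLOPs: {flops_baseline_g:.1f}G")
print(f"Implied baseline mAP50 (percentage points): {map50_baseline_points:.1f}%")
print(f"Implied baseline mAP50 (relative gain): {map50_baseline_relative:.1f}%")
```

Because the quoted percentages are rounded, the implied baselines are only approximate; comparing them against the YOLOv10-m figures in the paper's own tables is the way to confirm which reading of the mAP50 gain is intended.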