YOLO-U: multi-task model for vehicle detection and road segmentation in UAV aerial imagery

He, Peng
DOI: https://doi.org/10.1007/s12145-024-01335-1
2024-06-05
Earth Science Informatics
Abstract: Because embedded chips in devices such as drones have limited performance, running vehicle detection and road segmentation networks simultaneously in real time is challenging, and the two tasks lack associative feature learning. To tackle these issues, we introduce a novel multi-task model for vehicle detection and road segmentation in unmanned aerial vehicle (UAV) aerial imagery. Our approach introduces a lightweight Ghost-Dilated convolution that combines the large receptive field of dilated convolution with the efficiency of Ghost convolution, resulting in fewer parameters and a reduced computational load. Building on this, we propose the Ghost-Atrous Spatial Pyramid Pooling (G-ASPP) module, a multi-scale feature extraction module that strengthens the model's multi-scale representation while limiting the growth in parameters and computation normally associated with Atrous Spatial Pyramid Pooling (ASPP) modules. The resulting multi-task network for UAV aerial vehicle detection and road segmentation consists of a carefully designed backbone, neck, detection head, and segmentation head. By refining existing lightweight backbone networks, our model improves real-time performance as well as detection and segmentation accuracy while using fewer parameters and less computation. Experiments on a self-constructed multi-task dataset show improved segmentation and detection performance, particularly for small targets and narrow roads, confirming the model's effectiveness. This research contributes valuable insights to the study of multi-task networks in UAV vision.
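The parameter saving that the abstract attributes to combining Ghost and dilated convolution can be illustrated with a small counting sketch. The abstract does not specify the layer layout, so the block below is an assumption modeled on the usual Ghost pattern: a pointwise primary convolution produces a fraction of the output channels, and a cheap dilated depthwise convolution generates the rest. The function names, the `ratio` parameter, and the channel sizes are hypothetical and chosen only for illustration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a bias-free 2D convolution.

    Dilation spreads the kernel taps apart but adds no weights,
    so it does not appear in this formula.
    """
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation."""
    return dilation * (k - 1) + 1


def ghost_dilated_params(c_in, c_out, k=3, ratio=2):
    """Weight count of a hypothetical Ghost-style dilated block.

    Assumed layout (not taken from the paper):
      1. a 1x1 primary conv produces c_out / ratio "intrinsic" channels;
      2. a k x k dilated depthwise conv derives the remaining channels.
    """
    assert c_out % ratio == 0
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, 1)
    cheap = conv_params(intrinsic, intrinsic * (ratio - 1), k, groups=intrinsic)
    return primary + cheap


standard = conv_params(256, 256, 3)        # plain dilated 3x3 conv
ghost = ghost_dilated_params(256, 256)     # assumed Ghost-style variant
print("standard dilated conv weights:", standard)
print("ghost-style dilated weights:  ", ghost)
print("extent of a 3x3 kernel at dilation 6:", effective_kernel(3, 6))
```

Under these assumptions the receptive field still grows with the dilation rate while the weight count is dominated by the pointwise convolution, which is the kind of trade-off an ASPP-style module with several dilated branches would benefit from.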
Geosciences, Multidisciplinary; Computer Science, Interdisciplinary Applications