SUSTechGAN: Image Generation for Object Detection in Adverse Conditions of Autonomous Driving

Gongjin Lan, Yang Peng, Qi Hao, Chengzhong Xu
2024-12-21
Abstract: Autonomous driving significantly benefits from data-driven deep neural networks. However, the data in autonomous driving typically follows a long-tailed distribution, in which the critical driving data in adverse conditions is hard to collect. Although generative adversarial networks (GANs) have been applied to augment data for autonomous driving, generating driving images in adverse conditions is still challenging. In this work, we propose a novel framework, SUSTechGAN, with customized dual attention modules, multi-scale generators, and a novel loss function to generate driving images for improving object detection of autonomous driving in adverse conditions. We test SUSTechGAN and well-known GANs on generating driving images in the adverse conditions of rain and night, and apply the generated images to retrain object detection networks. Specifically, we add generated images to the training datasets to retrain the well-known YOLOv5 and evaluate the improvement of the retrained YOLOv5 for object detection in adverse conditions. The experimental results show that the driving images generated by SUSTechGAN significantly improve the performance of the retrained YOLOv5 in rain and night conditions, outperforming the well-known GANs. The open-source code, video description, and datasets are available on the project page to facilitate image generation development in autonomous driving under adverse conditions.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper attempts to solve is how to generate high-quality driving images in adverse conditions (such as rain and night) to improve object detection in autonomous driving. Specifically, the paper addresses the following issues:

1. **Imbalanced data distribution**: Data in autonomous driving usually follows a long-tailed distribution, and crucial driving data, especially data in adverse conditions, is difficult to collect.
2. **Limitations of existing GAN methods**:
   - **Weak local semantic features**: In the generated images, key objects such as vehicles and traffic signs are often blurry or disappear entirely, so the images do not effectively improve object detection.
   - **Weak global semantic features**: Because the input image is cropped and resized to a small size (such as 360×360), it is difficult to capture the global features of a large image (such as 1980×1080).
   - **Lack of detection loss**: Existing GAN methods mainly consider adversarial loss and cycle-consistency loss, without a loss term specifically for object detection.

To address these problems, the authors propose a new framework named SUSTechGAN. By introducing a customized dual-attention module, a multi-scale generator, and a new loss function, it generates high-quality driving images, thereby improving object detection performance in adverse conditions.

### Main contributions of SUSTechGAN

1. **Dual-attention module**: A position-attention module (PAM) and a channel-attention module (CAM) are designed to improve the extraction of local semantic features in the generated images, especially for key objects such as vehicles in rain and night conditions.
2. **Multi-scale generator**: Features are considered at different scales (a large-size generator for global features and a small-size generator for local features) so that the generated images keep clear global and local semantic features.
3. **New loss function**: A new loss function that includes an additional detection loss is proposed to guide image generation and improve object detection performance in adverse conditions.

Through these improvements, SUSTechGAN better preserves the details of key objects when generating images and significantly improves the performance of object detection networks such as YOLOv5 in rain and night conditions.
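
The summary names the terms of the new loss but does not give its form. As a minimal sketch, assuming the detection term is simply added to the adversarial and cycle-consistency terms as a weighted sum, the generator objective could be combined as shown below. The names `LossWeights` and `generator_loss` and the default weight values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights for the loss terms; the source states no values."""
    cycle: float = 10.0      # assumed weight on the cycle-consistency term
    detection: float = 1.0   # assumed weight on the detection term


def generator_loss(adversarial: float,
                   cycle: float,
                   detection: float,
                   weights: LossWeights = LossWeights()) -> float:
    """Combine the three loss terms named in the summary as a weighted sum.

    The weighted-sum form is an assumption made for illustration; each
    argument is the already-computed scalar value of one loss term.
    """
    return adversarial + weights.cycle * cycle + weights.detection * detection


if __name__ == "__main__":
    # Example scalar values for one training step (made up for illustration).
    with_detection = generator_loss(0.5, 0.25, 0.125)
    without_detection = generator_loss(
        0.5, 0.25, 0.125, LossWeights(cycle=10.0, detection=0.0)
    )
    print(with_detection, without_detection)
```

Setting `detection=0.0` in this sketch leaves only the adversarial and cycle-consistency terms, which is how the summary characterizes existing GAN methods.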