Abstract: In this work, we propose a novel approach, namely WeatherDG, that can generate realistic, weather-diverse driving-scene images based on the cooperation of two foundation models, i.e., Stable Diffusion (SD) and a Large Language Model (LLM). Specifically, we first fine-tune SD with source data, aligning the content and layout of generated samples with real-world driving scenarios. Then, we propose a procedural prompt generation method based on the LLM, which enriches scenario descriptions and helps SD automatically generate more diverse, detailed images. In addition, we introduce a balanced generation strategy, which encourages SD to generate high-quality objects of tail classes, such as riders and motorcycles, under various weather conditions. This segmentation-model-agnostic method can improve the generalization ability of existing models by additionally adapting them with the generated synthetic data. Experiments on three challenging datasets show that our method can significantly improve the segmentation performance of different state-of-the-art models on target domains. Notably, in the "Cityscapes to ACDC" setting, our method improves the baseline HRDA by 13.9% in mIoU.
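The mIoU figure quoted in the abstract is the mean, over classes, of the intersection-over-union between predicted and ground-truth pixels. The sketch below shows one common way to compute it from a confusion matrix. The function name `mean_iou` and the nested-list input format are assumptions made for illustration and are not taken from the paper.

```python
def mean_iou(conf):
    """Mean IoU from a square confusion matrix.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j (an assumed layout for this sketch).
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]                             # correctly labeled pixels of class c
        gt = sum(conf[c])                           # all pixels whose true class is c
        pred = sum(conf[r][c] for r in range(n))    # all pixels predicted as class c
        union = gt + pred - tp
        if union > 0:                               # skip classes absent from both
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example with made-up counts.
toy = [[8, 2],
       [1, 9]]
print(round(mean_iou(toy), 4))
```

Classes that appear in neither the prediction nor the ground truth are skipped here; evaluation toolkits differ on that detail, so this is one convention among several.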


### What problem does this paper attempt to solve?

This paper addresses the **Domain Generalization (DG) problem**, particularly for semantic segmentation **under adverse weather conditions**. Specifically, the authors propose a new method named **WeatherDG** that generates realistic, diverse images consistent with driving scenarios, in order to improve a model's generalization to unseen domains.

#### Background and problem description

1. **Domain Shift Problem**:
   - In autonomous driving, the performance of existing semantic segmentation models declines significantly when they are deployed in unseen domains, because of domain shift. The problem is especially severe under adverse conditions such as fog, rain, snow, and night.
   - Collecting more diverse training data is one solution, but annotating segmentation data is very time-consuming, so domain generalization has become a popular way to address domain shift.
2. **Limitations of existing methods**:
   - Existing domain generalization methods fall mainly into two categories: normalization and data augmentation. Data augmentation is the more flexible of the two: it can be combined with different model architectures and is easy to integrate with other techniques.
   - Some generative models, such as Stable Diffusion (SD), can generate realistic and diverse images. Applied directly, however, they may produce images whose style and layout are inconsistent with driving scenarios, which degrades model performance.

#### Solution

To solve the above problems, the authors propose the WeatherDG method. Its core idea is to generate realistic, diverse images consistent with driving scenarios through the following steps:

1. **SD Fine-tuning**:
   - Fine-tune Stable Diffusion on the source data so that the content and layout of the generated images align with real-world driving scenarios.
2. **Procedural Prompt Generation**:
   - Use a large language model (LLM) to generate prompts procedurally, enriching the scene descriptions and helping Stable Diffusion automatically generate more diverse and detailed images.
   - Introduce a balanced generation strategy that encourages the generation of high-quality objects from tail classes, such as riders and motorcycles.
3. **Sample Generation and Model Training**:
   - Use the fine-tuned Stable Diffusion and the generated prompts to produce new, diverse samples, and train the model on these samples together with the source data.
   - Use unsupervised domain adaptation (UDA) techniques to further improve the model's performance on the target domain.

Through these steps, WeatherDG can significantly improve the generalization ability of semantic segmentation models under various adverse weather conditions. In the Cityscapes to ACDC setting, for example, WeatherDG improves the mIoU of the HRDA baseline by 13.9%.

### Summary

This paper proposes WeatherDG, a novel data augmentation framework that combines Stable Diffusion with a large language model to address the domain generalization problem. It performs especially well on semantic segmentation under adverse weather conditions.
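The procedural prompt generation and balanced generation steps described above can be illustrated with a small sketch. It is not the authors' implementation: the class frequencies, the weather phrases, the inverse-frequency weighting, and the prompt template are all hypothetical, and a fixed string template stands in for the LLM that would enrich each description.

```python
import random

# Hypothetical weather phrases; the summary names fog, rain, snow, and night.
WEATHERS = ["in fog", "in rain", "in snow", "at night"]

# Hypothetical class frequencies in the source data (illustrative, not measured).
CLASS_FREQ = {"car": 0.70, "person": 0.20, "rider": 0.06, "motorcycle": 0.04}


def class_weights(freq):
    """Inverse-frequency weights, normalized to sum to 1.

    Rare (tail) classes receive larger weights, so they are drawn more often.
    This weighting is an assumption; the source does not specify one.
    """
    inv = {name: 1.0 / f for name, f in freq.items()}
    total = sum(inv.values())
    return {name: w / total for name, w in inv.items()}


def build_prompt(rng, weights):
    """Sample a class (favoring tail classes) and a weather, then fill a template.

    In the described method an LLM would expand this into a richer scene
    description; the plain template here is only a placeholder for that step.
    """
    names = list(weights)
    cls = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    weather = rng.choice(WEATHERS)
    prompt = f"a photo of a driving scene {weather}, with a {cls} on the road"
    return cls, weather, prompt


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    weights = class_weights(CLASS_FREQ)
    for _ in range(3):
        print(build_prompt(rng, weights)[2])
```

Each resulting prompt would then be passed to the fine-tuned Stable Diffusion model to produce one synthetic training image. Sampling classes by inverse frequency is only one way to rebalance generation toward tail classes.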

WeatherDG: LLM-assisted Diffusion Model for Procedural Weather Generation in Domain-Generalized Semantic Segmentation
