Abstract: Wildfires are a significant threat to ecosystems and human infrastructure, leading to widespread destruction and environmental degradation. Recent advancements in deep learning and generative models have enabled new methods for wildfire detection and monitoring. However, the scarcity of annotated wildfire images limits the development of robust models for these tasks. In this work, we present the FLAME Diffuser, a training-free, diffusion-based framework designed to generate realistic wildfire images with paired ground truth. Our framework uses augmented masks, sampled from real wildfire data, and applies Perlin noise to guide the generation of realistic flames. By controlling the placement of these elements within the image, we ensure precise integration while maintaining the original image's style. We evaluate the generated images using normalized Fréchet Inception Distance, CLIP Score, and a custom CLIP Confidence metric, demonstrating the high quality and realism of the synthesized wildfire images. Specifically, the fusion of Perlin noise in this work significantly improved the quality of synthesized images. The proposed method is particularly valuable for enhancing datasets used in downstream tasks such as wildfire detection and monitoring.

What problem does this paper attempt to address?

The paper addresses the scarcity of labeled wildfire images, which limits the training and performance of deep-learning models for wildfire detection and monitoring. To address this, the authors propose FLAME Diffuser, a diffusion-based framework that generates realistic wildfire images together with their corresponding ground truth. By using augmented masks and Perlin noise to guide flame generation, the method controls the position and appearance of flames while preserving the style of the original image. This improves the quality and diversity of the synthetic images and provides a data-augmentation tool for downstream tasks such as wildfire detection and monitoring.

### Main contributions of the paper:

1. **FLAME Diffuser**: A training-free diffusion framework for generating wildfire images with ground truth. The framework uses style images and augmented masks to give precise control over the placement and integration of wildfire elements (such as flames).
2. **Mask augmentation method**: A method for generating masks of varying complexity, geometric shape, and texture, which increases the diversity and realism of the synthetic wildfire elements.
3. **Automated annotation method**: CLIP is used for automated annotation, and the paper shows how CLIP confidence can be applied to evaluate the success rate and accuracy of image synthesis.

### Key technologies of the solution:

- **Mask generation**: Random masks are generated algorithmically and then made more dynamic and diverse through image-processing techniques such as color transformation, noise addition, and domain distortion.
- **Mask-image diffusion**: The augmented mask is merged with the real image to form a new composite input image. The composite image is encoded into a latent representation by a pre-trained variational autoencoder (VAE) and then denoised by a pre-trained diffusion model (such as SDv1.5). During denoising, the mask and the text prompt jointly guide flame generation, so that the flames appear in the specified area and blend with the surrounding environment.
- **Evaluation metrics**: The normalized Fréchet Inception Distance (nFID), CLIP Score, and a custom CLIP Confidence metric are used to evaluate the quality and effectiveness of the generated images.

### Experimental results:

- **Qualitative evaluation**: Compared with the baseline method, mask-guided image diffusion controls the position of flames more precisely, and the generated flames integrate seamlessly with the background, appearing more natural and realistic.
- **Quantitative evaluation**: Across nFID, CLIP Score, and CLIP Confidence, the Perlin mask method performs best in terms of image quality, diversity, and context preservation, producing the highest-quality and most realistic flames.

In conclusion, FLAME Diffuser addresses the scarcity of wildfire image data by introducing mask-guided diffusion, providing a high-quality data-augmentation tool for wildfire detection and monitoring tasks.
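The mask generation and mask-image merging steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not say exactly how Perlin noise is applied to the masks, so the multiplicative modulation in `perlin_mask`, the flat orange fill in `composite`, and all function and parameter names here are assumptions made for illustration.

```python
import numpy as np

def perlin_noise(height, width, cells, seed=0):
    """2D Perlin noise on a lattice of `cells` x `cells` cells (values stay within [-1, 1])."""
    rng = np.random.default_rng(seed)
    # One random unit gradient per lattice corner.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(cells + 1, cells + 1))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)

    # Pixel coordinates in lattice units, kept strictly below `cells`.
    ys = np.linspace(0.0, cells, height, endpoint=False)
    xs = np.linspace(0.0, cells, width, endpoint=False)
    y, x = np.meshgrid(ys, xs, indexing="ij")
    y0, x0 = y.astype(int), x.astype(int)
    fy, fx = y - y0, x - x0

    def corner(dy, dx):
        # Dot product of the corner gradient with the offset from that corner.
        g = grads[y0 + dy, x0 + dx]
        return g[..., 0] * (fx - dx) + g[..., 1] * (fy - dy)

    def fade(t):
        # Perlin's quintic smoothing curve.
        return t * t * t * (t * (t * 6 - 15) + 10)

    u, v = fade(fx), fade(fy)
    top = corner(0, 0) * (1 - u) + corner(0, 1) * u
    bottom = corner(1, 0) * (1 - u) + corner(1, 1) * u
    return top * (1 - v) + bottom * v

def perlin_mask(shape_mask, cells=4, seed=0):
    """Assumed augmentation: modulate a binary shape mask with noise rescaled to [0, 1]."""
    h, w = shape_mask.shape
    noise = perlin_noise(h, w, cells, seed)
    noise01 = (noise - noise.min()) / (noise.max() - noise.min() + 1e-8)
    return shape_mask.astype(float) * noise01

def composite(image, mask, color=(255, 140, 0)):
    """Alpha-blend a flat color (an assumed placeholder) into a uint8 image using the soft mask."""
    alpha = mask[..., None]
    out = (1 - alpha) * image.astype(float) + alpha * np.array(color, dtype=float)
    return out.clip(0, 255).astype(np.uint8)

# Usage: a rectangular shape mask pasted onto a uniform gray background.
shape = np.zeros((64, 64), dtype=bool)
shape[20:44, 24:40] = True
mask = perlin_mask(shape, cells=4, seed=1)
background = np.full((64, 64, 3), 90, dtype=np.uint8)
composite_input = composite(background, mask)
```

In the pipeline the summary describes, an image like `composite_input` would then be encoded by the pre-trained VAE and denoised by the diffusion model under a text prompt. That stage is omitted here because it depends on model weights and library choices not specified in this summary.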
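The summary names a custom CLIP Confidence metric but does not define it. One plausible reading, sketched below, is the standard CLIP zero-shot probability: a softmax over scaled cosine similarities between an image embedding and a set of candidate text-prompt embeddings. The function name, the toy embeddings, the prompt labels, the `logit_scale` value, and the 0.5 acceptance threshold are all assumptions. A real pipeline would obtain the embeddings from a CLIP model rather than hard-code them.

```python
import numpy as np

def clip_confidence(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and several prompts."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = logit_scale * (text_embs @ image_emb)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy 3-D embeddings standing in for real CLIP outputs (hypothetical values).
image_emb = np.array([0.9, 0.1, 0.0])
text_embs = np.array([
    [1.0, 0.0, 0.0],  # hypothetical prompt: "a photo of wildfire"
    [0.0, 1.0, 0.0],  # hypothetical prompt: "a photo without fire"
])
probs = clip_confidence(image_emb, text_embs)
accepted = probs[0] > 0.5  # assumed rule: keep the sample if the fire prompt wins
```

Used this way, the confidence can serve both as an automatic label for a generated image and as a filter that discards samples where flame synthesis failed.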
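For reference, the standard (unnormalized) Fréchet Inception Distance compares Gaussian fits to Inception features of real and generated images. With feature means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$ for the real and generated sets respectively:

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

Lower values indicate that the generated feature distribution is closer to the real one. How the paper normalizes this quantity to obtain nFID is not stated in this summary, so only the standard definition is given here.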

FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion
