FMARS: Annotating Remote Sensing Images for Disaster Management using Foundation Models

Edoardo Arnaudo, Jacopo Lungo Vaschetti, Lorenzo Innocenti, Luca Barco, Davide Lisi, Vanina Fissore, Claudio Rossi
2024-06-20
Abstract: Very-High Resolution (VHR) remote sensing imagery is increasingly accessible, but often lacks annotations for effective machine learning applications. Recent foundation models like GroundingDINO and Segment Anything (SAM) provide opportunities to automatically generate annotations. This study introduces FMARS (Foundation Model Annotations in Remote Sensing), a methodology leveraging VHR imagery and foundation models for fast and robust annotation. We focus on disaster management and provide a large-scale dataset with labels obtained from pre-event imagery over 19 disaster events, derived from the Maxar Open Data initiative. We train segmentation models on the generated labels, using Unsupervised Domain Adaptation (UDA) techniques to increase transferability to real-world scenarios. Our results demonstrate the effectiveness of leveraging foundation models to automatically annotate remote sensing data at scale, enabling robust downstream models for critical applications. Code and dataset are available at https://github.com/links-ads/igarss-fmars.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **Very-High Resolution (VHR) remote sensing images lack high-quality annotations, which limits their effective use in supervised machine-learning applications**. Although VHR remote sensing images are increasingly accessible, they usually lack sufficient annotations, especially in critical application scenarios such as disaster management.

To solve this problem, the authors propose **FMARS (Foundation Model Annotations in Remote Sensing)**, a method for automatically generating large-scale annotations using foundation models. By combining VHR images with foundation models (such as GroundingDINO and Segment Anything), FMARS can quickly and robustly generate annotation data, thus supporting model training for downstream tasks (such as semantic segmentation and instance segmentation).

### Main problems and solutions

1. **Problems**:
   - Although VHR remote sensing images are easily accessible, they lack high-quality annotations.
   - Existing annotation methods are inefficient, costly, and difficult to apply at large scale.
2. **Solutions**:
   - **Introducing foundation models**: Use foundation models such as GroundingDINO and Segment Anything, which provide strong image classification, object detection, and segmentation capabilities without additional fine-tuning.
   - **Automating the annotation process**: Design an automated workflow that generates annotations from pre-event images for three types of objects: buildings, roads, and high vegetation.
   - **Constructing a large-scale dataset**: Based on images of 19 disaster events provided by the Maxar Open Data program, construct a large-scale dataset containing more than 25 million annotations.
   - **Applying UDA techniques**: To make models trained on the generated annotations more robust and more transferable to real-world scenarios, adopt unsupervised domain adaptation (UDA) techniques such as DAFormer and Masked Image Consistency (MIC).

### Experimental verification

To verify the effectiveness of FMARS, the authors trained state-of-the-art segmentation models on the generated annotations and evaluated them against manually annotated data. The experimental results show that, even with a fully automated annotation process, applying UDA techniques significantly improves model performance, especially on the challenging high-vegetation category.

### Conclusion

FMARS provides an effective method for automatically generating high-quality annotations using foundation models, thereby addressing the shortage of annotations for VHR remote sensing images. The method improves annotation efficiency and also enables the training of accurate downstream segmentation models without a large amount of manual annotation.
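As a rough illustration of the automated annotation and evaluation steps summarized above, the sketch below merges per-object masks into a single semantic label map and scores it with per-class intersection over union (IoU). It is a minimal sketch under stated assumptions, not the FMARS implementation: the `(class_name, mask, score)` detection format, the class ids in `CLASS_IDS`, the `min_score` threshold, and the rule that higher-scoring masks win on overlap are all hypothetical choices made for illustration. The masks stand in for what a detector followed by a promptable segmenter might produce.

```python
import numpy as np

# Hypothetical class ids; the summary names the three categories
# but does not say how they are encoded.
CLASS_IDS = {"background": 0, "road": 1, "high_vegetation": 2, "building": 3}


def merge_masks(detections, shape, min_score=0.3):
    """Paint per-object boolean masks into one semantic label map.

    detections: iterable of (class_name, mask, score), where mask is a
    boolean array of the given shape. Detections below min_score are
    dropped. Lower-scoring masks are painted first, so higher-scoring
    ones win wherever masks overlap (an assumed tie-breaking rule).
    """
    label_map = np.zeros(shape, dtype=np.uint8)
    kept = [d for d in detections if d[2] >= min_score]
    for class_name, mask, _score in sorted(kept, key=lambda d: d[2]):
        label_map[mask] = CLASS_IDS[class_name]
    return label_map


def per_class_iou(pred, target, num_classes):
    """IoU per class id; NaN where a class is absent from both maps."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union > 0:
            ious[c] = np.logical_and(p, t).sum() / union
    return ious
```

In use, `merge_masks` would be applied per image tile to the automatically generated masks, and `per_class_iou` would compare a trained model's predictions against a manually annotated reference map, averaging the non-NaN entries to obtain a mean IoU.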