Foundation Models for Generalist Geospatial Artificial Intelligence

Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy, Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira, Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung, Sam Khallaghi, Hanxi, Michael Cecil, Maryam Ahmadi, Fatemeh Kordi, Hamed Alemohammad, Manil Maskey, Raghu Ganti, Kommy Weldemariam, Rahul Ramachandran

2023-11-09

Abstract: Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a significant impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision, and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive geospatial data. We have utilized this framework to create Prithvi, a transformer-based geospatial foundational model pre-trained on more than 1TB of multispectral satellite imagery from the Harmonized Landsat-Sentinel 2 (HLS) dataset. Our study demonstrates the efficacy of our framework in successfully fine-tuning Prithvi to a range of Earth observation tasks that have not been tackled by previous work on foundation models involving multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation. Our experiments show that the pre-trained model accelerates the fine-tuning process compared to leveraging randomly initialized weights. In addition, pre-trained Prithvi compares well against the state-of-the-art, e.g., outperforming a conditional GAN model in multi-temporal cloud imputation by up to 5pp (or 5.7%) in the structural similarity index. Finally, due to the limited availability of labeled data in the field of Earth observation, we gradually reduce the quantity of available labeled data for refining the model to evaluate data efficiency and demonstrate that data can be decreased significantly without affecting the model's accuracy. The pre-trained 100 million parameter model and corresponding fine-tuning workflows have been released publicly as open source contributions to the global Earth sciences community through Hugging Face.
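
The abstract states the cloud imputation gain both in percentage points (5pp) and as a relative change (5.7%). The sketch below shows how the two figures relate. It assumes the structural similarity index is expressed on a 0 to 1 scale; the baseline and improved scores it computes are inferred from the two quoted figures and are not values reported in the abstract. The function names are illustrative.

```python
def relative_gain_pct(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


def implied_baseline(abs_gain_pp, rel_gain_pct):
    """Baseline score (0-1 scale) consistent with both quoted gains.

    From abs_gain_pp / 100 = baseline * rel_gain_pct / 100 it follows
    that baseline = abs_gain_pp / rel_gain_pct.
    """
    return abs_gain_pp / rel_gain_pct


# Inferred values only: an assumption derived from "5pp (or 5.7%)".
baseline = implied_baseline(5.0, 5.7)
improved = baseline + 5.0 / 100.0

print(f"implied baseline SSIM: {baseline:.3f}")
print(f"implied improved SSIM: {improved:.3f}")
print(f"relative gain: {relative_gain_pct(baseline, improved):.1f}%")
```

Under these assumptions, a 5pp gain corresponds to a 5.7% relative gain only when the baseline score is close to 0.88. This is why the two numbers differ only slightly.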

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

This paper addresses the challenges of developing and deploying generalist artificial intelligence (AI) models for Earth science and remote sensing. Specifically, it focuses on the following points:

1. **Data availability and processing cost**: As the volume of remote sensing data grows, exploring and processing unlabeled data has become a major research obstacle. Traditional supervised learning requires large amounts of labeled data, which are expensive and time-consuming to produce in Earth science.
2. **Limitations of task-specific models**: Most current AI models for Earth science and remote sensing are designed for a single task. They generalize poorly across space and time, and a new model must be built for each new task.
3. **Application of self-supervised learning**: The paper explores how to pre-train models on large unlabeled datasets through self-supervision and then fine-tune them with small labeled datasets to support a variety of downstream tasks.
4. **Utilization of multi-sensor data**: The paper proposes a framework for efficiently pre-training and fine-tuning foundation models on multi-sensor data (such as multispectral satellite imagery) to solve Earth observation tasks, including multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation.

By addressing these problems, the paper aims to accelerate the development and deployment of climate and sustainability applications and to advance AI in Earth science.
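
The text above does not say which self-supervised objective is used for pre-training. One common choice for transformer-based image models is masked reconstruction: hide a random subset of image patches and score the model only on the hidden ones. The sketch below illustrates that idea in plain Python under stated assumptions. The function names, the 0.75 mask ratio, and the toy patch values are all hypothetical and are not taken from the paper.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=0):
    """Split patch indices into (masked, visible) at random."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(num_patches * mask_ratio)
    return sorted(order[:num_masked]), sorted(order[num_masked:])


def masked_mse(target, prediction, masked):
    """Mean squared error computed over the masked patches only."""
    total, count = 0.0, 0
    for i in masked:
        for t, p in zip(target[i], prediction[i]):
            total += (t - p) ** 2
            count += 1
    return total / count if count else 0.0


# Toy example: 16 patches with 2 values each (assumed sizes).
target = [[1.0, 2.0] for _ in range(16)]
prediction = [[0.0, 0.0] for _ in range(16)]  # stand-in for model output
masked, visible = random_patch_mask(16, mask_ratio=0.75)
loss = masked_mse(target, prediction, masked)
print(len(masked), len(visible), loss)
```

Because the reconstruction target is the input itself, no labels are needed at this stage. Labeled data enters only during fine-tuning.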
