Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey

Shengchao Chen, Guodong Long, Jing Jiang, Dikai Liu, Chengqi Zhang
2023-12-05
Abstract: As artificial intelligence (AI) continues to rapidly evolve, the realm of Earth and atmospheric sciences is increasingly adopting data-driven models, powered by progressive developments in deep learning (DL). Specifically, DL techniques are extensively utilized to decode the chaotic and nonlinear aspects of Earth systems, and to address climate challenges via understanding weather and climate data. Cutting-edge performance on specific tasks within narrower spatio-temporal scales has been achieved recently through DL. The rise of large models, specifically large language models (LLMs), has enabled fine-tuning processes that yield remarkable outcomes across various downstream tasks, thereby propelling the advancement of general AI. However, we are still navigating the initial stages of crafting general AI for weather and climate. In this survey, we offer an exhaustive, timely overview of state-of-the-art AI methodologies specifically engineered for weather and climate data, with a special focus on time series and text data. Our primary coverage encompasses four critical aspects: types of weather and climate data, principal model architectures, model scopes and applications, and datasets for weather and climate. Furthermore, in relation to the creation and application of foundation models for weather and climate data understanding, we delve into the field's prevailing challenges, offer crucial insights, and propose detailed avenues for future research. This comprehensive approach equips practitioners with the requisite knowledge to make substantial progress in this domain. Our survey encapsulates the most recent breakthroughs in research on large, data-driven models for weather and climate data understanding, emphasizing robust foundations, current advancements, practical applications, crucial resources, and prospective research opportunities.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition, Atmospheric and Oceanic Physics
What problem does this paper attempt to address?
The core problem this paper addresses is how to use deep learning and large-scale models (Foundation Models) to understand and process meteorological and climate data. Specifically, the paper explores the following issues:

1. **Improving the accuracy of understanding meteorological and climate data**: Traditional numerical weather prediction (NWP) models and general circulation models (GCMs) are limited in how well they capture local geographical features and fuse multi-source observational data, and they demand substantial computational resources. The paper explores how deep learning techniques, especially large-scale pre-trained models, can address these deficiencies.
2. **Developing general-purpose Foundation Models**: Existing deep-learning models are often designed for a specific task and trained on a single data format, which leaves their capabilities fragmented. The paper discusses how to build a general-purpose Foundation Model that adapts flexibly to multiple downstream tasks and so simulates the global meteorological and climate system more comprehensively.
3. **Addressing challenges in the application of large-scale models**: Although large models have made significant progress in fields such as natural language processing and computer vision, in the meteorological and climate fields they still face challenges such as low-quality datasets and high computational resource requirements. The paper analyzes these challenges and proposes future research directions and development opportunities.
4. **Enhancing the interpretability and reliability of models**: In meteorological and climate prediction, incorrect predictions can seriously affect ecosystems and society, so model interpretability is particularly important. The paper emphasizes this issue and explores how to enhance the transparency and credibility of models.
In summary, by systematically reviewing and analyzing existing deep-learning models and their applications, this paper aims to provide theoretical support and technical guidance for developing more efficient, accurate, and general-purpose meteorological and climate Foundation Models, so as to better cope with the challenges brought by climate change.

### Formula Examples

In meteorological and climate data processing, common formulas include:

- **Atmospheric Dynamics Equation**:
  \[ \frac{\partial \mathbf{u}}{\partial t}+(\mathbf{u}\cdot\nabla)\mathbf{u} = -\frac{1}{\rho}\nabla p + \mathbf{g}+\nu\nabla^{2}\mathbf{u} \]
  where \(\mathbf{u}\) is the wind velocity, \(p\) is the air pressure, \(\rho\) is the density, \(\mathbf{g}\) is the gravitational acceleration, and \(\nu\) is the viscosity coefficient.
- **Thermodynamic Energy Conservation Equation**:
  \[ \frac{\partial T}{\partial t}+\mathbf{u}\cdot\nabla T = Q+\kappa\nabla^{2}T \]
  where \(T\) is the temperature, \(Q\) is the heat source term, and \(\kappa\) is the heat conduction coefficient.

These formulas are typically used as physical constraints, or incorporated into model design as prior knowledge, in deep-learning models.
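
As a rough illustration of how the thermodynamic equation above might serve as a physical constraint, the sketch below computes its finite-difference residual on a gridded temperature field and averages the squared residual into a penalty that could be added to a training loss. This is not taken from the surveyed paper: the one-dimensional setting, the constant wind `u`, the uniform grid, and the function names `thermo_residual` and `physics_penalty` are all assumptions made for the example.

```python
import numpy as np


def thermo_residual(T, u, dx, dt, kappa, Q=0.0):
    """Residual of dT/dt + u*dT/dx - Q - kappa*d2T/dx2 (1-D, assumed setup).

    T has shape (nt, nx): time along axis 0, space along axis 1.
    Central differences are used, so only interior points are returned,
    giving an array of shape (nt - 2, nx - 2).
    """
    dT_dt = (T[2:, 1:-1] - T[:-2, 1:-1]) / (2.0 * dt)
    dT_dx = (T[1:-1, 2:] - T[1:-1, :-2]) / (2.0 * dx)
    d2T_dx2 = (T[1:-1, 2:] - 2.0 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dx**2
    return dT_dt + u * dT_dx - Q - kappa * d2T_dx2


def physics_penalty(T, u, dx, dt, kappa, Q=0.0):
    """Mean squared residual; small when T obeys the equation."""
    r = thermo_residual(T, u, dx, dt, kappa, Q)
    return float(np.mean(r**2))


# Hypothetical test field: T = exp(-kappa*t) * sin(x - u*t) solves the
# equation exactly when Q = 0, so its penalty should be close to zero.
u_true, kappa = 1.0, 0.1
x = np.linspace(0.0, 2.0 * np.pi, 101)
t = np.linspace(0.0, 1.0, 101)
dx, dt = x[1] - x[0], t[1] - t[0]
tt, xx = np.meshgrid(t, x, indexing="ij")  # both of shape (nt, nx)
T = np.exp(-kappa * tt) * np.sin(xx - u_true * tt)

consistent = physics_penalty(T, u_true, dx, dt, kappa)
inconsistent = physics_penalty(T, -u_true, dx, dt, kappa)  # wrong wind sign
print(consistent, inconsistent)
```

In a physics-informed training setup, a term like `physics_penalty` would typically be weighted and added to the data-fitting loss, with the derivatives taken by automatic differentiation rather than finite differences. The finite-difference version here is only meant to make the idea concrete.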