ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling

Tung Nguyen, Jason Jewik, Hritik Bansal, Prakhar Sharma, Aditya Grover
2023-07-05
Abstract: Modeling weather and climate is an essential endeavor to understand the near- and long-term impacts of climate change, as well as inform technology and policymaking for adaptation and mitigation efforts. In recent years, there has been a surging interest in applying data-driven methods based on machine learning for solving core problems such as weather forecasting and climate downscaling. Despite promising results, much of this progress has been impaired due to the lack of large-scale, open-source efforts for reproducibility, resulting in the use of inconsistent or underspecified datasets, training setups, and evaluations by both domain scientists and artificial intelligence researchers. We introduce ClimateLearn, an open-source PyTorch library that vastly simplifies the training and evaluation of machine learning models for data-driven climate science. ClimateLearn consists of holistic pipelines for dataset processing (e.g., ERA5, CMIP6, PRISM), implementation of state-of-the-art deep learning models (e.g., Transformers, ResNets), and quantitative and qualitative evaluation for standard weather and climate modeling tasks. We supplement these functionalities with extensive documentation, contribution guides, and quickstart tutorials to expand access and promote community growth. We have also performed comprehensive forecasting and downscaling experiments to showcase the capabilities and key features of our library. To our knowledge, ClimateLearn is the first large-scale, open-source effort for bridging research in weather and climate modeling with modern machine learning systems. Our library is available publicly at https://github.com/aditya-grover/climate-learn.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper targets several obstacles to applying machine learning in weather and climate research, in particular for weather forecasting, climate downscaling, and climate prediction. Specifically:

1. **Simplify Training and Evaluation**: The paper introduces ClimateLearn, an open-source PyTorch library that simplifies training and evaluating machine learning models on climate science data.
2. **Provide Benchmarking**: The library offers a benchmarking framework for weather forecasting, climate downscaling, and climate prediction. It covers data preprocessing tools, popular deep learning models, and traditional baseline methods, and it supports both quantitative metrics and visualization of results.
3. **Improve Reproducibility**: The paper responds to the lack of large-scale, open-source efforts in climate modeling, a gap that has led to inconsistent or underspecified datasets, training setups, and evaluation methods.
4. **Promote Community Collaboration**: Detailed documentation, contribution guidelines, and quickstart tutorials are meant to expand access for climate science researchers and foster community growth.

In summary, the paper aims to bridge weather and climate modeling research with modern machine learning systems by providing a comprehensive, user-friendly library.