Statistical Downscaling via High-Dimensional Distribution Matching with Generative Models

Zhong Yi Wan, Ignacio Lopez-Gomez, Robert Carver, Tapio Schneider, John Anderson, Fei Sha, Leonardo Zepeda-Núñez
2024-12-11
Abstract: Statistical downscaling is a technique used in climate modeling to increase the resolution of climate simulations. High-resolution climate information is essential for various high-impact applications, including natural hazard risk assessment. However, simulating climate at high resolution is intractable. Thus, climate simulations are often conducted at a coarse scale and then downscaled to the desired resolution. Existing downscaling techniques are either simulation-based methods with high computational costs, or statistical approaches with limitations in accuracy or application specificity. We introduce Generative Bias Correction and Super-Resolution (GenBCSR), a two-stage probabilistic framework for statistical downscaling that overcomes the limitations of previous methods. GenBCSR employs two transformations to match high-dimensional distributions at different resolutions: (i) the first stage, bias correction, aligns the distributions at coarse scale; (ii) the second stage, statistical super-resolution, lifts the corrected coarse distribution by introducing fine-grained details. Each stage is instantiated by a state-of-the-art generative model, resulting in an efficient and effective computational pipeline for the well-studied distribution matching problem. By framing the downscaling problem as distribution matching, GenBCSR relaxes the constraints of supervised learning, which requires samples to be aligned. Despite not requiring such correspondence, we show that GenBCSR surpasses standard approaches in predictive accuracy of critical impact variables, particularly in predicting the tails (99th percentile) of composite indexes composed of interacting variables, achieving up to a 4-5 fold reduction in error.
Machine Learning, Numerical Analysis, Atmospheric and Oceanic Physics
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the **statistical downscaling problem** in climate simulation: converting low-resolution global climate model (GCM) output into high-resolution regional climate information. High-resolution climate data matters because it underpins the assessment of natural hazard risk and extreme weather events.

#### Main challenges

1. **High computational cost**: Simulating climate directly at high resolution is computationally intractable, so simulations are usually run at a coarse scale and then refined by downscaling.
2. **Limitations of existing methods**:
   - **Simulation-based methods**: Although accurate, they are computationally expensive, which limits data availability and geographical coverage.
   - **Statistical methods**: Although computationally efficient, they are limited in accuracy or tied to specific applications, especially when capturing spatio-temporal correlations between multiple variables and extreme events.

#### Solutions

The paper proposes a two-stage probabilistic framework for statistical downscaling named **Generative Bias Correction and Super-Resolution (GenBCSR)**. It works in two steps:

1. **Bias Correction**:
   - Removes the biases introduced by the coarse model by aligning the distributions of the low-resolution data.
   - Is implemented with a state-of-the-art generative model for efficient and effective correction.
2. **Statistical Super-Resolution**:
   - Adds fine-grained details to lift the corrected low-resolution distribution to high resolution.
   - Also uses a state-of-the-art generative model, so that the generated high-resolution data is spatio-temporally consistent.
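The two steps above can be illustrated with a deliberately simple, scalar toy. This is not the paper's method: empirical quantile mapping stands in for the generative bias-correction model, and a mean-preserving noise lift stands in for the generative super-resolution model. The function names (`quantile_map`, `super_resolve`) and all numbers are hypothetical. What the sketch does show is that the two sample sets are matched as distributions and never paired.

```python
import bisect
import random


def quantile_map(x, biased_ref, target_ref):
    """Map x from the biased distribution onto the target distribution.

    Both reference lists must be sorted. They need not be paired,
    and they need not have the same length.
    """
    # Empirical CDF value of x under the biased (coarse model) samples.
    rank = bisect.bisect_left(biased_ref, x) / len(biased_ref)
    # Read off the same quantile from the target (coarsened reference) samples.
    idx = min(int(rank * len(target_ref)), len(target_ref) - 1)
    return target_ref[idx]


def super_resolve(coarse_value, factor, detail_std, rng):
    """Toy stochastic lift: one coarse cell becomes `factor` fine cells."""
    noise = [rng.gauss(0.0, detail_std) for _ in range(factor)]
    mean_noise = sum(noise) / factor
    # Subtract the mean so the fine cells average back to the coarse value.
    return [coarse_value + n - mean_noise for n in noise]


rng = random.Random(0)
# Unpaired samples (assumed toy data): a biased coarse model
# and a coarsened reference with a different sample count.
model = sorted(rng.gauss(2.0, 1.5) for _ in range(5000))
reference = sorted(rng.gauss(0.0, 1.0) for _ in range(4000))

# Stage 1: bias correction at coarse scale.
corrected = [quantile_map(x, model, reference) for x in model]
# Stage 2: lift one corrected coarse value to four fine-scale values.
fine = super_resolve(corrected[2500], factor=4, detail_std=0.3, rng=rng)
```

In this toy, stage 1 only aligns one-dimensional marginals and stage 2 adds unstructured noise. The framework described above instead matches high-dimensional distributions, which is what lets it preserve correlations between variables.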
#### Key innovation points

- **Framing as distribution matching**: The downscaling problem is recast as matching high-dimensional distributions at different resolutions, which removes the supervised-learning requirement that low- and high-resolution samples be paired.
- **Use of generative models**: Generative models avoid the high computational cost of physical simulation and capture the tails of the distribution, where extreme events lie, more accurately.

With these changes, GenBCSR surpasses standard approaches in predicting key impact variables, particularly the tails (99th percentile) of composite indexes built from interacting variables, which are relevant to extreme weather events such as heat waves and hurricanes. The reported error reduction is up to 4-5 fold.

### Summary

The GenBCSR framework addresses the shortcomings of existing downscaling methods in computational cost and accuracy. It offers a new approach to downscaling large climate ensembles and supports local climate risk assessment.
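The tail metric mentioned above, the error at the 99th percentile of a composite index, can be sketched as follows. The index formula in `toy_index` is a hypothetical placeholder and not one of the paper's indexes, the nearest-rank percentile is one of several common conventions, and the sample values are made up for illustration.

```python
def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of q * n / 100 avoids floating-point rounding.
    rank = max(1, (q * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def toy_index(temperature, humidity):
    """Hypothetical composite index combining two interacting variables."""
    return temperature + 0.1 * humidity


def tail_relative_error(predicted, reference, q=99):
    """Relative error of the q-th percentile of predicted vs. reference."""
    p_ref = percentile(reference, q)
    return abs(percentile(predicted, q) - p_ref) / abs(p_ref)


# Assumed toy samples of (temperature, humidity). The prediction
# underestimates temperature by 10 percent everywhere.
ref_samples = [(float(t), 50.0) for t in range(1, 101)]
pred_samples = [(0.9 * t, h) for t, h in ref_samples]

ref_index = [toy_index(t, h) for t, h in ref_samples]
pred_index = [toy_index(t, h) for t, h in pred_samples]
error = tail_relative_error(pred_index, ref_index)
```

Evaluating the index first and taking the percentile afterwards is the important ordering: the tail of a composite index depends on how the variables co-vary, so it cannot be recovered from the tails of each variable alone.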