Machine Learning for Generalizable Prediction of Flood Susceptibility

Chelsea Sidrane, Dylan J Fitzpatrick, Andrew Annex, Diane O'Donoghue, Yarin Gal, Piotr Biliński
DOI: https://doi.org/10.48550/arXiv.1910.06521
2019-10-15
Abstract: Flooding is a destructive and dangerous hazard and climate change appears to be increasing the frequency of catastrophic flooding events around the world. Physics-based flood models are costly to calibrate and are rarely generalizable across different river basins, as model outputs are sensitive to site-specific parameters and human-regulated infrastructure. In contrast, statistical models implicitly account for such factors through the data on which they are trained. Such models trained primarily from remotely-sensed Earth observation data could reduce the need for extensive in-situ measurements. In this work, we develop generalizable, multi-basin models of river flooding susceptibility using geographically-distributed data from the USGS stream gauge network. Machine learning models are trained in a supervised framework to predict two measures of flood susceptibility from a mix of river basin attributes, impervious surface cover information derived from satellite imagery, and historical records of rainfall and stream height. We report prediction performance of multiple models using precision-recall curves, and compare with performance of naive baselines. This work on multi-basin flood prediction represents a step in the direction of making flood prediction accessible to all at-risk communities.
Machine Learning, Applications
What problem does this paper attempt to address?
This paper aims to improve the accuracy and generality of flood prediction, in particular how well models generalize across river basins. The authors use machine learning to build models that predict flood susceptibility across multiple river basins, reducing the dependence on physics-based models, which are expensive and time-consuming to calibrate, and on in-situ measurement data. The goal is faster and more cost-effective flood warning for more communities, especially those that lack resources for disaster planning and response.

### Main problem decomposition:

1. **Limitations of existing physical models**:
   - Physical models must be calibrated for specific geographic areas and are difficult to generalize across river basins.
   - Physical models require detailed data on human-regulated infrastructure (such as dams), which increases time and resource costs.
   - Physical models rely on expensive in-situ measurement devices (such as USGS stream gauges), each with an annual maintenance cost of $7,000 to $15,000.
2. **Advantages of statistical models**:
   - Statistical models can implicitly capture the impact of human-regulated infrastructure through their training data, without explicit modeling.
   - Statistical models can use remote-sensing data (such as satellite imagery) to reduce the dependence on in-situ measurements.
   - Statistical models can generalize across multiple river basins, giving them broader applicability.
3. **Research objectives**:
   - Develop flood susceptibility prediction models for multiple river basins, using geographically distributed data (such as the USGS stream gauge network).
   - Evaluate the importance of different features (such as historical river heights and rainfall forecasts) for prediction accuracy.
   - Compare the performance of statistical models with the existing NOAA physical models in flood prediction.

### Specific contributions of the paper:

- Propose a flood susceptibility prediction framework for multiple river basins, using machine learning models (such as random forests, gradient-boosted decision trees, and multi-layer perceptrons).
- Evaluate these models experimentally in different scenarios, including unmonitored locations, locations with historical data, and situations with accurate rainfall forecasts.
- Report results suggesting that statistical models may significantly outperform the existing NOAA physical models in some cases, especially when only one month of historical data is available.

In conclusion, this paper aims to develop more general and cost-effective flood prediction models through machine learning, to help more communities better cope with flood disasters.
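The abstract states that models are evaluated with precision-recall curves and compared against naive baselines. As a rough illustration of that kind of comparison, the sketch below computes precision-recall points and a step-wise average precision in plain Python. The labels, scores, function names, and the choice of a constant-score baseline are all hypothetical, chosen for illustration only; none of them come from the paper.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    labels: Sequence[int], scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per distinct score threshold."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive (flood) label")
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last example that shares this score,
        # so tied scores are treated as a single threshold.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in precision_recall_points(labels, scores):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical toy data: 1 = flood observed, 0 = no flood.
labels = [1, 0, 1, 0, 0, 0, 1, 0]
# Hypothetical model scores (higher = more flood-prone).
model_scores = [0.9, 0.2, 0.8, 0.75, 0.1, 0.3, 0.7, 0.05]
# Assumed naive baseline: the same score (the flood prevalence) everywhere.
prevalence = sum(labels) / len(labels)
baseline_scores = [prevalence] * len(labels)

print(f"model AP:    {average_precision(labels, model_scores):.3f}")     # 0.917
print(f"baseline AP: {average_precision(labels, baseline_scores):.3f}")  # 0.375
```

A constant-score baseline always has average precision equal to the fraction of positive examples, which makes it a useful floor when floods are rare: a model is only informative to the extent that its curve sits above that level.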