Flood mapping based on novel ensemble modeling involving the deep learning, Harris Hawk optimization algorithm and stacking based machine learning

Romulus Costache, Subodh Chandra Pal, Chaitanya B. Pande, Abu Reza Md. Towfiqul Islam, Fahad Alshehri, Hazem Ghassan Abdo
DOI: https://doi.org/10.1007/s13201-024-02131-4
IF: 5.411
2024-03-14
Applied Water Science
Abstract: Among the various natural disasters that take place around the world, floods are considered the most widespread. The Buzău River basin has experienced several floods, which is why it was chosen as the study area. In this research, we applied deep learning and machine learning benchmarks to prepare flood potential maps at the basin scale. To this end, 12 flood predictors, 205 flood locations and 205 non-flood locations were used as input data for the following 3 complex models: Deep Learning Neural Network-Harris Hawk Optimization-Index of Entropy (DLNN-HHO-IOE), Multilayer Perceptron-Harris Hawk Optimization-Index of Entropy (MLP-HHO-IOE) and Stacking ensemble-Harris Hawk Optimization-Index of Entropy (Stacking-HHO-IOE). The flood sample was divided into training (70%) and validation (30%) subsets, while the predictive ability of the flood conditioning factors was tested with the Correlation-based Feature Selection method. The results were validated with the ROC curve and statistical metrics. The modeling process showed that the most important flood predictors are slope (importance ≈ 20%), distance from river (importance ≈ 17.5%), land use (importance ≈ 12%) and TPI (importance ≈ 10%). The importance values were used to compute the flood susceptibility, and the Natural Breaks method was used to classify the results. High and very high flood susceptibility covers approximately 35–40% of the study zone. In terms of Success Rate, the ROC curve shows that the highest performance was achieved by FPI DLNN-HHO-IOE (AUC = 0.97), followed by FPI Stacking-HHO-IOE (AUC = 0.966) and FPI MLP-HHO-IOE (AUC = 0.953), while the Prediction Rate indicates FPI Stacking-HHO-IOE as the best-performing model with an AUC of 0.977, followed by FPI DLNN-HHO-IOE (AUC = 0.97) and FPI MLP-HHO-IOE (AUC = 0.924).
water resources
What problem does this paper attempt to address?
This paper addresses flood susceptibility mapping in the Buzău River Basin, Romania. Specifically, the study applies deep learning and machine learning benchmark methods to prepare flood potential maps at the basin scale. It uses 12 flood predictors, 205 flood locations, and 205 non-flood locations as input data to construct three complex models: Deep Learning Neural Network-Harris Hawk Optimization-Index of Entropy (DLNN-HHO-IOE), Multilayer Perceptron-Harris Hawk Optimization-Index of Entropy (MLP-HHO-IOE), and Stacking Ensemble-Harris Hawk Optimization-Index of Entropy (Stacking-HHO-IOE). With these models, the study assesses the importance of the different flood conditioning factors and generates high-precision flood susceptibility maps to help reduce flood risk and vulnerability.

### Main Research Objectives:

1. **Apply advanced deep learning and machine learning techniques**: Combine the Harris Hawk Optimization algorithm and the Index of Entropy to construct flood susceptibility prediction models.
2. **Evaluate the importance of flood predictors**: Determine which factors are most important for flood prediction through feature selection methods.
3. **Generate high-precision flood susceptibility maps**: Validate the models' performance using ROC curves and statistical indicators, and classify the results into flood susceptibility levels.

### Research Background:

- Floods are one of the most common natural disasters globally, causing severe impacts on the environment and human life.
- The Buzău River Basin has historically experienced multiple flood events, making it a suitable study area.
- The topographical features, geological structure, and land use of the study area significantly influence the occurrence and development of floods.

### Methods and Techniques:

- **Data Collection**: Includes data on flood locations, non-flood locations, and 12 flood predictors.
- **Model Construction**: Uses DLNN, MLP, and stacking ensemble models, optimized through the Harris Hawk Optimization algorithm.
- **Feature Selection**: Employs the Correlation-based Feature Selection method to evaluate the importance of each predictor.
- **Model Validation**: Assesses model performance using ROC curves and statistical indicators.

### Results and Conclusions:

- **Most Important Flood Predictors**: Slope (approximately 20%), distance from river (approximately 17.5%), land use (approximately 12%), and topographic position index (approximately 10%).
- **Model Performance**: The DLNN-HHO-IOE model performed best in terms of Success Rate (AUC = 0.97), while the Stacking-HHO-IOE model performed best in terms of Prediction Rate (AUC = 0.977).
- **Flood Susceptibility Maps**: High and very high flood susceptibility areas account for approximately 35–40% of the study area.

Through this research, the paper provides a scientific basis and technical support for reducing flood risk and enhancing flood prevention capabilities.
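
The 70/30 split from the abstract and the two ROC readings can be illustrated with a minimal sketch. It is not the authors' code and uses synthetic scores rather than the Buzău data. The helper names `stratified_split` and `roc_auc` are hypothetical. Reading the Success Rate as the AUC on the training sample and the Prediction Rate as the AUC on the validation sample is an assumption based on common usage in susceptibility mapping.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/validation sets, class by class (hypothetical helper)."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))  # floor, so 205 points give 143 for training
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return train, valid


def roc_auc(labels, scores):
    """AUC as the share of (flood, non-flood) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for a flood potential index: 205 flood (1) and
# 205 non-flood (0) locations with overlapping score distributions.
rng = random.Random(42)
labels = [1] * 205 + [0] * 205
scores = [rng.gauss(0.7 if y == 1 else 0.3, 0.15) for y in labels]

train_idx, valid_idx = stratified_split(labels)

# Assumption: Success Rate = AUC on training points,
# Prediction Rate = AUC on validation points.
success_rate = roc_auc([labels[i] for i in train_idx], [scores[i] for i in train_idx])
prediction_rate = roc_auc([labels[i] for i in valid_idx], [scores[i] for i in valid_idx])

print(f"train={len(train_idx)} valid={len(valid_idx)}")
print(f"success rate AUC={success_rate:.3f}, prediction rate AUC={prediction_rate:.3f}")
```

Splitting each class separately keeps the flood and non-flood points balanced in both subsets, so neither AUC is computed on a sample dominated by one class.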