Debris flow susceptibility assessment based on boosting ensemble learning techniques: a case study in the Tumen River basin, China
Zelu Chen,Hechun Quan,Ri Jin,Zhehao Lin,Guangzhu Jin
DOI: https://doi.org/10.1007/s00477-024-02683-6
IF: 3.821
2024-03-04
Stochastic Environmental Research and Risk Assessment
Abstract: Debris flows have long been a serious hazard in mountainous areas. Accurate debris flow susceptibility (DFS) assessment and interpretable prediction results play an important role in the prevention and control of debris flow disasters. Machine learning algorithms based on boosting ensemble techniques have been widely used in geohazard susceptibility studies because of their excellent predictive ability. However, Categorical Boosting (CatBoost) and Natural Gradient Boosting (NGBoost) have not yet been applied to DFS assessment, and few geohazard studies have systematically compared these boosting-based algorithms. Moreover, previous research has mostly focused on comparing the predictive ability of algorithms, identifying the susceptibility zones of the entire study area, and ranking the importance of the indicators, with little thorough analysis of the relationship between the indicators and debris flow susceptibility on different types of construction land. The aims of this study were to identify the optimal boosting-based DFS model and to explore the distribution characteristics and change rules of DFS in the study area, so as to provide decision support for debris flow disaster prevention and reduction. This was the first time that six boosting-based machine learning algorithms have been compared in a DFS assessment. After the optimal model was determined, the change rules of the indicators under different DFS levels were studied for the entire study area and for two types of construction land. An eXplainable Artificial Intelligence (XAI) method called SHapley Additive exPlanations (SHAP), combined with the zonal statistics function of a geographic information system (GIS), was adopted to explore how each indicator affects the occurrence of debris flows. The results showed that CatBoost performed best and provided the most reasonable DFS result among the six boosting-based models. 
We found that debris flows were more likely to occur along rivers and on construction land at low altitude. Rural areas faced stronger pressure from rainfall and were characterized by a worse disaster-breeding environment than urban areas. This research enriches the application of machine learning in DFS assessment, explores the changing trends of indicators across DFS levels, and provides suggestions for better debris flow disaster prevention and mitigation management.
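The zonal statistics step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes two small hypothetical rasters, `shap_elev` (per-cell SHAP values of one indicator) and `dfs_level` (per-cell susceptibility class), and uses NumPy in place of a GIS zonal statistics tool to average the SHAP values within each DFS level.

```python
import numpy as np


def zonal_mean(values, zones):
    """Return {zone label: mean of `values` over cells with that label}."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones).ravel()
    # Map every cell to the index of its zone label.
    labels, inverse = np.unique(zones, return_inverse=True)
    sums = np.bincount(inverse, weights=values, minlength=labels.size)
    counts = np.bincount(inverse, minlength=labels.size)
    return {int(label): s / c for label, s, c in zip(labels, sums, counts)}


# Hypothetical 2x3 rasters, invented for illustration only.
shap_elev = [[0.4, 0.2, -0.1],
             [0.6, -0.3, -0.5]]   # SHAP value of an elevation indicator
dfs_level = [[3, 2, 1],
             [3, 2, 1]]           # susceptibility class per cell

result = zonal_mean(shap_elev, dfs_level)
for level, mean_shap in sorted(result.items()):
    print(f"DFS level {level}: mean SHAP = {mean_shap:+.2f}")
```

Comparing the per-level means shows whether an indicator pushes predictions toward or away from debris flow occurrence as the susceptibility level rises. The same function could be applied with a construction land mask as the zone raster.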
environmental sciences,engineering, environmental,water resources, civil,statistics & probability