Abstract: Machine learning (ML)-based landslide susceptibility assessment is usually judged on two dimensions: accuracy and complexity. Complexity depends not only on the model framework but also on the type and complexity of the modeling data. Therefore, once an ML method achieves good predictive performance for landslide susceptibility, understanding how factor data types affect the model's decision-making mechanism is important for assessing regional landslide characteristics and issuing landslide risk warnings. This study used the Shapley Additive exPlanations (SHAP) method to explain the decision-making mechanism of landslide susceptibility models that couple ML methods with different types of factor data. A comparative analysis then examined how different data types for the same factors affect model predictions. The study area was Cenxi, Guangxi, where a geospatial database was constructed by combining 23 landslide conditioning factors with 214 landslide samples from the region. First, the factors were standardized using five conditional probability models, frequency ratio (FR), information value (IV), certainty factor (CF), evidential belief function (EBF), and weights of evidence (WOE), based on the spatial distribution of landslides. Together with the initial data, this produced six types of factor databases. Subsequently, two ensemble-based ML methods, random forest (RF) and XGBoost, were used to build landslide susceptibility prediction models. Various evaluation metrics were employed to compare the predictive capabilities of the models and to determine the optimal model.
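As a rough illustration of the factor standardization step, the sketch below computes FR, IV, and CF for the classes of a single conditioning factor. It uses commonly cited textbook formulations of these three measures; the abstract does not give the study's exact equations, so the formulas, the function name `standardize_classes`, and the slope-class counts are all assumptions made for illustration only.

```python
import math


def standardize_classes(class_cells, class_landslides):
    """Return FR, IV and CF for every class of one conditioning factor.

    class_cells:      {class name: number of grid cells in the class}
    class_landslides: {class name: number of landslide cells in the class}
    """
    total_cells = sum(class_cells.values())
    total_slides = sum(class_landslides.values())
    prior = total_slides / total_cells  # landslide density of the whole area
    if not 0.0 < prior < 1.0:
        raise ValueError("prior landslide probability must lie in (0, 1)")

    result = {}
    for name, cells in class_cells.items():
        if cells <= 0:
            raise ValueError(f"class {name!r} has no cells")
        slides = class_landslides.get(name, 0)
        cond = slides / cells  # landslide density inside this class

        # Frequency ratio: class density relative to the overall density.
        fr = cond / prior
        # Information value: natural log of the same ratio (assumed form).
        iv = math.log(fr) if fr > 0 else float("-inf")
        # Certainty factor: piecewise form bounded to [-1, 1] (assumed form).
        if cond >= prior:
            cf = (cond - prior) / (cond * (1.0 - prior))
        else:
            cf = (cond - prior) / (prior * (1.0 - cond))

        result[name] = {"FR": fr, "IV": iv, "CF": cf}
    return result


# Hypothetical slope classes (degrees); these counts are not from the study.
slope_cells = {"0-10": 5000, "10-20": 3000, "20-30": 2000}
slope_landslides = {"0-10": 10, "10-20": 30, "20-30": 60}

result = standardize_classes(slope_cells, slope_landslides)
for name, scores in result.items():
    print(name, {key: round(value, 4) for key, value in scores.items()})
```

Under these assumed formulas, a class whose landslide density equals the regional density gets FR = 1, IV = 0, and CF = 0, so each standardized value expresses how much more or less landslide-prone a class is than the region as a whole.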
In parallel, SHAP was used to analyze the intrinsic decision-making mechanisms of the different ensemble-based ML models, with a specific focus on explaining and comparing how different types of factor data affect the prediction results. The results showed that the XGBoost-CF model, built on the CF values of the factors, not only exhibited the best predictive accuracy and stability but also yielded more reasonable landslide susceptibility zoning, and it was thus identified as the optimal model. The global interpretation results revealed that slope was the most important factor influencing landslides and that its interaction with other factors in the study area collectively contributed to landslide occurrence. For the same factors, models based on different data types differed in their internal decision-making mechanisms mainly in the extent of each factor's influence on the prediction results and in factor dependence. These differences help explain how standardized data perform in ML models and why models coupling conditional probability models with ML methods achieve higher predictive performance. A comprehensive analysis of the local interpretation results, in which different models explain the same samples with different characteristics, makes it possible to summarize the reasons for model prediction errors, providing a reference framework for constructing more accurate and rational landslide susceptibility models and supporting landslide warning and management.
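To make the idea of a local SHAP interpretation concrete, the sketch below computes exact Shapley values by brute force for a single sample. It is not the study's pipeline: `toy_model`, its coefficients, the three factors, and the background samples are hypothetical stand-ins, and in practice a tree-specific explainer would be applied to the fitted RF or XGBoost model rather than enumerating every factor coalition.

```python
from itertools import combinations
from math import factorial


def toy_model(slope, rainfall, ndvi):
    """Hypothetical susceptibility score with a slope x rainfall interaction."""
    return 0.02 * slope + 0.001 * rainfall + 0.0005 * slope * rainfall - 0.3 * ndvi


def coalition_value(model, x, background, subset):
    """Mean prediction when factors in `subset` are fixed to the sample's values
    and all other factors are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(*mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley value of every factor for one sample (exponential cost)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                with_i = coalition_value(model, x, background, members | {i})
                without_i = coalition_value(model, x, background, members)
                phi[i] += weight * (with_i - without_i)
    return phi


# Hypothetical background samples and one sample to explain:
# (slope in degrees, rainfall in mm, NDVI).
background = [(10.0, 100.0, 0.8), (20.0, 150.0, 0.5), (35.0, 200.0, 0.2)]
x = (30.0, 180.0, 0.3)

phi = shapley_values(toy_model, x, background)
base_value = coalition_value(toy_model, x, background, set())
prediction = toy_model(*x)

for name, value in zip(("slope", "rainfall", "ndvi"), phi):
    print(f"{name}: {value:+.4f}")
print("base value + contributions:", base_value + sum(phi))
print("model prediction:", prediction)
```

The additive property is what makes the local explanation readable: the base value plus the per-factor contributions reproduces the model's prediction for that sample, so a prediction error can be traced to the factors that pushed the score up or down.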

Interpretability of Statistical, Machine Learning, and Deep Learning Models for Landslide Susceptibility Mapping in Three Gorges Reservoir Area

Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models

Utilizing deep learning approach to develop landslide susceptibility mapping considering landslide types

Essential insights into decision mechanism of landslide susceptibility mapping based on different machine learning models

Exploring the spatial patterns of landslide susceptibility assessment using interpretable Shapley method: Mechanisms of landslide formation in the Sichuan-Tibet region

Investigation of Landslide Susceptibility Decision Mechanisms in Different Ensemble-Based Machine Learning Models with Various Types of Factor Data

Synergizing multiple machine learning techniques and remote sensing for advanced landslide susceptibility assessment: a case study in the Three Gorges Reservoir Area

Exploring Complementary Models Consisting of Machine Learning Algorithms for Landslide Susceptibility Mapping

Landslide Susceptibility Mapping and Driving Mechanisms in a Vulnerable Region Based on Multiple Machine Learning Models

Landslide susceptibility mapping in Three Gorges Reservoir area based on GIS and boosting decision tree model

Evaluation of Landslide Susceptibility Using Machine Learning Based on Information Value Sampling Method

Landslide Susceptibility Assessment Model Construction Using Typical Machine Learning for the Three Gorges Reservoir Area in China

Landslide Susceptibility Mapping Based on Interpretable Machine Learning from the Perspective of Geomorphological Differentiation

Enhancing landslide susceptibility mapping incorporating landslide typology via stacking ensemble machine learning in Three Gorges Reservoir, China

Comparative Assessment of the Efficacy of the Five Kinds of Models in Landslide Susceptibility Map for Factor Screening: A Case Study at Zigui-Badong in the Three Gorges Reservoir Area, China

Hybrid Integration Approach of Entropy with Logistic Regression and Support Vector Machine for Landslide Susceptibility Modeling

Landslide Susceptibility Prediction Based on Remote Sensing Images and GIS: Comparisons of Supervised and Unsupervised Machine Learning Models

Landslide Susceptibility Prediction Using Machine Learning Methods: A Case Study of Landslides in the Yinghu Lake Basin in Shaanxi

Landslide Susceptibility Prediction Modeling Based on Self-Screening Deep Learning Model

Combining spatial response features and machine learning classifiers for landslide susceptibility mapping

Improving pixel-based regional landslide susceptibility mapping