Random forest-based screening of environmental geohazard probability factors in Panshi city, China

Lihui Qi, Xuedong Wang, Cui Wang, Haipeng Wang, Xiaolong Li
DOI: https://doi.org/10.1016/j.asr.2024.09.055
IF: 2.611
2024-09-30
Advances in Space Research
Abstract: Environmental geohazard probability is affected by multiple factors, and a reasonable selection of evaluation factors is crucial to assessing it. This paper proposes a method for screening environmental geohazard probability factors based on a random forest (RF) model. The accuracy and reasonableness of the RF model are verified by comparison with the GeoDetector (GD) model using a confusion matrix, cross-validation, and receiver operating characteristic (ROC) curves. In addition, the effectiveness of the RF model is analyzed through the environmental geohazard probability zoning results, using information volume (IV), the frequency ratio (FR), and the mean absolute error (MAE). Results are presented for Panshi city, Jilin Province, China. The RF model retained nine factors: the normalized difference vegetation index (NDVI), elevation, population density, land use type, distance from river, aspect, topography, rainfall intensity, and rock type. Of these, NDVI, elevation, and population density were the key factors in the study area. The three factors eliminated by the RF model (slope, profile curvature, and distance from fault) are more closely related to the key factors in the study area. Rainfall intensity is an important trigger of environmental geohazards in the study area, so its elimination by the GD model is unreasonable, whereas its retention by the RF model is more reasonable. After RF screening, every evaluation indicator of the confusion matrix improves and exceeds that of the GD model, indicating stronger generalization ability and better model performance. The average model accuracy after RF screening improves by 13%, and the area under the curve (AUC) improves by 12%.
After screening, the environmental geohazard probability zoning results are more closely related to the distribution of the key factors, and the density of disaster points in the high-probability and very-high-probability zones increases, with the FR rising by 0.68% and 10.56%, respectively, which supports targeted prevention and control of environmental geohazards. The information contribution rate (IN) after screening reaches 93.57%. The error of the zoning results is reduced and their accuracy is improved, so the results are more reasonable and effective and can support more targeted recommendations for the prevention and control of environmental geohazards.
geosciences, multidisciplinary; meteorology & atmospheric sciences; astronomy & astrophysics; engineering, aerospace
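The abstract reports changes in the frequency ratio (FR) for the high-probability and very-high-probability zones but does not state how FR is computed. The sketch below uses the definition commonly found in the susceptibility-mapping literature, the share of disaster points in a zone divided by the share of area in that zone, which may differ from the paper's exact formulation. The function name `frequency_ratio`, the zone labels, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict


def frequency_ratio(
    hazard_counts: Dict[str, int], zone_areas: Dict[str, float]
) -> Dict[str, float]:
    """Return FR per zone: (hazard share of zone) / (area share of zone).

    Assumed definition, not confirmed by the abstract. FR > 1 means disaster
    points are denser in the zone than in the study area as a whole.
    """
    total_hazards = sum(hazard_counts.values())
    total_area = sum(zone_areas.values())
    if total_hazards <= 0 or total_area <= 0:
        raise ValueError("total hazard count and total area must be positive")

    ratios = {}
    for zone, area in zone_areas.items():
        if area <= 0:
            raise ValueError(f"zone {zone!r} must have a positive area")
        hazard_share = hazard_counts.get(zone, 0) / total_hazards
        area_share = area / total_area
        ratios[zone] = hazard_share / area_share
    return ratios


# Hypothetical zoning result: disaster points and area (km^2) per zone.
counts = {"low": 5, "moderate": 15, "high": 30, "very_high": 50}
areas = {"low": 400.0, "moderate": 300.0, "high": 200.0, "very_high": 100.0}

fr = frequency_ratio(counts, areas)
for zone, value in fr.items():
    print(f"{zone}: FR = {value:.3f}")
```

Under this definition, a zoning scheme that concentrates more disaster points into its high-probability zones without enlarging them yields larger FR values there, which is the kind of improvement the abstract attributes to RF screening.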