Enhancing Clay Content Estimation Through Hybrid CatBoost-GP with Model Class Selection

Weihang Chen, Xing Wan, Jianwen Ding, Tengfei Wang
DOI: https://doi.org/10.1016/j.trgeo.2024.101232
IF: 4.938
2024-01-01
Transportation Geotechnics
Abstract: Excessive soil swelling, triggered by variations in moisture content, is recognized as one of the most severe geo-hazards. Swelling Potential (SP) is closely correlated with Clay Content (CC), Plasticity Index (PI), and Activity (A = PI/CC), which are crucial input features for SP prediction models. Traditional methods for determining CC, such as the hydrometer and pipette methods, are labor-intensive and time-consuming. Predicting CC directly from easily accessible soil properties not only improves the modeling efficiency of SP prediction models but also mitigates the impact of laboratory test errors on model performance. This study used two databases covering a broad spectrum of soil classifications and developed three ensemble learning models (RF, XGBoost, and CatBoost) to predict CC efficiently. Sequential model-based optimization (SMBO) algorithms were employed to identify the optimal hyperparameters for these models. To further refine the model, the Shapley additive explanation (SHAP) method was used to identify the most influential soil properties. The findings revealed that the CatBoost model, optimized via GP, achieved the highest predictive accuracy. Three easily measurable soil properties (the percentages of soil particles passing through the 0.075 mm and 0.425 mm sieves, and the plasticity index) were identified as key predictors for CC. The estimated CC values were then used to construct various empirical and machine learning models to predict SP, all of which exhibited exceptional performance. This research advances the modeling efficiency of SP prediction models, making it particularly relevant for rapid-response scenarios in transportation infrastructure projects.
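As a small illustration of the Activity definition quoted in the abstract (A = PI/CC), the sketch below computes A from a plasticity index and a clay content, both expressed in percent. The function name `activity` and the input values are hypothetical and chosen only for illustration; they are not taken from the paper or its databases.

```python
def activity(plasticity_index: float, clay_content: float) -> float:
    """Return Activity A = PI / CC, with PI and CC both given in percent."""
    if clay_content <= 0:
        # A is undefined for a non-positive clay content.
        raise ValueError("clay content must be positive")
    return plasticity_index / clay_content


# Hypothetical illustrative values, not data from the paper.
pi_percent = 30.0  # plasticity index, %
cc_percent = 40.0  # clay content, %

print(f"A = {activity(pi_percent, cc_percent):.2f}")  # prints "A = 0.75"
```

Because CC sits in the denominator, any error in CC carries straight into A. This is one reason an SP model that takes both CC and A as inputs is sensitive to the quality of the CC estimate.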