Predicting soil available cadmium by machine learning based on soil properties

Jiawei Huang, Guangping Fan, Cun Liu, Dongmei Zhou
DOI: https://doi.org/10.1016/j.jhazmat.2023.132327
IF: 13.6
2023-01-01
Journal of Hazardous Materials
Abstract: Cadmium (Cd) accumulation in edible plant tissues poses a serious threat to human health through the food chain. Assessing the availability of soil Cd is crucial for evaluating the associated environmental risks. However, existing experimental methods and traditional models are time-consuming and inefficient. In this study, we developed machine learning models to predict soil available Cd from soil properties, using a dataset comprising 585 data points covering 585 soils. Traditional machine learning models produced predictions outside the theoretical range, underscoring the need for alternative approaches. To address this, different models were tested, and the post-constraint eXtreme Gradient Boosting (XGBoost) model was found to have the best predictive performance (R² = 0.81), outperforming the traditional linear regression model in accuracy. Furthermore, we explored the relationship between soil available Cd and both wheat grain Cd and rice grain Cd. Linear regression models were developed using 302 data points for wheat and 563 data points for rice. The results demonstrated a significant correlation between soil available Cd and wheat grain Cd (R² = 0.487) as well as rice grain Cd (R² = 0.43).
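The abstract does not say how the post-constraint step is implemented. One plausible reading is that raw model outputs are clipped back into a physically meaningful interval after prediction. The sketch below illustrates that idea only; the function name `constrain_predictions`, the choice of bounds (zero as the floor, total soil Cd as the ceiling), and all numbers are assumptions for illustration, not the authors' method or data.

```python
def constrain_predictions(raw_predictions, lower_bounds, upper_bounds):
    """Clip each raw model prediction into its own [lower, upper] interval."""
    if not (len(raw_predictions) == len(lower_bounds) == len(upper_bounds)):
        raise ValueError("all inputs must have the same length")
    constrained = []
    for pred, lo, hi in zip(raw_predictions, lower_bounds, upper_bounds):
        if lo > hi:
            raise ValueError("lower bound exceeds upper bound")
        # Values below the floor are raised to it; values above the ceiling are lowered.
        constrained.append(min(max(pred, lo), hi))
    return constrained


# Hypothetical values in mg/kg, made up purely for illustration.
raw = [-0.05, 0.30, 1.40]       # unconstrained regressor outputs
total_cd = [0.50, 0.80, 1.20]   # assumed upper bound: total soil Cd
floor = [0.0, 0.0, 0.0]         # assumed lower bound: zero

constrained = constrain_predictions(raw, floor, total_cd)
print(constrained)
```

Under these assumptions, the first and third predictions fall outside the admissible interval and are moved to its nearest edge, while the second passes through unchanged. Any regressor could supply the raw values; the clipping step is independent of the model.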