Diagnosis of cadmium contamination in urban and suburban soils using visible-to-near-infrared spectroscopy
Yongsheng Hong,Yiyun Chen,Ruili Shen,Songchao Chen,Gang Xu,Hang Cheng,Long Guo,Zushuai Wei,Jian Yang,Yaolin Liu,Zhou Shi,Abdul M. Mouazen
DOI: https://doi.org/10.1016/j.envpol.2021.118128
IF: 8.9
2021-12-01
Environmental Pollution
Abstract: Previous studies have mostly focused on using visible-to-near-infrared spectral techniques to quantitatively estimate soil cadmium (Cd) content, whereas little attention has been paid to identifying soil Cd contamination from the perspective of spectral classification. Here, we developed a framework to compare the potential of two spectral transformations (i.e., raw reflectance and continuum removal [CR]), three optimization strategies (i.e., full-spectrum, Boruta feature selection, and synthetic minority over-sampling technique [SMOTE]), and three classification algorithms (i.e., partial least squares discriminant analysis, random forest [RF], and support vector machine) for diagnosing soil Cd contamination. A total of 536 soil samples were collected from urban and suburban areas located in Wuhan City, China. Specifically, the Boruta and SMOTE strategies were aimed at selecting the most informative predictors and obtaining balanced training datasets, respectively. Results indicated that soils contaminated by Cd showed a decrease in spectral reflectance magnitude. Classification models developed with the Boruta and SMOTE strategies outperformed those developed from the full spectrum. A diagnostic model combining CR preprocessing, the SMOTE strategy, and the RF algorithm achieved the highest validation accuracy for soil Cd (Kappa = 0.74). This study provides a theoretical reference for rapid identification and monitoring of soil Cd contamination in urban and suburban areas.
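The abstract names continuum removal as a spectral transformation and Cohen's Kappa as the validation metric without giving implementation details. The sketch below illustrates both in plain Python under stated assumptions: continuum removal is taken to be the common convex-hull form (reflectance divided by its upper hull), and the function names, wavelengths, reflectance values, and class labels are all hypothetical and are not drawn from the study.

```python
def upper_hull(points):
    """Upper convex hull of (x, y) points sorted by increasing x."""
    hull = []
    for p in points:
        # Drop the last hull point while it lies on or below the line
        # joining the point before it to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def continuum_removal(wavelengths, reflectance):
    """Divide each reflectance value by the hull (continuum) value.

    Assumes at least two bands, strictly increasing wavelengths,
    and positive reflectance.
    """
    hull = upper_hull(list(zip(wavelengths, reflectance)))
    removed = []
    seg = 0  # index of the hull segment that contains the current band
    for x, y in zip(wavelengths, reflectance):
        while seg < len(hull) - 2 and x > hull[seg + 1][0]:
            seg += 1
        (x1, y1), (x2, y2) = hull[seg], hull[seg + 1]
        continuum = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        removed.append(y / continuum)
    return removed


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between two labelings, corrected for chance.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both labelings contain a single identical class.
    """
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    expected = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    return (observed - expected) / (1 - expected)


# Hypothetical five-band spectrum and labels (1 = contaminated, 0 = clean).
bands = [400, 500, 600, 700, 800]
spectrum = [0.20, 0.30, 0.25, 0.35, 0.40]
cr_spectrum = continuum_removal(bands, spectrum)

observed_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(observed_labels, predicted_labels)
```

Continuum-removed values are at most 1, with dips marking absorption features relative to the hull. Kappa discounts the agreement expected by chance, which makes it more informative than raw accuracy when contaminated and uncontaminated classes are imbalanced.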
environmental sciences