Rapid Quantitative Determination of Chemical Oxygen Demand in Different Water Systems Based on Near-Infrared Spectroscopy Combined with Binary Grey Wolf Optimization and Competitive Adaptive Re-Weighting Sampling Feature Screening

Xueqin Han, Song Han, Jinfang Ma, Yongxin Zhou, Jiaze Chen, Danping Xie, Furong Huang
DOI: https://doi.org/10.2139/ssrn.3981248
2021-01-01
Abstract: To monitor environmental water pollution effectively and meet human water needs, it is crucial to develop a fast, simple, and accurate method for monitoring chemical oxygen demand (COD) in various water systems. In this study, COD prediction models for different water systems were developed by combining near-infrared (NIR) spectroscopy with partial least squares regression (PLSR). NIR spectra were obtained for 115 wastewater, 112 surface water, and 65 seawater samples. The spectra were preprocessed using three methods: multiple scattering correction (MSC), standard normal variate correction (SNV), and MSC plus SNV. Band optimization was performed by binary grey wolf optimization (BGWO) and competitive adaptive re-weighting sampling (CARS). The bands selected by each method were then combined with the PLSR algorithm to build COD prediction models for the different water systems and to identify the best band optimization method. The results showed that, compared with the PLSR model built on the combined band (780–1894 nm and 2010–2446 nm), both the BGWO–PLSR and CARS–PLSR models markedly improved prediction performance while reducing the dimensionality of the input data. The CARS–PLSR model yielded better prediction results than the BGWO–PLSR model. For the wastewater, surface water, and seawater samples, the predictive coefficient of determination (R²P) was increased from 0.826 to 0.860, 0.787 to 0.815, and 0.676 to 0.692; the prediction root mean square error (RMSEP) was reduced from 11.324 to 10.336, 16.075 to 14.977, and 0.295 to 0.288 mg/L; and the relative percent deviation of the prediction set (RPDP) was increased from 2.398 to 2.671, 2.167 to 2.326, and 1.757 to 1.801, respectively. Therefore, the use of CARS to select the modeling spectral region improved the prediction accuracy of the model and allowed rapid quantitative detection of COD values in different water systems.
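As an illustration of the preprocessing and evaluation terms used in the abstract, the following minimal Python sketch shows SNV applied to a single spectrum and the three prediction-set metrics. It is not the authors' code. The function names `snv` and `prediction_metrics` are hypothetical, the sample standard deviation in SNV is an assumption, and RPD is taken here as the standard deviation of the reference values divided by RMSEP, a common definition that the abstract itself does not spell out.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    by its own mean and standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample SD (n - 1 denominator); an assumption
    return [(x - m) / s for x in spectrum]


def prediction_metrics(y_ref, y_pred):
    """Return (R2_P, RMSEP, RPD) for reference and predicted COD values."""
    n = len(y_ref)
    y_bar = mean(y_ref)
    # Sum of squared prediction errors and total sum of squares.
    sse = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    sst = sum((r - y_bar) ** 2 for r in y_ref)
    rmsep = math.sqrt(sse / n)
    r2_p = 1.0 - sse / sst
    # Assumed definition: SD of the reference values (n denominator) / RMSEP.
    rpd = math.sqrt(sst / n) / rmsep
    return r2_p, rmsep, rpd


if __name__ == "__main__":
    # Made-up numbers purely for demonstration, not data from the paper.
    y_ref = [10.0, 20.0, 30.0, 40.0]
    y_pred = [12.0, 18.0, 33.0, 39.0]
    print(snv([1.0, 2.0, 3.0]))
    print(prediction_metrics(y_ref, y_pred))
```

Under these assumed definitions RPD equals 1/sqrt(1 - R²P), which agrees with the reported pairs (for example, R²P = 0.826 gives about 2.40 against the reported 2.398).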