Use of a large dataset to develop new models for estimating the sorption of active pharmaceutical ingredients in soils and sediments

Jun Li, John L. Wilkinson, Alistair B. A. Boxall
DOI: https://doi.org/10.1016/j.jhazmat.2021.125688
IF: 13.6
2021-08-01
Journal of Hazardous Materials
Abstract: Information on the sorption of active pharmaceutical ingredients (APIs) in soils and sediments is needed to assess the environmental risks of these substances, yet these data are unavailable for many APIs in use. Predictive models for estimating sorption could provide a solution. However, the performance of existing models is often poor, and most do not account for soil and sediment properties, which are known to significantly affect API sorption. Here, we use a high-quality dataset on the sorption behavior of 54 APIs in 13 soils and sediments to develop new models for estimating API sorption coefficients using three machine learning approaches (artificial neural network, random forest and support vector machine) and linear regression. A random forest-based model, with chemical and solid descriptors as the input, performed best. Evaluation of this model against an independent sorption dataset from the literature showed that it predicted sorption coefficients for 90% of the test set to within a factor of 10 of the experimental values. This new model could be invaluable for assessing the sorption behavior of molecules that have yet to be tested and for landscape-level risk assessments.
Environmental sciences; Engineering, environmental
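
The evaluation criterion quoted in the abstract (predictions "within a factor of 10 of the experimental values") can be read as an absolute difference of at most 1 between the base-10 logarithms of the predicted and measured sorption coefficients. The sketch below illustrates that reading only; the function name `fraction_within_factor` and the example values are hypothetical and are not taken from the paper or its dataset.

```python
import math


def fraction_within_factor(predicted, observed, factor=10.0):
    """Return the fraction of predictions within `factor` of the observed values.

    Assumes all values are strictly positive sorption coefficients.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    limit = math.log10(factor)
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if abs(math.log10(p) - math.log10(o)) <= limit
    )
    return hits / len(predicted)


# Hypothetical sorption coefficients for illustration only (not data from the paper).
observed = [1.0, 10.0, 100.0, 1000.0]
predicted = [2.0, 150.0, 90.0, 50.0]
result = fraction_within_factor(predicted, observed)
```

With these made-up values, two of the four predictions fall within a factor of 10, so `result` is 0.5. Applied to the paper's independent test set, the same kind of calculation would correspond to the 90% figure reported in the abstract, assuming the authors defined the criterion this way.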