Machine Learning for Evaluating the Cytotoxicity of Mixtures of Nano-TiO2 and Heavy Metals: QSAR Model Apply Random Forest Algorithm after Clustering Analysis

Leqi Sang,Yunlin Wang,Cheng Zong,Pengfei Wang,Huazhong Zhang,Dan Guo,Beilei Yuan,Yong Pan
DOI: https://doi.org/10.3390/molecules27186125
IF: 4.6
2022-09-20
Molecules
Abstract:With the development and application of nanomaterials, their impact on the environment and organisms has attracted attention. As a common nanomaterial, nano-titanium dioxide (nano-TiO2) has adsorption properties to heavy metals in the environment. Quantitative structure-activity relationship (QSAR) is often used to predict the cytotoxicity of a single substance. However, there is little research on the toxicity of interaction between nanomaterials and other substances. In this study, we exposed human renal cortex proximal tubule epithelial (HK-2) cells to mixtures of eight heavy metals with nano-TiO2, measured absorbance values by CCK-8, and calculated cell viability. PLS and two ensemble learning algorithms are used to build multiple QSAR models for data sets, and the test set R2 is increased from 0.38 to 0.78 and 0.85, and RMSE is decreased from 0.18 to 0.12 and 0.10. After selecting the better random forest algorithm, the K-means clustering algorithm is used to continue to optimize the model, increasing the test set R2 to 0.95 and decreasing the RMSE to 0.08 and 0.06. As a reliable machine algorithm, random forest can be used to predict the toxicity of the mixture of nano-metal oxides and heavy metals. The cluster analysis can effectively improve the stability and predictability of the model, and provide a new idea for the prediction of cytotoxicity model in the future.
chemistry, multidisciplinary,biochemistry & molecular biology
What problem does this paper attempt to address?