Toxicity Classification of Oxide Nanomaterials: Effects of Data Gap Filling and PChem Score-based Screening Approaches

My Kieu Ha, Tung Xuan Trinh, Jang Sik Choi, Desy Maulina, Hyung Gi Byun, Tae Hyun Yoon

DOI: https://doi.org/10.1038/s41598-018-21431-9

IF: 4.6

2018-02-16

Scientific Reports

Abstract: Development of nanotoxicity prediction models is becoming increasingly important in the risk assessment of engineered nanomaterials. However, it has significant obstacles caused by the wide heterogeneities of published literature in terms of data completeness and quality. Here, we performed a meta-analysis of 216 published articles on oxide nanoparticles using 14 attributes of physicochemical, toxicological and quantum-mechanical properties. Particularly, to improve completeness and quality of the extracted dataset, we adapted two preprocessing approaches: data gap-filling and physicochemical property based scoring. Performances of nano-SAR classification models revealed that the dataset with the highest score value resulted in the best predictivity with compromise in its applicability domain. The combination of physicochemical and toxicological attributes was proved to be more relevant to toxicity classification than quantum-mechanical attributes. Overall, by adapting these two preprocessing methods, we demonstrated that meta-analysis of nanotoxicity literatures could provide an effective alternative for the risk assessment of engineered nanomaterials.

multidisciplinary sciences

What problem does this paper attempt to address?

This paper addresses the wide heterogeneity in data completeness and data quality that hampers the development of toxicity prediction models for nanomaterials, especially metal oxide nanoparticles. Specifically:

1. **Data completeness**: Data in published literature often have missing values, which weaken the training and prediction capabilities of a model. For example, many values may be blank for the physicochemical, toxicological, and quantum-mechanical properties of nanoparticles.
2. **Data quality**: Test protocols are not standardized across laboratories, so data quality is uneven. Data sources also differ in reliability: some values come from manufacturers' specifications, whose accuracy cannot be guaranteed.

To address these problems, the authors adopted two preprocessing methods:

- **Data gap filling**: Fill in missing values using manufacturers' specifications or other reference data to improve data completeness.
- **Physicochemical property scoring**: Evaluate the quality of physicochemical data through a scoring framework and select the high-quality data for model training.

With these methods, the authors aim to improve the completeness and quality of the dataset and thereby the performance of the nanomaterial toxicity prediction model. Their specific goals are to:

- Improve the predictive ability of the model once data completeness and quality have been enhanced.
- Determine which properties (physicochemical, toxicological, or quantum-mechanical) are most important for toxicity classification.
- Analyze how the preprocessing methods affect model performance, in particular the effects of gap filling and quality screening.
- Define the Applicability Domain (AD) of the model so that its predictions on new data are reliable.

Through these efforts, the authors hope to provide an effective alternative method for the risk assessment of nanomaterials.
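The two preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the records, the reference values, the field names (`core_size_nm`, `hydro_size_nm`, `zeta_mV`), and the point values in `SOURCE_POINTS` are all hypothetical, chosen only to show how gap filling and score-based screening interact.

```python
# Hypothetical records extracted from the literature; None marks a missing value.
records = [
    {"material": "ZnO", "core_size_nm": 20.0, "hydro_size_nm": 150.0, "zeta_mV": -12.0},
    {"material": "TiO2", "core_size_nm": None, "hydro_size_nm": None, "zeta_mV": None},
    {"material": "CuO", "core_size_nm": 40.0, "hydro_size_nm": None, "zeta_mV": None},
]

# Hypothetical reference values, e.g. taken from a manufacturer's specification.
reference = {
    "TiO2": {"core_size_nm": 25.0},
    "CuO": {"hydro_size_nm": 210.0},
}

PCHEM_FIELDS = ("core_size_nm", "hydro_size_nm", "zeta_mV")

# Assumed scoring rule: a measured value earns more points than a gap-filled one.
SOURCE_POINTS = {"measured": 2, "reference": 1, "missing": 0}


def fill_gaps(record, reference):
    """Return a copy of the record with gaps filled from reference data.

    The origin of every physicochemical value is tracked under "sources".
    """
    filled = dict(record)
    sources = {}
    ref = reference.get(record["material"], {})
    for field in PCHEM_FIELDS:
        if filled[field] is not None:
            sources[field] = "measured"
        elif field in ref:
            filled[field] = ref[field]
            sources[field] = "reference"
        else:
            sources[field] = "missing"
    filled["sources"] = sources
    return filled


def pchem_score(filled):
    """Sum the points earned by each physicochemical field of a filled record."""
    return sum(SOURCE_POINTS[source] for source in filled["sources"].values())


def screen(records, reference, min_score):
    """Gap-fill all records, then keep those scoring at least min_score."""
    filled = [fill_gaps(record, reference) for record in records]
    return [record for record in filled if pchem_score(record) >= min_score]


if __name__ == "__main__":
    for threshold in (1, 3, 6):
        kept = screen(records, reference, threshold)
        print(threshold, [record["material"] for record in kept])
```

Raising `min_score` keeps fewer but better-characterized records. This mirrors the trade-off reported in the abstract, where the dataset with the highest score gave the best predictivity at the cost of a narrower applicability domain.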

Predicting and investigating cytotoxicity of nanoparticles by translucent machine learning

Predicting toxic potencies of metal oxide nanoparticles by means of nano-QSARs

Predicting the Toxicities of Metal Oxide Nanoparticles Based on Support Vector Regression with a Residual Bootstrapping Method.

Machine learning-enabled nanosafety assessment of multi-metallic alloy nanoparticles modified TiO2 system

Evaluating metal oxide nanoparticle (MeOx NP) toxicity with different types of nano descriptors mainly focusing on simple periodic table-based descriptors: A mini-review

Nano(Q)SAR: Challenges, Pitfalls and Perspectives

Development of Generalized QSAR Models for Predicting Cytotoxicity and Genotoxicity of Metal Oxides Nanoparticles

Nano-read-across predictions of toxicity of metal oxide engineered nanoparticles (MeOx ENPS) used in nanopesticides to BEAS-2B and RAW 264.7 cells

Development of structure–activity relationship for metal oxide nanoparticles

Cytotoxicity prediction of nano metal oxides on different lung cells via Nano-QSAR

Application of Machine Learning in Nanotoxicology: A Critical Review and Perspective

Based on the Nano-QSAR model: Prediction of factors influencing damage to C. elegans caused by metal oxide nanomaterials and validation of toxic effects

Nano-QSAR Modeling for Predicting the Cytotoxicity of Metal Oxide Nanoparticles Using Novel Descriptors

Building species trait-specific nano-QSARs: Model stacking, navigating model uncertainties and limitations, and the effect of dataset size

Computer-aided Nanotoxicology: Risk Assessment of Metal Oxide Nanoparticles Via Nano-Qsar

Machine Learning Models for Predicting Cytotoxicity of Nanomaterials.

Quantitative Structure-activity Relationships; Studying the Toxicity of Metal Nanoparticles

Toxicity profiling of engineered nanomaterials via multivariate dose-response surface modeling

Using the Isalos platform to develop a (Q)SAR model that predicts metal oxide toxicity utilizing facet-based electronic, image analysis-based, and periodic table derived properties as descriptors

Applying quantitative structure–activity relationship approaches to nanotoxicology: Current status and future potential