Abstract: Quantitative structure-activity relationship (QSAR) models have long been used for making predictions and filling data gaps in diverse fields, including medicinal chemistry, predictive toxicology, environmental fate modeling, materials science, agricultural science, nanoscience, and food science. A QSAR model is usually developed from the chemical information of a properly designed training set and the corresponding experimental response data, and it is validated using one or more test sets for which experimental response data are available. However, it is also important to estimate the reliability of predictions when the model is applied to a completely new data set (a true external set), even when the new data points lie within the applicability domain (AD) of the developed model. In the present study, we categorized the quality of predictions for the test set or true external set into three groups (good, moderate, and bad) based on absolute prediction errors. We then combined three criteria [(a) the mean absolute error of leave-one-out predictions for the 10 closest training compounds for each query molecule; (b) the AD in terms of similarity based on the standardization approach; and (c) the proximity of the predicted value of the query compound to the mean training response] under different weighting schemes to produce a composite score of predictions. Using the most frequently appearing weighting scheme, 0.5-0-0.5, the composite score-based categorization agreed with the absolute prediction error-based categorization for more than 80% of the test data points across five different datasets, with 15 models for each dataset derived using three different splitting techniques. These observations were also confirmed with true external sets for another four endpoints, suggesting that the scheme can be applied to judge the reliability of predictions for new datasets.
The scheme has been implemented in a tool, "Prediction Reliability Indicator", available at http://dtclab.webs.com/software-tools and http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab/. The tool is presently valid for multiple linear regression models only.
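
The way the three criteria could be combined into a composite score can be illustrated with a short sketch. This is not the implementation used in the Prediction Reliability Indicator tool: the abstract does not give the scoring rules, so everything beyond the three criteria, the 10-neighbour setting and the 0.5-0-0.5 weights is an assumption made for illustration. In particular, the function names (`grade`, `composite_score`, `categorize`), the Euclidean distance used to find the closest training compounds, the 1-to-3 grading of each criterion, all numeric cut-offs, and the reduction of the standardization-based AD to a single maximum standardized descriptor value are hypothetical.

```python
import numpy as np


def grade(value, good_max, moderate_max):
    """Map a 'smaller is better' quantity to 3 (good), 2 (moderate) or 1 (bad).

    The cut-offs are hypothetical; the abstract does not state them.
    """
    if value <= good_max:
        return 3
    if value <= moderate_max:
        return 2
    return 1


def composite_score(x_query, y_pred_query, X_train, y_train, loo_pred_train,
                    weights=(0.5, 0.0, 0.5), k=10):
    """Weighted combination of three per-criterion grades for one query compound."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    loo_pred_train = np.asarray(loo_pred_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # (a) Mean absolute error of leave-one-out predictions for the k closest
    #     training compounds (Euclidean distance is an assumption).
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dist)[:k]
    local_mae = np.mean(np.abs(y_train[nearest] - loo_pred_train[nearest]))
    y_range = y_train.max() - y_train.min()
    score_a = grade(local_mae / y_range, 0.10, 0.20)  # hypothetical cut-offs

    # (b) Simplified stand-in for the standardization-based AD: the largest
    #     absolute standardized descriptor value of the query compound.
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    sd = np.where(sd == 0, 1.0, sd)  # guard against constant descriptors
    z_max = np.abs((x_query - mu) / sd).max()
    score_b = grade(z_max, 3.0, 4.0)  # hypothetical cut-offs

    # (c) Proximity of the predicted value to the mean training response,
    #     expressed in units of the training response standard deviation.
    deviation = abs(y_pred_query - y_train.mean()) / y_train.std(ddof=1)
    score_c = grade(deviation, 1.0, 2.0)  # hypothetical cut-offs

    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, [score_a, score_b, score_c]) / w.sum())


def categorize(score):
    """Translate a composite score into a category (hypothetical thresholds)."""
    if score >= 2.5:
        return "good"
    if score >= 1.5:
        return "moderate"
    return "bad"
```

With the default weights, criterion (b) contributes nothing to the score, which mirrors the 0.5-0-0.5 scheme reported above; passing a different `weights` tuple shows how other schemes would shift the categorization.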

Applicability Domains for Classification Problems: Benchmarking of Distance to Models for Ames Mutagenicity Set

Improvement of quantitative structure–activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project

Rethinking the applicability domain analysis in QSAR models

In Silico Prediction of Chemical Ames Mutagenicity

The enhancement scheme for the predictive ability of QSAR: A case of mutagenicity

An In-Depth Re-Evaluation of Models from the 1st and 2nd Ames/QSAR International Challenge Projects

How Precise Are Our Quantitative Structure-Activity Relationship Derived Predictions for New Query Chemicals?

Validating ADME QSAR Models Using Marketed Drugs

QSAR Modeling: Where Have You Been? Where Are You Going To?

Chemical rules for optimization of chemical mutagenicity via matched molecular pairs analysis and machine learning methods

Why QSAR fails: an empirical evaluation using conventional computational approach

The system of self-consistent models for pesticide toxicity to Daphnia magna

Quantum mechanical quantitative structure-activity relationships to avoid mutagenicity

Development and Evaluation of Conformal Prediction Methods for QSAR

Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets

A novel procedure for selection of molecular descriptors: QSAR model for mutagenicity of nitroaromatic compounds

Predictive QSAR Models for Polyspecific Drug Targets: the Importance of Feature Selection

Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process

Strategy proposal using QSAR models to approach mutagenicity assessment of non intentionally added substances in recycled plastic resins

Consensus QSAR models estimating acute toxicity to aquatic organisms from different trophic levels: algae, Daphnia and fish

One size does not fit all: revising traditional paradigms for QSAR-based virtual screenings