QSAR Models Using a Large Diverse Set of Estrogens

LM Shi, H Fang, WD Tong, J Wu, R Perkins, RM Blair, WS Branham, SL Dial, CI Moland, DM Sheehan
DOI: https://doi.org/10.1021/ci000066d
2001-01-01
Journal of Chemical Information and Computer Sciences
Abstract: Endocrine disruptors (EDs) have a variety of adverse effects in humans and animals. About 58,000 chemicals, most having little safety data, must be tested in a group of tiered assays. Because these assays will take years, it is important to develop rapid methods to help in priority setting. For application to large data sets, we have developed an integrated system of four sequential phases to predict the ability of chemicals to bind to the estrogen receptor (ER), a prevalent mechanism for estrogenic EDs. Here we report the results of evaluating two types of QSAR models for inclusion in phase III to quantitatively predict chemical binding to the ER. Our data set of relative binding affinities (RBAs) to the ER consists of 130 chemicals covering a wide range of structural diversity and RBAs spanning 6 orders of magnitude. CoMFA and HQSAR models were constructed and their performance compared. The CoMFA model had an r² = 0.91 and a q²(LOO) = 0.66. HQSAR showed reduced performance compared to CoMFA, with r² = 0.76 and q²(LOO) = 0.59. A number of parameters were examined to improve the CoMFA model. Of these, a phenol indicator increased the q²(LOO) to 0.71. When up to 50% of the chemicals were left out in the leave-N-out cross-validation, the q² remained significant. Finally, the models were tested using two test sets; the q²(pred) values for these were 0.71 and 0.62, a significant result that demonstrates the utility of the CoMFA model for predicting the RBAs of chemicals not included in the training set. If used in conjunction with phases I and II, which reduced the size of the data set dramatically by eliminating most inactive chemicals, the current CoMFA model (phase III) can be used to predict the RBA of chemicals with sufficient accuracy and to provide quantitative information for priority setting.
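The abstract reports both a fitted r² and a leave-one-out cross-validated q² for each model. As a rough illustration of how the two statistics differ, the sketch below computes both using one common definition, q² = 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each chemical by a model trained without it. It is not the authors' procedure: ordinary least squares stands in for the PLS regression normally used with CoMFA, the descriptors and activities are synthetic, and the function names are hypothetical.

```python
import numpy as np


def fit_predict(X_train, y_train, X_test):
    """Fit least squares with an intercept and predict for X_test.

    Assumption: plain OLS is a stand-in for PLS, which is not used here.
    """
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_test)), X_test])
    return B @ coef


def r2(X, y):
    """Conventional r^2: every compound is used for fitting and scoring."""
    resid = y - fit_predict(X, y, X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS (one common definition)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # train on all compounds except i
        pred = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in data: 130 "chemicals", 3 made-up descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(130, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=1.0, size=130)

print("r2     =", round(r2(X, y), 3))
print("q2_LOO =", round(q2_loo(X, y), 3))
```

Because each left-out compound is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is consistent with the gap between the two values quoted for the CoMFA and HQSAR models.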