Findings of the Second Challenge to Predict Aqueous Solubility
Antonio Llinas, Ioana Oprisiu, Alex Avdeef
DOI: https://doi.org/10.1021/acs.jcim.0c00701
IF: 6.162
2020-08-14
Journal of Chemical Information and Modeling
Abstract: Ten years ago, we issued an open prediction challenge to the cheminformatics community: would participants be able to predict the equilibrium intrinsic solubilities of 32 druglike molecules using only a high-precision set of 100 compounds (measured with the CheqSol instrument in one laboratory) as a training set? The "solubility challenge" was a widely recognized success and spurred many discussions about prediction methods and data quality. We recently revisited the competition and posed a different challenge to the community: not a blind test this time, but one using a larger test set of molecules, gathered and curated from published sources (mostly "gold standard" saturation shake-flask measurements), where the average interlaboratory reproducibility for the molecules was estimated to be ∼0.17 log unit. A second test set was also included, comprising "contentious" molecules whose reported (mostly shake-flask) solubility had higher average uncertainty, ∼0.62 log unit. In the second competition, participants were invited to use their own training sets, provided that these did not contain any of the test set molecules. We were motivated to revisit the competition to (1) examine to what extent computational methods had improved in 10 years, (2) verify that data quality may not be the main limiting factor in the accuracy of the prediction method, and (3) attempt to seek a relationship between the makeup of the training set data and the prediction outcome.
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.0c00701: SC-2 data and results (XLSX).
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems
What problem does this paper attempt to address?