Expanded ensemble predictions of toluene--water partition coefficients in the SAMPL9 LogP challenge
Vincent Voelz,Steven Goold,Robert M. Raddi
DOI: https://doi.org/10.26434/chemrxiv-2024-rfkkp
2024-09-21
Abstract: The logarithm of the partition coefficient (logP) between water and a nonpolar solvent is useful for characterizing a small molecule's hydrophobicity. For example, the water-octanol logP is often used as a predictor of a drug’s lipophilicity and/or membrane permeability, good indicators of its bioavailability. Existing computational predictors of water-octanol logP are generally very accurate due to the wealth of experimental measurements, but may be less so for other non-polar solvents such as toluene. In this work, we participate in a Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) logP challenge to examine the accuracy of a molecular simulation-based absolute free energy approach to predict water-toluene logP in a blind test for sixteen drug-like compounds with acid-base properties. Our simulation workflow used the OpenFF 2.0.0 force field, and an expanded ensemble (EE) method for free energy estimation, which enables efficient parallelization over multiple distributed computing clients for enhanced sampling. The EE method uses Wang-Landau flat-histogram sampling to estimate the free energy of decoupling in each solvent, and can be performed in a single simulation. Our protocol also includes a step to optimize the schedule of alchemical intermediates in each decoupling. The results show that our EE workflow is able to accurately predict free energies of transfer, achieving an RMSD of 2.26 kcal/mol, and $R^2$ of 0.80. An examination of outliers suggests that improved force field parameters could achieve better accuracy. Overall, our results suggest that expanded ensemble free energy calculations provide accurate first-principles logP prediction.
Chemistry
What problem does this paper attempt to address?
This paper aims to predict, in the blind SAMPL9 LogP challenge, the logarithm of the toluene-water partition coefficient (logP) for sixteen drug-like compounds with acid-base properties, using molecular simulation. The study combined the OpenFF 2.0.0 force field with an expanded ensemble (EE) free energy method. EE uses Wang-Landau flat-histogram sampling to estimate the free energy of decoupling the solute in each solvent within a single simulation, and it parallelizes efficiently across distributed computing clients.
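To make the flat-histogram idea concrete, here is a minimal toy sketch of a Wang-Landau update on a discrete ladder of alchemical states. It is not the authors' implementation: the function name `wang_landau_toy`, the state free energies, and all parameters are hypothetical, and each state's free energy is assumed to be known in advance so the answer can be checked, whereas a real simulation samples energy differences from molecular configurations.

```python
import math
import random


def wang_landau_toy(f_true, delta0=1.0, delta_min=1e-3, flatness=0.8,
                    check_every=1000, max_steps=2_000_000, seed=0):
    """Estimate relative free energies of discrete states by Wang-Landau.

    f_true: dimensionless free energies of the toy states (assumed known).
    Returns estimates of f_i - f_0 recovered from the learned bias weights.
    """
    rng = random.Random(seed)
    n = len(f_true)
    g = [0.0] * n      # bias weights; converge to f_i up to a constant
    hist = [0] * n     # visit counts for the current stage
    state = 0
    delta = delta0     # Wang-Landau increment, halved when the histogram is flat

    for step in range(1, max_steps + 1):
        # Propose a move to a neighboring state; off-ladder moves are rejected.
        trial = state + rng.choice((-1, 1))
        if 0 <= trial < n:
            # Biased sampling weight of state i is exp(g_i - f_i).
            log_acc = (g[trial] - f_true[trial]) - (g[state] - f_true[state])
            if log_acc >= 0 or rng.random() < math.exp(log_acc):
                state = trial

        # Penalize the visited state so under-sampled states become favored.
        g[state] -= delta
        hist[state] += 1

        if step % check_every == 0:
            mean = sum(hist) / n
            if min(hist) > flatness * mean:
                delta /= 2.0
                hist = [0] * n
                if delta < delta_min:
                    break

    return [gi - g[0] for gi in g]


if __name__ == "__main__":
    toy_f = [0.0, 1.5, 3.0, 2.0, -1.0]  # hypothetical values, not from the paper
    print(wang_landau_toy(toy_f))
```

Once the histogram is flat, every state is visited equally often, which can only happen if the bias weights cancel the free energy differences; the difference between the last and first weights is then the decoupling free energy estimate.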
The workflow comprised three steps:
1. **System Preparation**: OpenEye tools were used to convert the SMILES strings provided by SAMPL9 into 3D chemical structures and to build molecular topologies.
2. **Simulation Execution**: Extensive EE simulations were run on the Folding@home platform, with analysis performed on the high-performance computing cluster at Temple University.
3. **Data Analysis**: The simulation results were analyzed statistically to obtain estimates of the transfer free energy.
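The last step connects the decoupling free energies to logP. As a minimal sketch, assuming free energies in kcal/mol, a temperature of 298.15 K, and the convention that logP is positive when the solute prefers toluene (the function name, the temperature, and the sign convention are assumptions, not details taken from the paper):

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def logp_from_decoupling(dg_decouple_water, dg_decouple_toluene,
                         temperature=298.15):
    """Toluene-water logP from decoupling free energies (kcal/mol).

    Decoupling removes the solute's interactions with the solvent, so the
    solvation free energy is the negative of the decoupling free energy.
    """
    # Transfer free energy, water -> toluene:
    # dG_transfer = dG_solv(toluene) - dG_solv(water)
    #             = dG_decouple(water) - dG_decouple(toluene)
    dg_transfer = dg_decouple_water - dg_decouple_toluene
    # A negative transfer free energy means the solute prefers toluene.
    return -dg_transfer / (R_KCAL * temperature * math.log(10))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(logp_from_decoupling(10.0, 12.0))
```

At 298.15 K one logP unit corresponds to about 1.36 kcal/mol of transfer free energy, which is useful when comparing errors quoted in kcal/mol with errors quoted in log units.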
The results show that the EE workflow accurately predicts water-toluene free energies of transfer, with a root mean square deviation (RMSD) of 2.26 kcal/mol and a coefficient of determination (R²) of 0.80. For some molecules (such as fluoxetine, quinine, and trazodone), the predictions deviated significantly from experiment, which may be due to inaccurate force field parameters for tertiary amines. Overall, the study suggests that expanded ensemble free energy calculations are an effective way to predict logP from first principles.
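For readers who want to reproduce this kind of comparison on their own predictions, the two reported metrics can be sketched as follows. R² is taken here as the squared Pearson correlation, which equals the coefficient of determination of a least-squares line fit; whether the paper uses exactly this definition is an assumption, and the function names and example numbers are hypothetical.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between paired values."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, experimental):
    """Squared Pearson correlation between paired values."""
    if len(predicted) < 2 or len(predicted) != len(experimental):
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e)
              for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov * cov / (var_p * var_e)


if __name__ == "__main__":
    # Synthetic numbers for illustration only, not data from the paper.
    pred = [1.0, 2.0, 3.0]
    expt = [1.0, 2.0, 4.0]
    print(rmsd(pred, expt), r_squared(pred, expt))
```

The two metrics answer different questions: RMSD measures absolute error, while R² measures how well predictions track the experimental trend, so a systematic offset can leave R² high while RMSD is large.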