Machine Learning Prediction of Hydration Free Energy with Physically Inspired Descriptors.

Zhan Zhang, Ding Peng, Lihong Liu, W. Fang, Lin Shen
DOI: https://doi.org/10.1021/acs.jpclett.2c03858
2023-02-13
Abstract: We present machine learning models for predicting experimental hydration free energies of molecules without any atom-, bond-, or geometry-specific input feature. Four types of physically inspired descriptors are adopted for predictions. The first type is composed of the total dipole moment, anisotropic polarizability, and vibrational analysis results of the solute molecule. The second and third types are derived from the electrostatic potential distribution of the solute. The last type includes the solvent accessible surface area and shape similarities. Several machine learning regression models are built on the FreeSolv database with ∼600 samples, and they perform better than most traditional approaches and other prediction methods based on molecular fingerprints. In particular, the present descriptors are capable of predicting hydration free energies of new compounds containing elements or fragments that never appear in the training set. The importance of these descriptors, the impact of dissociation energies of specific covalent bonds, and the outliers with relatively large prediction errors are also discussed.
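As an illustration of what a molecule-level descriptor of the first type looks like, the sketch below computes a total dipole moment magnitude from point charges, using the textbook relation μ = Σ q_i r_i. This is only an assumption-laden stand-in: the abstract does not say how the dipole moment is obtained, the function name `dipole_moment_debye` is hypothetical, and the two-charge input is an artificial example rather than data from the paper.

```python
import math

# Conversion factor: 1 e*angstrom expressed in debye.
E_ANGSTROM_TO_DEBYE = 4.80320


def dipole_moment_debye(atoms):
    """Return the dipole moment magnitude in debye.

    atoms: iterable of (charge_in_e, (x, y, z)) with coordinates in angstrom.
    Point-charge approximation (an assumption of this sketch). For a neutral
    system the result does not depend on the choice of origin.
    """
    mx = my = mz = 0.0
    for charge, (x, y, z) in atoms:
        # Accumulate the vector sum of charge times position.
        mx += charge * x
        my += charge * y
        mz += charge * z
    return math.sqrt(mx * mx + my * my + mz * mz) * E_ANGSTROM_TO_DEBYE


# Artificial example: opposite unit charges separated by 1 angstrom.
toy_system = [(+1.0, (0.0, 0.0, 0.0)), (-1.0, (1.0, 0.0, 0.0))]
mu = dipole_moment_debye(toy_system)
```

A scalar of this kind carries no atom-, bond-, or geometry-specific label, which is the property the abstract emphasizes for its input features.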
Chemistry, Materials Science, Medicine, Computer Science