Intermolecular Non-Bonded Interactions from Machine Learning Datasets

Jia-An Chen, Sheng D. Chao
DOI: https://doi.org/10.3390/molecules28237900
IF: 4.6
2023-12-01
Molecules
Abstract: Accurate determination of intermolecular non-covalent-bonded or non-bonded interactions is the key to potentially useful molecular dynamics simulations of polymer systems. However, it is challenging to balance both the accuracy and computational cost in force field modelling. One of the main difficulties is properly representing the calculated energy data as a continuous force function. In this paper, we employ well-developed machine learning techniques to construct a general purpose intermolecular non-bonded interaction force field for organic polymers. The original ab initio dataset SOFG-31 was calculated by us and has been well documented, and here we use it as our training set. The CLIFF kernel type machine learning scheme is used for predicting the interaction energies of heterodimers selected from the SOFG-31 dataset. Our test results show that the overall errors are well below the chemical accuracy of about 1 kcal/mol, thus demonstrating the promising feasibility of machine learning techniques in force field modelling.
chemistry, multidisciplinary; biochemistry & molecular biology
### What problem does this paper attempt to solve?

The main goal of this paper is to use machine learning techniques to construct an intermolecular non-bonded interaction force field for organic polymers. Specifically, the authors address the following issues:

1. **Balancing accuracy and computational efficiency**: Accurately determining intermolecular non-covalent interactions is crucial for molecular dynamics simulations of polymer systems, but it is difficult in force field modeling to achieve high accuracy at low computational cost.
2. **Energy data representation**: Representing computed energy data as a continuous force function is a major challenge. Traditional methods often fit the discrete data to a pre-selected functional form, which can limit predictive capability.
3. **Effective use of training data**: The authors use the SOFG-31 dataset, which they previously computed with ab initio quantum chemistry methods, as the training set, and employ the CLIFF kernel-based machine learning scheme to predict the interaction energies of heterodimers selected from that dataset.
4. **Validating the feasibility of machine learning in force field modeling**: The test results show overall errors well below chemical accuracy (approximately 1 kcal/mol), demonstrating the feasibility of machine learning techniques in force field modeling.

Through this research, the paper aims to show that machine learning techniques can accurately describe intermolecular non-bonded interactions and to provide new methods for future research.
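To make the idea of a kernel-based fit to discrete energy data concrete, here is a minimal sketch of Gaussian kernel ridge regression on a one-dimensional dimer energy curve. It is not the CLIFF scheme and does not use the SOFG-31 data: the Lennard-Jones curve, its parameters, the kernel length scale, the ridge value, and all function names are assumptions chosen purely for illustration.

```python
import math

CHEMICAL_ACCURACY = 1.0  # kcal/mol, the threshold quoted in the abstract


def lj_energy(r, eps=0.25, sigma=3.4):
    """Toy Lennard-Jones energy (kcal/mol); an assumed stand-in for ab initio data."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)


def gaussian_kernel(a, b, length=0.3):
    """Gaussian similarity between two separations (length scale is an assumption)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(r_train, e_train, ridge=1e-8):
    """Kernel ridge regression: solve (K + ridge * I) alpha = E for the weights alpha."""
    n = len(r_train)
    k = [[gaussian_kernel(r_train[i], r_train[j]) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(k, e_train)


def predict(r, r_train, alpha):
    """Continuous energy estimate at separation r as a weighted sum of kernels."""
    return sum(a * gaussian_kernel(r, rt) for a, rt in zip(alpha, r_train))


# Discrete "computed" energies on a grid of separations (angstrom).
r_train = [3.4 + 0.2 * i for i in range(24)]
e_train = [lj_energy(r) for r in r_train]
alpha = fit(r_train, e_train)

# Evaluate at midpoints that were not part of the training grid.
r_test = [r + 0.1 for r in r_train[:-1]]
errors = [abs(predict(r, r_train, alpha) - lj_energy(r)) for r in r_test]
mae = sum(errors) / len(errors)

# Trivial baseline: always predict the mean training energy.
mean_e = sum(e_train) / len(e_train)
baseline_mae = sum(abs(mean_e - lj_energy(r)) for r in r_test) / len(r_test)

print(f"kernel MAE   = {mae:.5f} kcal/mol")
print(f"baseline MAE = {baseline_mae:.5f} kcal/mol")
print("below chemical accuracy:", mae < CHEMICAL_ACCURACY)
```

The fitted model is a smooth function of the separation, so it can be evaluated (and differentiated for forces) between the sampled points without choosing a functional form in advance. A real application would replace the scalar separation with a descriptor of the full dimer geometry and would validate on held-out dimers rather than on a synthetic curve.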