Revealing the Impact of Aggregations in the Graph-based Molecular Machine Learning: Electrostatic Interaction versus Pooling Methods

Sanghoon Lee, Hyun Woo Kim
DOI: https://doi.org/10.26434/chemrxiv-2024-sbwxm
2024-11-29
Abstract: Molecular structures are readily represented as graphs whose nodes are the constituent atoms and whose edges are the chemical bonds, and such graphs can serve as input to well-known machine learning (ML) models such as graph neural networks (GNNs). GNNs have shown reasonable performance in predicting the properties of chemical systems. In typical applications of GNNs to chemistry-related fields, the main objective is to create an optimal molecular representation by aggregating atomic features and pooling features in the graph. In this study, we investigated two approaches that could generate better molecular representations. First, we created intermolecular edges to predict the photochemical properties of chromophore molecules in solution. These intermolecular edges were constructed using atomic partial charges, inspired by the fact that electrostatic interaction is the main component of the solute-solvent interaction. Second, we investigated the effect of the aggregation and pooling functions. The results showed that intermolecular electrostatic interactions based on ground-state charges prevent the GNN model from generating more effective molecular representations. In contrast, the model performed better when averaging and adding operations were employed in a hybrid manner for the aggregation and pooling functions.
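To make the distinction between aggregation (combining neighbor features at each atom) and pooling (combining atom features into one molecular vector) concrete, the following is a minimal sketch in plain Python. It is not the authors' model: the function names `aggregate` and `pool`, the toy three-atom graph, the residual-style update, and the choice of mean aggregation with sum pooling as the "hybrid" combination are all assumptions made for illustration, and no learned weights are included.

```python
from typing import Dict, List

Vector = List[float]


def aggregate(features: List[Vector], neighbors: Dict[int, List[int]], mode: str) -> List[Vector]:
    """One message-passing step: add the summed or averaged neighbor features to each node."""
    if mode not in ("sum", "mean"):
        raise ValueError("mode must be 'sum' or 'mean'")
    updated = []
    for i, feat in enumerate(features):
        nbrs = neighbors.get(i, [])
        # Element-wise sum of the neighbor feature vectors.
        message = [sum(features[j][k] for j in nbrs) for k in range(len(feat))]
        if mode == "mean" and nbrs:
            message = [m / len(nbrs) for m in message]
        # Hypothetical update rule: node feature plus aggregated message.
        updated.append([f + m for f, m in zip(feat, message)])
    return updated


def pool(features: List[Vector], mode: str) -> Vector:
    """Readout: collapse all node features into a single graph-level vector."""
    if mode not in ("sum", "mean"):
        raise ValueError("mode must be 'sum' or 'mean'")
    total = [sum(f[k] for f in features) for k in range(len(features[0]))]
    if mode == "mean":
        total = [t / len(features) for t in total]
    return total


# Toy chain graph 0-1-2 with one scalar feature per atom (illustrative values only).
atom_features = [[1.0], [2.0], [3.0]]
bonds = {0: [1], 1: [0, 2], 2: [1]}

hybrid = pool(aggregate(atom_features, bonds, "mean"), "sum")    # mean aggregation, sum pooling
all_sum = pool(aggregate(atom_features, bonds, "sum"), "sum")    # sum aggregation, sum pooling
all_mean = pool(aggregate(atom_features, bonds, "mean"), "mean") # mean aggregation, mean pooling
print(hybrid, all_sum, all_mean)
```

Even on this toy graph the three combinations give different molecular vectors, which is the kind of sensitivity to aggregation and pooling choices that the study examines. Which operation is assigned to which stage in the reported hybrid setting is not stated in the abstract.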
Chemistry