Abstract: Since the introduction of Graph Neural Networks (GNNs), molecular graphs have become useful tools in chemical informatics. However, in property prediction tasks, graph embeddings often still resemble traditional fingerprints. Here, we propose a straightforward approach to provide modern GNNs with raw quantum-chemical data, enabling efficient solutions to a range of chemical machine-learning problems. The central role is played by the one-electron density matrix obtained from quantum-chemical calculations (e.g. Hartree-Fock, DFT). The diagonal blocks of the density matrix are used as embeddings for the atomic nodes (“atoms”) in the molecular graph. Unlike conventional molecular graph representations, the chemical bond concept is not used. Instead, an additional set of nodes (“links”) between pairs of atoms is introduced. Their embeddings are the off-diagonal blocks of the density matrix belonging to the respective atom pairs. Directed graph edges connect “atoms” to “links” and “links” to “atoms”. The embeddings of the edges are derived from the basis set overlap matrix. The overlaps serve two purposes: first, they encode structural information such as distances and angles; second, they act as weights in pooling operations. The element-wise multiplication of densities and overlaps is inspired by the Mulliken population analysis scheme. The proposed concept was tested on the Solubility Challenge (2008) by Llinàs et al. (DOI: 10.1021/ci800058v). A GNN was trained on a small dataset comprising 94 aqueous solubilities of drug-like molecules and subsequently used to predict the aqueous solubilities of 28 test molecules. The model achieved an RMSE of 0.68 and an R² of 0.76, outperforming all methods proposed at that time. In our view, this is a promising approach, particularly since the proposed architecture appears to reach state-of-the-art accuracy even in a preliminary test.
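The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`block`, `weighted_sum`, `build_graph`), the dictionary layout, the use of ordered atom pairs for “links”, and the toy H2 input with an overlap of 0.66 are all assumptions made for illustration. In particular, the sketch stores the overlap block next to each density block, whereas the abstract places overlap-derived embeddings on the edges. Diagonal blocks of the density matrix become “atom” nodes, off-diagonal blocks become “link” nodes, and the Mulliken-style pooling is the summed element-wise product of a density block with the matching overlap block.

```python
def block(matrix, rows, cols):
    """Extract the sub-block matrix[rows, cols] from a nested-list matrix."""
    return [[matrix[r][c] for c in cols] for r in rows]


def weighted_sum(density_block, overlap_block):
    """Sum of the element-wise product of a density block and an overlap block."""
    return sum(
        p * s
        for p_row, s_row in zip(density_block, overlap_block)
        for p, s in zip(p_row, s_row)
    )


def build_graph(P, S, atom_basis):
    """Split density matrix P and overlap matrix S into per-atom and per-pair blocks.

    atom_basis[a] lists the basis-function indices centred on atom a.
    Returns (atoms, links): diagonal blocks keyed by atom index and
    off-diagonal blocks keyed by ordered atom pair (an assumption).
    """
    atoms, links = {}, {}
    for a, idx_a in enumerate(atom_basis):
        for b, idx_b in enumerate(atom_basis):
            node = {
                "density": block(P, idx_a, idx_b),
                "overlap": block(S, idx_a, idx_b),
            }
            if a == b:
                atoms[a] = node       # "atom" node: diagonal block
            else:
                links[(a, b)] = node  # "link" node: off-diagonal block
    return atoms, links


# Toy input (hypothetical): H2 in a minimal basis, one basis function per atom.
s = 0.66                  # illustrative overlap value, not taken from the source
S = [[1.0, s], [s, 1.0]]
p = 1.0 / (1.0 + s)       # doubly occupied bonding orbital: P = 2 c c^T
P = [[p, p], [p, p]]
atom_basis = [[0], [1]]

atoms, links = build_graph(P, S, atom_basis)
atom_pop = {a: weighted_sum(n["density"], n["overlap"]) for a, n in atoms.items()}
link_pop = {ab: weighted_sum(n["density"], n["overlap"]) for ab, n in links.items()}
total_electrons = sum(atom_pop.values()) + sum(link_pop.values())
```

For this toy input, the atom and link populations add up to the two electrons of H2, which is the Mulliken identity Tr(PS) = N. In a real workflow, P, S, and the assignment of basis functions to atoms would come from a quantum-chemistry package, and a GNN would consume the blocks themselves rather than their scalar sums.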

Gini in a Bottleneck: Sparse Molecular Representations for Graph Convolutional Neural Networks

Discovering the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions

MMGNN: A Molecular Merged Graph Neural Network for Explainable Solvation Free Energy Prediction

On the Scalability of GNNs for Molecular Graphs

Neural Mulliken Analysis: Molecular Graphs from Density Matrices for QSPR on Raw Quantum-Chemical Data

Unveiling Molecular Moieties through Hierarchical Graph Explainability

Utilizing Edge Features in Graph Neural Networks Via Variational Information Maximization

HiGNN: Hierarchical Informative Graph Neural Networks for Molecular Property Prediction Equipped with Feature-Wise Attention

Interpreting Graph Neural Networks with Myerson Values for Cheminformatics Approaches

Molecular Hypergraph Neural Networks

Molecular Graph Generation via Geometric Scattering

Graph Neural Network-State Predictive Information Bottleneck (GNN-SPIB) approach for learning molecular thermodynamics and kinetics

Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction

Graph neural processes for molecules: an evaluation on docking scores and strategies to improve generalization

Image-Like Graph Representations for Improved Molecular Property Prediction

Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling

Accelerating Molecular Graph Neural Networks via Knowledge Distillation

Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications

Molecular Graph Representation Learning via Structural Similarity Information

Quantitative evaluation of explainable graph neural networks for molecular property prediction

On the Interplay of Subset Selection and Informed Graph Neural Networks