A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems

Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein
2024-03-14
Abstract: Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations. In recent years, Geometric Graph Neural Networks have emerged as the preferred machine learning architecture powering applications ranging from protein structure prediction to molecular simulations and material generation. Their specificity lies in the inductive biases they leverage, such as physical symmetries and chemical properties, to learn informative representations of these geometric graphs. In this opinionated paper, we provide a comprehensive and self-contained overview of the field of Geometric GNNs for 3D atomic systems. We cover fundamental background material and introduce a pedagogical taxonomy of Geometric GNN architectures: (1) invariant networks, (2) equivariant networks in Cartesian basis, (3) equivariant networks in spherical basis, and (4) unconstrained networks. Additionally, we outline key datasets and application areas and suggest future research directions. The objective of this work is to present a structured perspective on the field, making it accessible to newcomers and aiding practitioners in gaining an intuition for its mathematical abstractions.
Machine Learning, Artificial Intelligence, Quantitative Methods
What problem does this paper attempt to address?
The paper surveys Geometric Graph Neural Networks (Geometric GNNs) for 3D atomic systems such as molecules, proteins, and materials. These systems are modelled as geometric graphs: atoms are nodes embedded in 3D Euclidean space, and their geometric attributes, such as position, velocity, or force, transform under the system's physical symmetries, namely rotations and translations in Euclidean space and permutations of the nodes. The paper aims to give a comprehensive, self-contained overview of the field, covering background material, a taxonomy of architectures, and future research directions. Its pedagogical taxonomy groups Geometric GNN architectures into four families: invariant networks, equivariant networks in a Cartesian basis, equivariant networks in a spherical basis, and unconstrained networks. It also outlines key datasets and application areas, including protein structure prediction, molecular simulation, and material generation. Finally, it highlights open questions for future research: to what extent physical symmetries and other inductive biases should be built into Geometric GNNs, how geometric graphs should be constructed, and how Geometric GNNs can be scaled to larger problems.
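To make the symmetry language above concrete, here is a minimal sketch, not taken from the paper, of a geometric graph built from atom coordinates. It assumes a simple distance-cutoff rule for edges; the helper names (`rotate_z`, `translate`, `radius_graph`), the toy coordinates, and the cutoff value are all hypothetical choices for illustration. It checks two properties: interatomic distances, and hence the cutoff graph, are unchanged when the whole system is rotated and translated (invariance), while relative position vectors rotate along with the system (equivariance).

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point about the z-axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def translate(p, t):
    """Shift a 3D point by the vector t."""
    return tuple(a + b for a, b in zip(p, t))

def radius_graph(positions, cutoff):
    """Return (i, j, distance) for every ordered pair i != j within cutoff."""
    edges = []
    for i, pi in enumerate(positions):
        for j, pj in enumerate(positions):
            if i != j:
                d = math.dist(pi, pj)
                if d <= cutoff:
                    edges.append((i, j, d))
    return edges

# Toy coordinates (hypothetical, loosely water-like); cutoff is an assumption.
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
theta, shift = 0.7, (1.0, -2.0, 3.0)
moved = [translate(rotate_z(p, theta), shift) for p in positions]

# Invariance: same edges and same distances before and after the transform.
edges_a = radius_graph(positions, cutoff=1.2)
edges_b = radius_graph(moved, cutoff=1.2)

# Equivariance: the relative vector from atom 0 to atom 1 is rotated by the
# same rotation as the system; the translation cancels in the difference.
rel_before = tuple(b - a for a, b in zip(positions[0], positions[1]))
rel_after = tuple(b - a for a, b in zip(moved[0], moved[1]))
rotated_rel = rotate_z(rel_before, theta)
```

In the taxonomy's terms, a network that consumes only quantities like the distances in `edges_a` is an invariant network, whereas one that also propagates vectors like `rel_before` must transform them consistently, which is what the equivariant families are designed to do.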