Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković
2021-05-03
Abstract: The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a 'geometric unification' endeavour, in the spirit of Felix Klein's Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provides a principled way to build future architectures yet to be invented.
Machine Learning, Artificial Intelligence, Computational Geometry, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper sets out the core concepts and principles of Geometric Deep Learning. Its central question is how to build geometric structure and symmetry into deep learning models so that learning stays tractable despite the "curse of dimensionality". It first discusses why learning generic functions in high-dimensional spaces is hard, then observes that most practical problems are not generic: they carry structure and regularities that a model can exploit. It then shows how geometric priors, in particular symmetry and scale separation, lead to neural network architectures with specific inductive biases.
The main goal of the paper is to show how deep learning models for different kinds of geometrically structured data (such as grids, graphs, and manifolds) can be derived from fundamental mathematical and geometric principles. It stresses that this unified view matters both for understanding existing models and for guiding future research. For example, Convolutional Neural Networks (CNNs) exploit the translational symmetry of images through convolutional filters and exploit scale separation through pooling layers; similarly, Graph Neural Networks (GNNs) exploit the structure of graphs, in particular the fact that the result should not depend on how the nodes are ordered. In summary, the problem the paper attempts to solve is how to incorporate geometric structure and symmetry into deep learning systematically, giving a general theoretical foundation for handling many types of geometric data.
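The two symmetries mentioned in the summary can be checked numerically. The following is a minimal sketch in plain Python, not code from the paper: the function names (`circular_conv`, `shift`, `sum_aggregate`, `permute`) are hypothetical, and the circular (wrap-around) boundary is an assumption made so that translation is an exact symmetry of a finite signal. It illustrates that a convolution commutes with shifts of its input, and that a sum over graph neighbours commutes with relabelling of the nodes.

```python
def circular_conv(signal, kernel):
    """Circular 1D cross-correlation: out[i] = sum_k kernel[k] * signal[(i + k) % n]."""
    n = len(signal)
    return [sum(w * signal[(i + k) % n] for k, w in enumerate(kernel))
            for i in range(n)]


def shift(signal, s):
    """Cyclically shift a signal s positions to the right."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


def sum_aggregate(features, adjacency):
    """One neighbourhood-aggregation step: each node sums its neighbours' features."""
    n = len(features)
    return [sum(features[j] for j in range(n) if adjacency[i][j])
            for i in range(n)]


def permute(features, adjacency, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    n = len(perm)
    new_features = [features[perm[i]] for i in range(n)]
    new_adjacency = [[adjacency[perm[i]][perm[j]] for j in range(n)]
                     for i in range(n)]
    return new_features, new_adjacency


# Translation equivariance: convolving a shifted signal equals
# shifting the convolved signal.
x = [1, 2, 3, 4, 5, 6]
w = [1, 0, -1]
print(circular_conv(shift(x, 2), w) == shift(circular_conv(x, w), 2))

# Permutation equivariance: aggregating on a relabelled graph equals
# relabelling the aggregated features. Toy path graph 0-1-2-3.
f = [1, 2, 3, 4]
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
perm = [2, 0, 3, 1]
out = sum_aggregate(f, A)
pf, pA = permute(f, A, perm)
print(sum_aggregate(pf, pA) == [out[p] for p in perm])
```

Integer inputs are used so that the equality checks are exact. Summing the aggregated node features afterwards gives a single graph-level value that does not change under relabelling, which is the invariant counterpart of the equivariant step shown here.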