Deep Learning as Ricci Flow

Anthony Baptista, Alessandro Barp, Tapabrata Chakraborti, Chris Harbron, Ben D. MacArthur, Christopher R. S. Banerji
2024-04-22
Abstract: Deep neural networks (DNNs) are powerful tools for approximating the distribution of complex data. It is known that data passing through a trained DNN classifier undergoes a series of geometric and topological simplifications. While some progress has been made toward understanding these transformations in neural networks with smooth activation functions, an understanding in the more general setting of non-smooth activation functions, such as the rectified linear unit (ReLU), which tend to perform better, is required. Here we propose that the geometric transformations performed by DNNs during classification tasks have parallels to those expected under Hamilton's Ricci flow, a tool from differential geometry that evolves a manifold by smoothing its curvature, in order to identify its topology. To illustrate this idea, we present a computational framework to quantify the geometric changes that occur as data passes through successive layers of a DNN, and use this framework to motivate a notion of 'global Ricci network flow' that can be used to assess a DNN's ability to disentangle complex data geometries to solve classification problems. By training more than 1,500 DNN classifiers of different widths and depths on synthetic and real-world data, we show that the strength of global Ricci network flow-like behaviour correlates with accuracy for well-trained DNNs, independently of depth, width and data set. Our findings motivate the use of tools from differential and discrete geometry to the problem of explainability in deep learning.
Machine Learning, Differential Geometry
What problem does this paper attempt to address?
This paper examines how deep neural networks (DNNs) simplify complex data through a series of geometric and topological transformations during classification. These transformations are partly understood for networks with smooth activation functions, but not for the more general case of non-smooth activation functions such as ReLU, which tend to perform better in practice. The paper proposes that the geometric transformations a DNN performs during classification parallel Hamilton's Ricci flow, a tool from differential geometry that evolves a manifold by smoothing its curvature in order to identify its topology. The authors build a computational framework that quantifies how the geometry of the data changes from layer to layer, and use it to motivate the notion of "global Ricci network flow", which assesses a DNN's ability to disentangle complex data geometries in order to solve classification problems. After training more than 1,500 DNN classifiers of different widths and depths on synthetic and real-world data, they find that the strength of global Ricci network flow-like behavior correlates with accuracy for well-trained DNNs, independently of depth, width, and dataset. The research also shows that, although the topological simplification of data between DNN layers is not always monotonic, this "flow" behavior is related to classification accuracy. Finally, the paper suggests that tools from differential and discrete geometry can help address the problem of explainability in deep learning.
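The text above does not spell out how the framework measures geometric change between layers, so the following is only a rough sketch of one way such a measurement could look, not the authors' method. It assumes (these choices are illustrative and do not come from the text) that each layer's activations are turned into a k-nearest-neighbour graph, that curvature is the simplest combinatorial Forman-Ricci curvature of an edge, 4 - deg(u) - deg(v), and that geometric change is the change in hop distance between the endpoints of each input-layer edge. All function names are hypothetical.

```python
import numpy as np
from collections import deque


def knn_graph(points, k):
    """Undirected k-nearest-neighbour graph as a set of edges (i, j) with i < j."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    edges = set()
    for i in range(len(points)):
        for j in np.argsort(dists[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges


def forman_curvature(edges, n):
    """Assumed curvature: combinatorial Forman-Ricci, 4 - deg(u) - deg(v) per edge."""
    deg = np.zeros(n, dtype=int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 4 - deg[u] - deg[v] for u, v in edges}


def hop_distances(edges, n):
    """All-pairs hop-count distances by breadth-first search (inf if unreachable)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if np.isinf(dist[s, v]):
                    dist[s, v] = dist[s, u] + 1
                    queue.append(v)
    return dist


def layer_flow_correlation(x_in, x_out, k=5):
    """Pearson correlation between edge curvature at one layer and the change
    in hop distance of the same point pairs at the next layer."""
    n = len(x_in)
    edges_in = knn_graph(x_in, k)
    curvature = forman_curvature(edges_in, n)
    d_out = hop_distances(knn_graph(x_out, k), n)
    kappa, delta = [], []
    for (u, v), c in curvature.items():
        if np.isfinite(d_out[u, v]):  # skip pairs disconnected at the next layer
            kappa.append(c)
            delta.append(d_out[u, v] - 1.0)  # endpoints of an edge start 1 hop apart
    return float(np.corrcoef(kappa, delta)[0, 1])
```

Calling `layer_flow_correlation(x_in, x_out)` on two arrays of shape `(n_samples, n_features)` holding the activations of consecutive layers returns a single coefficient. Under Ricci flow, positively curved regions contract and negatively curved regions expand, so in this sketch a negative coefficient would be the flow-like direction. The result is undefined (NaN) when either quantity is constant across edges.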