Measuring the Intrinsic Dimension of Objective Landscapes

Chunyuan Li, Heerad Farkhoor, Rosanne Liu, Jason Yosinski

DOI: https://doi.org/10.48550/arXiv.1804.08838

2018-04-24

Abstract: Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. We slowly increase the dimension of this subspace, note at which dimension solutions first appear, and define this to be the intrinsic dimension of the objective landscape. The approach is simple to implement, computationally tractable, and produces several suggestive conclusions. Many problems have smaller intrinsic dimensions than one might suspect, and the intrinsic dimension for a given dataset varies little across a family of models with vastly different sizes. This latter result has the profound implication that once a parameter space is large enough to solve a problem, extra parameters serve directly to increase the dimensionality of the solution manifold. Intrinsic dimension allows some quantitative comparison of problem difficulty across supervised, reinforcement, and other types of learning where we conclude, for example, that solving the inverted pendulum problem is 100 times easier than classifying digits from MNIST, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10. In addition to providing new cartography of the objective landscapes wandered by parameterized models, the method is a simple technique for constructively obtaining an upper bound on the minimum description length of a solution. A byproduct of this construction is a simple approach for compressing networks, in some cases by more than 100 times.

Machine Learning, Neural and Evolutionary Computing

What problem does this paper attempt to address?

This paper asks how many parameters a neural network actually needs to solve a given task, a quantity the authors call the "intrinsic dimension" of the objective landscape. The authors train the network in a randomly oriented low-dimensional subspace of its native parameter space and gradually increase the dimension of that subspace until solutions first appear; the dimension at which this happens is defined as the intrinsic dimension. The measure helps gauge the difficulty of individual tasks and allows quantitative comparison across different types of learning (supervised learning, reinforcement learning, and others). The main contributions of the paper are:

1. **Proposing a new method for measuring the intrinsic dimension of an objective landscape**: The network is trained in a random subspace whose dimension is gradually increased, and the dimension at which a solution first appears is defined as the intrinsic dimension of the problem.
2. **Revealing the relationship between the intrinsic dimension and the number of model parameters**: For a given dataset, the intrinsic dimension varies little across a family of models with vastly different sizes. This means that once the parameter space is large enough to solve the problem, additional parameters mainly increase the dimension of the solution manifold.
3. **Providing a quantitative comparison of the difficulty of different tasks**: For example, solving the inverted pendulum problem is 100 times easier than classifying MNIST digits, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10 images.
4. **Proposing a simple network compression method**: Training the network in a low-dimensional subspace significantly reduces the number of parameters needed to describe a solution, which compresses networks in some cases by more than 100 times.

These findings help explain the optimization behavior of neural networks and offer a new perspective on model design and compression.
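The subspace procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' code: the toy quadratic objective, the helper names (`make_toy_problem`, `best_loss_in_subspace`, `measure_intrinsic_dimension`), the unit-length normalization of the projection columns, and the loss tolerance are all assumptions made for illustration. The native parameters are written as `theta = theta0 + P @ z`, where `theta0` and the random matrix `P` stay fixed and only the `d` entries of `z` are optimized. Because the toy objective is quadratic, the optimum over `z` is found in closed form with least squares instead of gradient descent.

```python
import numpy as np


def make_toy_problem(D=1000, groups=10):
    """Hypothetical quadratic objective: loss(theta) = ||A @ theta - b||^2.

    Each of the `groups` contiguous blocks of theta must sum to its own target.
    """
    A = np.zeros((groups, D))
    block = D // groups
    for g in range(groups):
        A[g, g * block:(g + 1) * block] = 1.0
    b = np.arange(1, groups + 1, dtype=float)
    return A, b


def best_loss_in_subspace(A, b, d, seed=0):
    """Lowest loss reachable on theta = theta0 + P @ z, with z in R^d."""
    rng = np.random.default_rng(seed)
    D = A.shape[1]
    theta0 = rng.standard_normal(D)   # fixed random starting point
    P = rng.standard_normal((D, d))   # random projection, never trained
    P /= np.linalg.norm(P, axis=0)    # unit-length columns (an assumption)
    # The objective is quadratic, so the optimum over z has a closed form;
    # a real network would be trained with gradient descent on z instead.
    z, *_ = np.linalg.lstsq(A @ P, b - A @ theta0, rcond=None)
    residual = A @ (theta0 + P @ z) - b
    return float(residual @ residual)


def measure_intrinsic_dimension(A, b, max_d=30, tol=1e-8, seed=0):
    """Smallest subspace dimension d at which a solution first appears."""
    for d in range(1, max_d + 1):
        if best_loss_in_subspace(A, b, d, seed=seed) < tol:
            return d
    return None  # no solution found up to max_d


if __name__ == "__main__":
    A, b = make_toy_problem()
    print("measured intrinsic dimension:", measure_intrinsic_dimension(A, b))
```

On this toy problem the sweep should stop at d = 10, because the objective imposes 10 independent linear constraints on 1000 parameters. A solution found this way can be reproduced from the random seed plus the `d` values of `z`, which is the sense in which subspace training doubles as a compression scheme.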

Intrinsic dimensionality explains the effectiveness of language model fine-tuning

A Geometric Modeling of Occam's Razor in Deep Learning

Low-dimensional Intrinsic Dimension Reveals a Phase Transition in Gradient-Based Learning of Deep Neural Networks

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Train Deep Neural Networks in 40-D Subspaces

The Geometric Occam's Razor Implicit in Deep Learning

Geometry-induced Implicit Regularization in Deep ReLU Neural Networks

On Functional Dimension and Persistent Pseudodimension

Exploring Low-Dimensional Manifolds of Deep Neural Network Parameters for Improved Model Optimization

What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries

How many degrees of freedom do we need to train deep networks: a loss landscape perspective

Numerical Approximation Capacity of Neural Networks with Bounded Parameters: Do Limits Exist, and How Can They Be Measured?

High-performing neural network models of visual cortex benefit from high latent dimensionality

Measuring and regularizing networks in function space

Complexity Matters: Effective Dimensionality as a Measure for Adversarial Robustness

Beyond the noise: intrinsic dimension estimation with optimal neighbourhood identification

More or fewer latent variables in the high-dimensional data space? That is the question

Autoencoders with Intrinsic Dimension Constraints for Learning Low Dimensional Image Representations

Exploring Structural Sparsity of Deep Networks Via Inverse Scale Spaces

Unveiling and Mitigating Generalized Biases of DNNs through the Intrinsic Dimensions of Perceptual Manifolds