Abstract: The recently discovered Neural Collapse (NC) phenomenon occurs pervasively in today's deep net training paradigm of driving cross-entropy (CE) loss towards zero. During NC, last-layer features collapse to their class-means, both classifiers and class-means collapse to the same Simplex Equiangular Tight Frame, and classifier behavior collapses to the nearest-class-mean decision rule. Recent works demonstrated that deep nets trained with mean squared error (MSE) loss perform comparably to those trained with CE. As a preliminary, we empirically establish that NC emerges in such MSE-trained deep nets as well through experiments on three canonical networks and five benchmark datasets. We provide, in a Google Colab notebook, PyTorch code for reproducing MSE-NC and CE-NC at https://colab.research.google.com/github/neuralcollapse/neuralcollapse/blob/main/neuralcollapse.ipynb. The analytically-tractable MSE loss offers more mathematical opportunities than the hard-to-analyze CE loss, inspiring us to leverage MSE loss towards the theoretical investigation of NC. We develop three main contributions: (I) We show a new decomposition of the MSE loss into (A) terms directly interpretable through the lens of NC and which assume the last-layer classifier is exactly the least-squares classifier; and (B) a term capturing the deviation from this least-squares classifier. (II) We exhibit experiments on canonical datasets and networks demonstrating that term-(B) is negligible during training. This motivates us to introduce a new theoretical construct: the central path, where the linear classifier stays MSE-optimal for feature activations throughout the dynamics. (III) By studying renormalized gradient flow along the central path, we derive exact dynamics that predict NC.
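To make the geometry named in the abstract concrete, the following is a minimal sketch, not the authors' code (their PyTorch notebook is linked above). It builds a Simplex Equiangular Tight Frame for C classes using the standard construction sqrt(C/(C-1)) (I - (1/C) 1 1^T), checks that the vectors have unit norm and pairwise cosine -1/(C-1), and illustrates that when classifier rows and class-means coincide with this frame, the linear classifier agrees with the nearest-class-mean rule. The function names, the choice C = 4, and the test point `x` are illustrative assumptions.

```python
import math


def simplex_etf(num_classes):
    """Rows of sqrt(C/(C-1)) * (I - (1/C) * 1 1^T), a standard simplex ETF."""
    c = num_classes
    scale = math.sqrt(c / (c - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / c) for j in range(c)]
        for i in range(c)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def nearest_class_mean(x, class_means):
    """Index of the class mean closest to x in squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, m)) for m in class_means]
    return dists.index(min(dists))


def linear_classify(x, weights):
    """Index of the largest score <w_k, x> (bias-free linear classifier)."""
    scores = [dot(w, x) for w in weights]
    return scores.index(max(scores))


C = 4  # hypothetical number of classes
etf = simplex_etf(C)

# Geometry checks: unit norms and equal pairwise cosines of -1/(C-1).
norms = [math.sqrt(dot(v, v)) for v in etf]
cosines = [cosine(etf[i], etf[j]) for i in range(C) for j in range(i + 1, C)]

# Idealized collapsed state (an assumption for illustration): class-means and
# classifier rows both equal the ETF vectors.
x = [v + 0.05 * (j - 1.5) for j, v in enumerate(etf[2])]  # perturbed class-2 feature
ncc_label = nearest_class_mean(x, etf)
linear_label = linear_classify(x, etf)

print("max |norm - 1|:", max(abs(n - 1.0) for n in norms))
print("max |cos + 1/(C-1)|:", max(abs(c + 1.0 / (C - 1)) for c in cosines))
print("NCC label:", ncc_label, "linear label:", linear_label)
```

The two decision rules agree here because all frame vectors have the same norm, so minimizing the distance to a class-mean is equivalent to maximizing the inner product with it.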

Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?

Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model

Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data

Prevalence of Neural Collapse during the terminal phase of deep learning training

The Exploration of Neural Collapse under Imbalanced Data

All-around Neural Collapse for Imbalanced Classification

Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants

Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning

Towards understanding neural collapse in supervised contrastive learning with the information bottleneck method

No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier

Neural collapse inspired semi-supervised learning with fixed classifier

Leveraging Intermediate Neural Collapse with Simplex ETFs for Efficient Deep Neural Networks

Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains

Neural Collapse Inspired Attraction-Repulsion-Balanced Loss for Imbalanced Learning

Neural Collapse in the Intermediate Hidden Layers of Classification Neural Networks

The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features

Beyond Unconstrained Features: Neural Collapse for Shallow Neural Networks with General Data

Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model