Abstract: With the ever-increasing complexity of large-scale pre-trained models and a shortage of labeled data for downstream training, transfer learning has become the primary approach in many fields, including natural language processing, computer vision, and multi-modal learning. Despite recent progress, fine-tuning large-scale pre-trained vision models still relies mostly on trial and error. This work investigates the relationship between neural collapse (NC) and transfer learning for classification problems. NC is an intriguing yet prevalent phenomenon recently discovered in the final-layer features and linear classifiers of trained neural networks. Specifically, during the terminal phase of training, NC implies that the variability of the features within each class diminishes to zero, while the class means of the features become maximally and equally distant from one another. In this work, we examine the NC attributes of pre-trained models on both downstream and source data for transfer learning, and we find a strong correlation between feature collapse and downstream performance. In particular, we discover a systematic pattern that emerges when linear probing pre-trained models on downstream training data: the more collapsed the features of a pre-trained model are on the downstream training data, the higher the transfer accuracy. We also study the relationship between NC and transfer accuracy on the source data. These findings allow us to develop a principled, parameter-efficient fine-tuning method that uses skip connections to induce last-layer feature collapse on downstream data. Our proposed fine-tuning methods achieve good performance while reducing the number of fine-tuned parameters by at least 90% and mitigating overfitting, especially when downstream data is scarce.
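
The notion of "feature collapse" described in the abstract can be made concrete with a small sketch. The abstract does not state which metric the authors use, so the code below assumes a measure that is common in the NC literature: the trace of the within-class covariance multiplied by the pseudo-inverse of the between-class covariance, divided by the number of classes. The function name `within_class_variability` and the toy inputs are hypothetical and chosen only for illustration. Smaller values indicate more collapsed features.

```python
import numpy as np


def within_class_variability(features, labels):
    """Assumed collapse measure: trace(Sigma_W @ pinv(Sigma_B)) / K.

    features: (n_samples, dim) array of last-layer features.
    labels:   (n_samples,) array of integer class labels.
    This is an illustrative sketch, not the paper's implementation.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n_samples, dim = features.shape
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((dim, dim))  # within-class covariance
    sigma_b = np.zeros((dim, dim))  # between-class covariance
    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Scatter of samples around their own class mean.
        centered = class_feats - class_mean
        sigma_w += centered.T @ centered / n_samples
        # Scatter of class means around the global mean.
        diff = (class_mean - global_mean)[:, None]
        sigma_b += diff @ diff.T / len(classes)

    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes)
```

As a usage example, features that are identical within each class, such as `[[1, 0], [1, 0], [0, 1], [0, 1]]` with labels `[0, 0, 1, 1]`, have zero within-class scatter, so the measure is zero up to floating-point error. Perturbing those points around their class means yields a strictly larger value.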

Limitations of Neural Collapse for Understanding Generalization in Deep Learning

Perturbation Analysis of Neural Collapse

An Unconstrained Layer-Peeled Perspective on Neural Collapse

Neural Collapse in the Intermediate Hidden Layers of Classification Neural Networks

Prevalence of Neural Collapse during the terminal phase of deep learning training

The Prevalence of Neural Collapse in Neural Multivariate Regression

Generalizing and Decoupling Neural Collapse Via Hyperspherical Uniformity Gap

Beyond Unconstrained Features: Neural Collapse for Shallow Neural Networks with General Data

Towards understanding neural collapse in supervised contrastive learning with the information bottleneck method

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Can We Understand Plasticity Through Neural Collapse?

Neural Collapse versus Low-rank Bias: Is Deep Neural Collapse Really Optimal?

A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks

The Exploration of Neural Collapse under Imbalanced Data

Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay

The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features

Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations

Linguistic Collapse: Neural Collapse in (Large) Language Models

On the Robustness of Neural Collapse and the Neural Collapse of Robustness

Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model

Understanding and Improving Transfer Learning of Deep Models via Neural Collapse