Abstract: With the ever-increasing complexity of large-scale pre-trained models and a shortage of labeled data for downstream training, transfer learning has become the primary approach in many fields, including natural language processing, computer vision, and multi-modal learning. Despite recent progress, fine-tuning large-scale pre-trained vision models still relies mostly on trial and error. This work investigates the relationship between neural collapse (NC) and transfer learning for classification problems. NC is an intriguing yet prevalent phenomenon, recently discovered in the final-layer features and linear classifiers of trained neural networks. Specifically, during the terminal phase of training, NC implies that the variability of the features within each class diminishes to zero, while the class means of the features become maximally and equally separated. We examine the NC attributes of pre-trained models on both downstream and source data, and we find a strong correlation between feature collapse and downstream performance. In particular, we discover a systematic pattern that emerges when linear probing pre-trained models on downstream training data: the more collapsed the features of a pre-trained model are on the downstream training data, the higher the transfer accuracy. We also study the relationship between NC and transfer accuracy on the source data. These findings allow us to develop a principled, parameter-efficient fine-tuning method that employs skip connections to induce last-layer feature collapse on downstream data. Our proposed fine-tuning methods deliver good performance while reducing the number of fine-tuned parameters by at least 90% and mitigating overfitting, especially when downstream data is scarce.
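To make "feature collapse" concrete, the following is a minimal sketch of one way to quantify within-class variability relative to between-class separation. It assumes the trace-ratio measure commonly used in the NC literature, trace(Σ_W Σ_B⁺) / K; the abstract does not state which metric the authors use, so the function name `within_class_variability`, the `rcond` cutoff, and the synthetic features are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np


def within_class_variability(features, labels):
    """Return trace(Sigma_W @ pinv(Sigma_B)) / K for features of shape (n, d).

    Lower values mean the features of each class sit closer to their class
    mean relative to the spread of the class means (more "collapsed").
    This is an assumed metric, not necessarily the one used in the paper.
    """
    classes = np.unique(labels)
    num_classes = len(classes)
    num_samples, dim = features.shape
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((dim, dim))  # within-class covariance
    sigma_b = np.zeros((dim, dim))  # between-class covariance
    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        centered = class_feats - class_mean
        sigma_w += centered.T @ centered / num_samples
        diff = (class_mean - global_mean)[:, None]
        sigma_b += diff @ diff.T / num_classes

    # Sigma_B has rank at most K - 1, so a pseudo-inverse is used; the
    # explicit cutoff (an assumption) discards round-off singular values.
    sigma_b_pinv = np.linalg.pinv(sigma_b, rcond=1e-10)
    return float(np.trace(sigma_w @ sigma_b_pinv) / num_classes)


# Synthetic stand-ins for last-layer features: three classes, 50 samples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
class_means = 5.0 * np.eye(3)[labels]
tight_features = class_means + 0.01 * rng.normal(size=class_means.shape)
loose_features = class_means + 1.0 * rng.normal(size=class_means.shape)

tight_score = within_class_variability(tight_features, labels)
loose_score = within_class_variability(loose_features, labels)
collapsed_score = within_class_variability(class_means, labels)
print(f"tight: {tight_score:.6f}  loose: {loose_score:.6f}")
```

Under this measure, the tightly clustered features score lower than the loosely clustered ones, and fully collapsed features (every sample equal to its class mean) score zero. In the setting the abstract describes, `features` would be the outputs of a frozen pre-trained model on the downstream training data, and a lower score would correspond to the "more collapsed" regime associated with higher transfer accuracy.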

Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains

Understanding and Improving Transfer Learning of Deep Models via Neural Collapse

Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?

No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier

Transferable Normalization: Towards Improving Transferability of Deep Neural Networks

Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data

Neural Collapse versus Low-rank Bias: Is Deep Neural Collapse Really Optimal?

Subdomain contraction in deep networks for robust representation learning

Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model

All-around Neural Collapse for Imbalanced Classification

Maximal Domain Independent Representations Improve Transfer Learning

Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model

Unleashing the Power of Neural Collapse for Transferability Estimation

Generalizing and Decoupling Neural Collapse Via Hyperspherical Uniformity Gap

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Quantifying the Variability Collapse of Neural Networks

Prevalence of Neural Collapse during the terminal phase of deep learning training

The Prevalence of Neural Collapse in Neural Multivariate Regression

Neural (Tangent Kernel) Collapse

Progressive Feedforward Collapse of ResNet Training

Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity