Abstract: Fine-tuning is arguably the most straightforward way to tailor a pre-trained model (e.g., a foundation model) to downstream applications, but it also comes with the risk of losing valuable knowledge the model had learned in pre-training. For example, fine-tuning a pre-trained classifier capable of recognizing a large number of classes to master a subset of classes at hand is shown to drastically degrade the model's accuracy in the other classes it had previously learned. As such, it is hard to further use the fine-tuned model when it encounters classes beyond the fine-tuning data. In this paper, we systematically dissect the issue, aiming to answer the fundamental question, "What has been damaged in the fine-tuned model?" To our surprise, we find that the fine-tuned model neither forgets the relationship among the other classes nor degrades the features to recognize these classes. Instead, the fine-tuned model often produces more discriminative features for these other classes, even if they were missing during fine-tuning! What really hurts the accuracy is the discrepant logit scales between the fine-tuning classes and the other classes, implying that a simple post-processing calibration would bring back the pre-trained model's capability and at the same time unveil the feature improvement over all classes. We conduct an extensive empirical study to demonstrate the robustness of our findings and provide preliminary explanations underlying them, suggesting new directions for future theoretical analysis. Our code is available at https://github.com/OSU-MLB/Fine-Tuning-Is-Fine-If-Calibrated.
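To make the logit-scale argument concrete, here is a minimal sketch of what a post-processing calibration could look like. It assumes the calibration is a single constant offset added to the logits of the classes that were absent from fine-tuning. The abstract does not specify the exact form, so the function name `calibrate_logits`, the parameter `gamma`, and the toy numbers are all illustrative assumptions and are not taken from the paper's released code.

```python
import numpy as np

def calibrate_logits(logits, finetune_classes, gamma):
    """Return a copy of `logits` with `gamma` added to every class that was
    NOT part of fine-tuning (hypothetical calibration form, see lead-in)."""
    logits = np.asarray(logits, dtype=float)
    absent = np.ones(logits.shape[-1], dtype=bool)
    absent[list(finetune_classes)] = False   # mark fine-tuning classes as present
    calibrated = logits.copy()               # leave the caller's array untouched
    calibrated[..., absent] += gamma         # lift the absent-class logits
    return calibrated

# Toy logits from an imagined fine-tuned 4-class model.
# Classes 0 and 1 were seen in fine-tuning; classes 2 and 3 were not.
logits = np.array([
    [5.0, 4.0, 1.5, 0.5],   # true class 2: best among absent classes, but overshadowed
    [6.0, 2.0, 0.1, 0.3],   # true class 0: a fine-tuning class
])
finetune_classes = [0, 1]

before = logits.argmax(axis=1)
after = calibrate_logits(logits, finetune_classes, gamma=4.0).argmax(axis=1)
print("before calibration:", before)
print("after calibration: ", after)
```

In this toy case the ranking within the absent classes is already correct for the first sample, and only the scale gap against the fine-tuning classes causes the misprediction, which is the situation the abstract describes. How `gamma` would be chosen in practice (for instance, on held-out data) is left open here.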

Fine-Tuning is Fine, if Calibrated
