PTMA: Pre-trained Model Adaptation for Transfer Learning

Xiao Li, Junkai Yan, Jianjian Jiang, Wei-Shi Zheng
DOI: https://doi.org/10.1007/978-981-97-5492-2_14
2024-01-01
Abstract: The conventional two-stage transfer pipeline in computer vision initially pre-trains models on expansive upstream datasets and fine-tunes them using target downstream data. While straightforward and often effective, this approach encounters limited generalization when there is a substantial discrepancy between upstream and downstream datasets, which leads to the models' inherent confinement to upstream knowledge. In this work, we introduce an innovative transfer pipeline that incorporates a Pre-Trained Model Adaptation (PTMA) stage before the routine supervised fine-tuning, enhancing the traditional methodology. During the PTMA stage, we tailor pre-trained models to downstream datasets by employing a dual strategy: preserving existing upstream knowledge while concurrently assimilating target-specific knowledge, facilitated through a partially trainable modeling approach. Models enhanced through PTMA possess a composite knowledge base, encompassing both upstream and downstream data, and consequently demonstrate superior transfer capabilities in the final stage of fine-tuning. Comprehensive experiments affirm that PTMA can be seamlessly integrated with typical pre-training and fine-tuning methods, significantly boosting the transfer performance.