How to Make Use of Pretrained Models in Few-Shot Classification

Mingyu Fu, Peng Wang
DOI: https://doi.org/10.1117/12.2668153
2023-01-01
Abstract: Few-shot learning (FSL) aims to generalize a model to novel categories from only a few labelled samples, which remains challenging for machines. Large-scale pretrained models, especially vision transformers, achieve excellent performance because they are trained on large and diverse data. Researchers have exploited pretrained models in few-shot classification by simply finetuning all of the parameters on the few available samples. In this paper, we explore two methods for leveraging pretrained models in few-shot classification: vision prompt tuning and a reparameterization method called ‘scaling&shift’. Vision prompt tuning applies only to vision transformers, and we evaluate it in the few-shot setting for the first time. ‘Scaling&shift’ was originally applied to convolutional neural networks (CNNs); we extend it to vision transformers. The two methods are evaluated on standard benchmarks such as miniImageNet, CUB, CIFAR-FS, clipart, and sketch. The results show that ‘scaling&shift’ reaches the same level of performance as updating all of the parameters. Vision prompt tuning is 0% to 5% lower than updating all of the parameters across the five datasets, while updating considerably fewer parameters.
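To make the two parameter-efficient ideas named in the abstract concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. It assumes that the scale-and-shift reparameterization multiplies the output of a frozen pretrained layer by a learnable per-channel scale and adds a learnable per-channel shift, and that prompt tuning prepends a few learnable prompt tokens to the frozen model's input token sequence. All names (`linear`, `scale_shift`, `prepend_prompts`) and all numeric values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; function names and values are hypothetical.

def linear(x, weight, bias):
    """Frozen pretrained layer: y = W x + b, with weight given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def scale_shift(features, gamma, beta):
    """Assumed scale-and-shift form: learnable per-channel scale and shift."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]

def prepend_prompts(prompts, tokens):
    """Assumed prompt-tuning form: learnable prompt tokens placed before the input tokens."""
    return prompts + tokens

# Toy "pretrained" layer with 3 inputs and 2 outputs (made-up values, kept frozen).
weight = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
bias = [0.0, 1.0]

# Trainable parameters, initialised so the adapted layer starts as the identity.
gamma = [1.0, 1.0]
beta = [0.0, 0.0]

x = [2.0, 4.0, 6.0]
frozen_out = linear(x, weight, bias)
adapted_out = scale_shift(frozen_out, gamma, beta)

# Parameter counts for this single layer.
full_finetune_params = sum(len(row) for row in weight) + len(bias)  # all of W and b
scale_shift_params = len(gamma) + len(beta)                         # only gamma and beta

# Two learnable 2-dimensional prompt tokens prepended to three input tokens.
prompts = [[0.0, 0.0], [0.0, 0.0]]
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sequence = prepend_prompts(prompts, tokens)
```

In this toy layer, full finetuning would update 8 values while the scale-and-shift variant updates 4; the gap widens as layers grow, because the number of scale and shift parameters depends only on the output width, not on the full weight matrix.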