Boosting Prompt-Based Few-Shot Learners Through Out-of-Domain Knowledge Distillation

Xiaoqing Chen, Chengyu Wang, Junwei Dong, Minghui Qiu, Liang Feng, Jun Huang
DOI: https://doi.org/10.1109/icassp49357.2023.10096045
2023-01-01
Abstract: Prompt-based learning improves the performance of Pre-trained Language Models (PLMs) in few-shot learning and is suitable for low-resource scenarios. However, it is challenging to deploy large PLMs online. Knowledge Distillation (KD) can compress large PLMs into small ones; yet few-shot KD for prompt-tuned PLMs is challenging due to the lack of training data and the capacity gap between teacher and student models. We propose Boost-Distiller, the first few-shot KD algorithm for prompt-tuned PLMs that makes use of out-of-domain data. Apart from distilling the model logits, Boost-Distiller specifically considers heuristically generated fake logits that improve the generalization abilities of student models. We further leverage the cross-domain model logits, weighted with domain expertise scores that measure the transferability of out-of-domain instances. Experiments over various datasets show that Boost-Distiller consistently outperforms baselines by a large margin.
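To make the idea of weighting distilled logits per instance concrete, here is a minimal sketch of a generic temperature-scaled distillation loss with per-instance weights. It is not Boost-Distiller's actual objective, which the abstract does not specify. The function names `softmax` and `weighted_kd_loss`, the `temperature` parameter, and the use of a weighted mean of KL divergences are all assumptions made for illustration; the `weights` argument merely stands in for something like the domain expertise scores mentioned above.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weighted_kd_loss(teacher_logits, student_logits, weights, temperature=2.0):
    """Weighted mean of KL(teacher || student) over instances.

    Hypothetical illustration: `weights` plays the role of a per-instance
    transferability score; how such scores are computed is not shown here.
    """
    w = np.asarray(weights, dtype=float)
    if w.sum() <= 0:
        raise ValueError("weights must sum to a positive value")
    # Clip probabilities away from zero so the logarithms stay finite.
    p = np.clip(softmax(teacher_logits, temperature), 1e-12, None)
    q = np.clip(softmax(student_logits, temperature), 1e-12, None)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return float(np.sum(w * kl) / np.sum(w))

if __name__ == "__main__":
    teacher = np.array([[2.0, 0.0], [0.0, 3.0]])
    student = np.array([[0.0, 2.0], [0.0, 3.0]])
    # Only the first instance disagrees with the teacher, so down-weighting
    # it lowers the loss.
    print(weighted_kd_loss(teacher, student, weights=[1.0, 1.0]))
    print(weighted_kd_loss(teacher, student, weights=[0.1, 1.0]))
```

In this sketch, an instance with weight zero contributes nothing to the loss, which is one simple way a low transferability score could suppress an unhelpful out-of-domain example.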