CSS-LM: A Contrastive Framework for Semi-Supervised Fine-Tuning of Pre-Trained Language Models

Yusheng Su,Xu Han,Yankai Lin,Zhengyan Zhang,Zhiyuan Liu,Peng Li,Jie Zhou,Maosong Sun
DOI: https://doi.org/10.1109/taslp.2021.3105013
2021-01-01
Abstract:Fine-tuning pre-trained language models (PLMs) has demonstrated its effectiveness on various downstream NLP tasks recently. However, in many scenarios with limited supervised data, the conventional fine-tuning strategies cannot sufficiently capture the important semantic features for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") to improve the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and original labeled instances to help PLMs capture crucial task-related semantic features. The experimental results show that CSS-LM achieves better results than the conventional fine-tuning strategy on a series of downstream tasks with few-shot settings by up to 7.8%, and outperforms the latest supervised contrastive fine-tuning strategy by up to 7.1%. Our datasets and source code will be available to provide more details.
What problem does this paper attempt to address?