Combining Denoising Autoencoders with Contrastive Learning to fine-tune Transformer Models

Alejo Lopez-Avila, Víctor Suárez-Paniagua
2024-05-23
Abstract: Recently, using large pretrained Transformer models for transfer learning tasks has evolved to the point where they have become one of the flagship trends in the Natural Language Processing (NLP) community, giving rise to various approaches such as prompt-based methods, adapters, or combinations with unsupervised approaches, among many others. This work proposes a three-phase technique to adjust a base model for a classification task. First, we adapt the model's signal to the data distribution by performing further training with a Denoising Autoencoder (DAE). Second, we adjust the representation space of the output to the corresponding classes by clustering through a Contrastive Learning (CL) method. In addition, we introduce a new data augmentation approach for Supervised Contrastive Learning to correct for unbalanced datasets. Third, we apply fine-tuning to delimit the predefined categories. These phases provide relevant and complementary knowledge that helps the model learn the final task. We supply extensive experimental results on several datasets to support these claims. Moreover, we include an ablation study and compare the proposed method against other ways of combining these techniques.
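As a rough illustration of the first phase, a denoising autoencoder is trained to reconstruct a clean sentence from a corrupted copy. The abstract does not say which noise function the authors use, so the random token deletion below, along with the names `corrupt_tokens`, `make_dae_pairs`, and `delete_prob`, is an assumption made only for this sketch and not the authors' implementation.

```python
import random


def corrupt_tokens(tokens, delete_prob=0.3, seed=None):
    """Return a noisy copy of `tokens`, dropping each token independently.

    Assumption: token deletion is only one possible noise function.
    """
    rng = random.Random(seed)
    kept = [tok for tok in tokens if rng.random() >= delete_prob]
    # Keep at least one token so the encoder never sees an empty input.
    if not kept and tokens:
        kept = [rng.choice(tokens)]
    return kept


def make_dae_pairs(sentences, delete_prob=0.3, seed=0):
    """Build (noisy_input, clean_target) pairs for denoising training."""
    rng = random.Random(seed)
    pairs = []
    for sentence in sentences:
        tokens = sentence.split()  # whitespace tokenization, for illustration only
        noisy = corrupt_tokens(tokens, delete_prob, seed=rng.randrange(2**32))
        pairs.append((" ".join(noisy), sentence))
    return pairs


if __name__ == "__main__":
    for noisy, clean in make_dae_pairs(["the cat sat on the mat"], delete_prob=0.5):
        print(repr(noisy), "->", repr(clean))
```

In an actual training setup the noisy string would be fed to the encoder and the clean string used as the reconstruction target; the whitespace tokenizer here merely stands in for the model's own tokenizer.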
Computation and Language
What problem does this paper attempt to address?
This paper proposes a three-phase method for fine-tuning pretrained Transformer models on supervised classification tasks. First, the model is further trained as a Denoising Autoencoder (DAE) so that it adapts to the data distribution. Second, Contrastive Learning (CL) clusters the output representations so that they align with the target classes; for this phase the authors also introduce a new data augmentation approach for Supervised Contrastive Learning to correct for imbalanced datasets. Third, standard fine-tuning delimits the predefined categories. The authors report experimental results on several datasets in support of the method, together with an ablation study and a comparison against other ways of combining these techniques. Overall, the paper aims to improve pretrained Transformer models on NLP classification tasks by combining self-supervised and contrastive training phases with fine-tuning.
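To make the contrastive phase more concrete, the sketch below computes a supervised contrastive loss in plain Python: embeddings that share a label are pulled together and all others are pushed apart. It follows the commonly used supervised contrastive formulation; whether the paper uses exactly this loss is an assumption, and the function names and the `temperature` value are illustrative. The paper's own data augmentation approach is not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average supervised contrastive loss over anchors that have a positive.

    For anchor i with positives P(i) (same label, j != i):
        loss_i = -1/|P(i)| * sum_{p in P(i)} log softmax_{a != i}(z_i . z_a / t)[p]
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(points, [0, 0, 1, 1]))  # labels match clusters
    print(supervised_contrastive_loss(points, [0, 1, 0, 1]))  # labels cut across clusters
```

The loss is small when labels agree with the geometric clusters and large when they cut across them, which is the signal the second phase uses to shape the representation space before the final fine-tuning phase.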