Abstract: The general language model BERT, pre-trained on the cross-domain text corpora BookCorpus and Wikipedia, achieves excellent performance on a number of natural language processing tasks when fine-tuned on downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to further improve its performance, and more detailed analyses of fine-tuning strategies are necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed. It constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, aiming to address the limited-training-data problem and the task-awareness problem. The architecture and implementation details of BERT4TC are presented, together with a post-training approach for addressing the domain challenge of BERT. Extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The experimental results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based methods and fine-tuning methods, and achieves new state-of-the-art performance on multi-class classification datasets. On binary sentiment classification datasets, our BERT4TC post-trained on a suitable domain-related corpus also achieves better results than the original BERT model.
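The auxiliary-sentence idea described in the abstract can be illustrated with a minimal sketch. It assumes one auxiliary sentence per candidate class and a binary target that marks whether the auxiliary sentence matches the gold label. The template string, the function names, and the example labels are hypothetical and are not taken from the paper; the pair scores would come from a fine-tuned BERT sentence-pair classifier, which is not shown here.

```python
from typing import List, Tuple

# Hypothetical template; the abstract does not state the actual auxiliary sentences.
DEFAULT_TEMPLATE = "The category is {label}."


def build_sentence_pairs(
    text: str,
    gold_label: str,
    labels: List[str],
    template: str = DEFAULT_TEMPLATE,
) -> List[Tuple[str, str, int]]:
    """Turn one multi-class example into one binary sentence-pair example per class.

    Each returned tuple is (text, auxiliary_sentence, target), where target is 1
    only for the pair whose auxiliary sentence names the gold label.
    """
    pairs = []
    for label in labels:
        auxiliary = template.format(label=label)
        pairs.append((text, auxiliary, int(label == gold_label)))
    return pairs


def predict_label(match_scores: List[float], labels: List[str]) -> str:
    """Pick the class whose sentence pair received the highest match score.

    match_scores[i] is assumed to be the classifier's score that the text
    matches the auxiliary sentence built from labels[i].
    """
    best_index = max(range(len(labels)), key=lambda i: match_scores[i])
    return labels[best_index]


labels = ["world", "sports", "business"]
pairs = build_sentence_pairs("The team won the championship.", "sports", labels)
for text, auxiliary, target in pairs:
    print(target, "|", text, "|", auxiliary)
```

Under this assumed formulation, one labeled example yields as many training pairs as there are classes, which is one way such a construction can enlarge a small training set. At inference time, `predict_label` selects the class whose pair scores highest.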

How to Fine-Tune BERT for Text Classification?

Single task fine-tune BERT for text classification

Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge

Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation

Can Fine-tuning Pre-trained Models Lead to Perfect NLP? A Study of the Generalizability of Relation Extraction.

A Closer Look at How Fine-tuning Changes BERT

A fine-tuning approach research of pre-trained model with two stage

Research on Text Classification Based on BERT-BiGRU Model

Patent classification by fine-tuning BERT language model

Improving BERT Fine-tuning with Embedding Normalization

RoBERTa-wwm-ext Fine-Tuning for Chinese Text Classification

PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model

Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT

Global Semantic Information Extraction Model for Chinese long text classification based on fine-tune BERT

On Robustness and Bias Analysis of BERT-Based Relation Extraction

BERTer: The Efficient One

Improved Visual Fine-tuning with Natural Language Supervision

Bi-tuning of Pre-trained Representations

Bi-tuning: Efficient Transfer from Pre-trained Models

Fine-Tuning Large Language Models for Scientific Text Classification: A Comparative Study

TCBERT: A Technical Report for Chinese Topic Classification BERT