Fine-Tuning Large Language Models for Scientific Text Classification: A Comparative Study

Zhyar Rzgar K Rostam, Gábor Kertész

2024-11-28

Abstract: The exponential growth of online textual content across diverse domains has necessitated advanced methods for automated text classification. Large Language Models (LLMs) based on transformer architectures have shown significant success in this area, particularly in natural language processing (NLP) tasks. However, general-purpose LLMs often struggle with domain-specific content, such as scientific texts, due to unique challenges like specialized vocabulary and imbalanced data. In this study, we fine-tune four state-of-the-art LLMs (BERT, SciBERT, BioBERT, and BlueBERT) on three datasets derived from the WoS-46985 dataset to evaluate their performance in scientific text classification. Our experiments reveal that domain-specific models, particularly SciBERT, consistently outperform general-purpose models in both abstract-based and keyword-based classification tasks. Additionally, we compare our achieved results with those reported in the literature for deep learning models, further highlighting the advantages of LLMs, especially when utilized in specific domains. The findings emphasize the importance of domain-specific adaptations for LLMs to enhance their effectiveness in specialized text classification tasks.

Computation and Language

What problem does this paper attempt to address?

This paper addresses the following problem: as online text grows exponentially across fields, automated text classification has become increasingly important. However, general-purpose large language models (LLMs) often perform poorly on domain-specific content such as scientific literature, because of specialized vocabulary, unique grammatical structures, and imbalanced data distributions. The paper therefore fine-tunes four state-of-the-art Transformer-based language models (BERT, SciBERT, BioBERT, and BlueBERT) and evaluates them on scientific text classification, using three datasets derived from the WoS-46985 dataset. The experiments show that domain-specific models, SciBERT in particular, consistently outperform general-purpose models on both abstract-based and keyword-based classification. The paper also compares its results with those reported in the literature for deep learning models, which further highlights the advantages of LLMs, especially when they are adapted to a specific domain. In summary, the research explores how domain-specific adaptation can improve LLM performance on scientific text classification, so that the models cope better with the specialized and complex nature of scientific literature.
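The fine-tuning setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the Hugging Face Hub checkpoint names, the hyperparameters (learning rate, batch size, maximum length), and the helper names `build_label_map`, `accuracy`, and `fine_tune` are all assumptions introduced here. The `fine_tune` function assumes the `torch` and `transformers` packages are installed and imports them lazily, so the small helpers work without them.

```python
from typing import Dict, Sequence

# Assumed Hugging Face Hub checkpoint names; the summary above does not say
# which checkpoints the authors actually used.
CHECKPOINTS: Dict[str, str] = {
    "BERT": "bert-base-uncased",
    "SciBERT": "allenai/scibert_scivocab_uncased",
    "BioBERT": "dmis-lab/biobert-base-cased-v1.1",
    "BlueBERT": "bionlp/bluebert_pubmed_mimic_uncased_L-12_H-768_A-12",
}


def build_label_map(labels: Sequence[str]) -> Dict[str, int]:
    """Map each distinct class name to an integer id, in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(set(labels)))}


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold) or len(gold) == 0:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def fine_tune(checkpoint: str, texts: Sequence[str], label_ids: Sequence[int],
              num_labels: int, epochs: int = 1, lr: float = 2e-5,
              batch_size: int = 8, max_length: int = 256):
    """Fine-tune one checkpoint for sequence classification (hypothetical settings)."""
    # Heavy dependencies are imported lazily so the helpers above stay usable alone.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for start in range(0, len(texts), batch_size):
            # Tokenize one mini-batch of abstracts (or keyword strings).
            batch = tokenizer(
                list(texts[start:start + batch_size]),
                truncation=True, padding=True,
                max_length=max_length, return_tensors="pt",
            )
            targets = torch.tensor(list(label_ids[start:start + batch_size]))
            # Passing labels makes the model return a cross-entropy loss.
            loss = model(**batch, labels=targets).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return tokenizer, model
```

Running `fine_tune` once per entry in `CHECKPOINTS`, on the same training split, and scoring each model with the same metric on a shared held-out split gives the kind of like-for-like comparison the study describes. Because the data are imbalanced, a per-class metric such as macro-averaged F1 would likely be more informative than accuracy alone.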
