Fine-Tuning Large Language Models for Scientific Text Classification: A Comparative Study

Zhyar Rzgar K Rostam, Gábor Kertész
2024-11-28
Abstract: The exponential growth of online textual content across diverse domains has necessitated advanced methods for automated text classification. Large Language Models (LLMs) based on transformer architectures have shown significant success in this area, particularly in natural language processing (NLP) tasks. However, general-purpose LLMs often struggle with domain-specific content, such as scientific texts, due to unique challenges like specialized vocabulary and imbalanced data. In this study, we fine-tune four state-of-the-art LLMs (BERT, SciBERT, BioBERT, and BlueBERT) on three datasets derived from the WoS-46985 dataset to evaluate their performance in scientific text classification. Our experiments reveal that domain-specific models, particularly SciBERT, consistently outperform general-purpose models in both abstract-based and keyword-based classification tasks. Additionally, we compare our results with those reported in the literature for deep learning models, further highlighting the advantages of LLMs, especially when utilized in specific domains. The findings emphasize the importance of domain-specific adaptations for LLMs to enhance their effectiveness in specialized text classification tasks.
Computation and Language
What problem does this paper attempt to address?
The problem this paper addresses is the following: as online text content grows exponentially across fields, automated text classification has become increasingly important. However, general-purpose large language models (LLMs) often perform poorly on domain-specific content such as scientific literature, because of specialized vocabulary, unique grammatical structures, and imbalanced data distributions.

To study this, the paper fine-tunes four state-of-the-art Transformer-based models (BERT, SciBERT, BioBERT, and BlueBERT) on three datasets derived from the WoS-46985 dataset and evaluates them on scientific text classification. The results show that domain-specific models, especially SciBERT, consistently outperform general-purpose models in both abstract-based and keyword-based classification. The paper also compares its results with those reported in the literature for deep learning models, which further highlights the advantages of LLMs, particularly after domain-specific adaptation.

In summary, the paper explores how domain-specific adaptation can improve LLM performance on scientific text classification, so that the models cope better with the specialized and complex nature of scientific literature.
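
The comparison described above amounts to a grid of fine-tuning runs: four models, each fine-tuned on each of the three datasets. The following is a minimal sketch of that setup, not the authors' code. The Hugging Face Hub checkpoint IDs are assumptions (the text names only the model families), the dataset names are placeholders (the text does not name the three WoS-46985-derived datasets), and `experiment_grid` and `load_classifier` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Assumed Hugging Face Hub checkpoint IDs; the text names only the model families.
CHECKPOINTS = {
    "BERT": "bert-base-uncased",
    "SciBERT": "allenai/scibert_scivocab_uncased",
    "BioBERT": "dmis-lab/biobert-base-cased-v1.1",
    "BlueBERT": "bionlp/bluebert_pubmed_uncased_L-12_H-768_A-12",
}

# Placeholder names; the text does not name the three WoS-46985-derived datasets.
DATASETS = ["wos_subset_1", "wos_subset_2", "wos_subset_3"]


def experiment_grid():
    """Return every (model, dataset) pair a full comparison would fine-tune."""
    return list(product(CHECKPOINTS, DATASETS))


def load_classifier(model_name, num_labels):
    """Load a tokenizer and a pretrained encoder with a new classification head.

    Needs the third-party `transformers` package and network access. The import
    is deferred so the grid above can be inspected without either.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    checkpoint = CHECKPOINTS[model_name]
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # num_labels sets the size of the randomly initialised output layer,
    # which is then trained together with the encoder during fine-tuning.
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model


if __name__ == "__main__":
    for model_name, dataset_name in experiment_grid():
        print(f"fine-tune {model_name} on {dataset_name}")
```

Each pair in the grid would be trained and scored separately, with `num_labels` set to the number of classes in the dataset at hand. The abstract-based and keyword-based tasks would differ only in which text field is passed to the tokenizer.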