From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain

Agnese Bonfigli,Luca Bacco,Mario Merone,Felice Dell’Orletta
DOI: https://doi.org/10.1016/j.artmed.2024.103003
IF: 7.011
2024-10-25
Artificial Intelligence in Medicine
Abstract:In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-trained Large Language Models (LLMs) within the biomedical domain, a field that poses unique challenges due to its complexity and the specialized nature of its data. Building on the foundation laid by the transformative architecture of Transformers, we investigate the nuanced dynamics of LLMs through a multifaceted lens, focusing on two domain-specific tasks, i.e., Natural Language Inference (NLI) and Named Entity Recognition (NER). Our objective is to bridge the knowledge gap regarding how these models' downstream performances correlate with their capacity to encapsulate task-relevant information. To achieve this goal, we probed and analyzed the inner encoding and attention mechanisms in LLMs, both encoder- and decoder-based, tailored for either general or biomedical-specific applications. This examination occurs before and after the models are fine-tuned across various data volumes. Our findings reveal that the models' downstream effectiveness is intricately linked to specific patterns within their internal mechanisms, shedding light on the nuanced ways in which LLMs process and apply knowledge in the biomedical context. The source code for this paper is available at https://github.com/agnesebonfigli99/LLMs-in-the-Biomedical-Domain .
engineering, biomedical,computer science, artificial intelligence,medical informatics
What problem does this paper attempt to address?