Abstract: The prevalent solution for BioNER uses representation learning techniques coupled with sequence labeling. However, such methods are inherently task-specific, generalize poorly, and often require a dedicated model for each dataset. To leverage the versatile capabilities of recent large language models (LLMs), several efforts have explored generative approaches to entity extraction. Yet these approaches often fall short of the effectiveness of earlier sequence labeling approaches. In this paper, we use the open-source LLM LLaMA2 as the backbone model and design specific instructions to distinguish between different types of entities and datasets. By combining the LLM's understanding of instructions with sequence labeling techniques, we use a mix of datasets to train a model capable of extracting various types of entities. Given that the backbone LLM lacks specialized medical knowledge, we also integrate external entity knowledge bases and employ instruction tuning to compel the model to densely recognize carefully curated entities. Our model VANER, trained with a small fraction of parameters, significantly outperforms previous LLM-based models and, for the first time for an LLM-based model, surpasses the majority of conventional state-of-the-art BioNER systems, achieving the highest F1 scores on three datasets.
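
As a rough illustration of how an instruction can be paired with sequence labeling, the sketch below builds one training example: an instruction naming the dataset and entity type is attached to the tokens, and token-index spans are converted to BIO tags. The instruction wording, the helper names `build_instruction`, `spans_to_bio`, and `make_example`, and the whitespace tokenization are all assumptions made for illustration; they are not the paper's actual templates or code.

```python
from typing import Dict, List, Tuple


def build_instruction(dataset: str, entity_type: str) -> str:
    # Hypothetical template; the paper's real instruction wording is not given here.
    return f"Dataset: {dataset}. Label every {entity_type} entity in the sentence."


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Convert token-index spans (start inclusive, end exclusive) into BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def make_example(
    dataset: str, entity_type: str, sentence: str, spans: List[Tuple[int, int]]
) -> Dict[str, object]:
    tokens = sentence.split()  # whitespace tokenization, an assumption
    return {
        "instruction": build_instruction(dataset, entity_type),
        "tokens": tokens,
        "labels": spans_to_bio(tokens, spans),
    }


# One example for a chemical dataset; the same sentence could be reused with a
# different instruction (dataset and entity type) and different gold spans.
example = make_example(
    "BC5CDR-chem", "chemical", "Aspirin reduces fever in patients", [(0, 1)]
)
print(example["instruction"])
print(list(zip(example["tokens"], example["labels"])))
```

Because the dataset and entity type live in the instruction rather than in the label set, examples from several datasets can share one `B`/`I`/`O` tag vocabulary, which is one way to read the abstract's claim of training a single model on a mix of datasets.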

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to address several key issues in the task of Biomedical Named Entity Recognition (BioNER):

1. **Task Specificity and Generalization Ability**:
   - Current BioNER methods mostly rely on representation learning techniques and sequence labeling, which are usually designed for specific tasks and lack good generalization ability. Each dataset often requires a dedicated model, leading to model redundancy and resource waste.
2. **Effectiveness of Generative Models**:
   - Although some studies attempt to use generative pre-trained models (such as GPT) for entity extraction, these methods usually do not perform as well as traditional sequence labeling methods.
3. **Lack of Domain Knowledge**:
   - Large language models (LLMs), although performing well in natural language processing tasks, lack expertise in the biomedical field. This limits their performance in BioNER tasks.
4. **Challenges of Multi-Dataset Training**:
   - There are multiple datasets in the biomedical field with inconsistent annotation standards. Directly concatenating these datasets leads to annotation inconsistencies, affecting model performance.

### Solutions

To address the above issues, the paper proposes the VANER model, which has the following main features:

1. **Utilizing Large Language Models (LLMs)**:
   - Using the open-source LLaMA2 as the backbone model and designing specific instructions to distinguish different types of entities and datasets. By combining the understanding ability of LLMs with sequence labeling techniques, the model can be trained on various datasets to extract different types of entities.
2. **Dense Biomedical Entity Recognition (DBR)**:
   - To compensate for the lack of knowledge in the biomedical field by LLMs, an external entity knowledge base (such as UMLS) is introduced, and instruction tuning is used to enable the model to densely recognize well-curated entities. This approach not only enhances the model's knowledge understanding ability but also improves the model's convergence speed and performance.
3. **Multi-Dataset Instruction Tuning**:
   - By performing instruction tuning on multiple biomedical NER datasets, the model can better adapt to different annotation standards, thereby improving overall performance.
4. **Resource Efficiency**:
   - This method requires only a single 4090 GPU for training and inference, making it highly resource-efficient.

### Experimental Results

- **Performance Improvement**: VANER achieves state-of-the-art performance on multiple datasets, particularly excelling on the BC4CHEMD, BC5CDR-chem, and Linnaeus datasets.
- **Domain Adaptability**: VANER demonstrates strong domain adaptability, performing well on the unseen CRAFT dataset.
- **Resource Efficiency**: Compared to traditional methods, VANER is more resource-efficient, requiring only a single 4090 GPU for training and inference.

### Summary

By combining the versatility of large language models with sequence labeling techniques, VANER effectively addresses the challenges of task specificity, generalization ability, lack of domain knowledge, and multi-dataset training in BioNER tasks, significantly improving model performance and resource efficiency.
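
To make the Dense Biomedical Entity Recognition idea described above more concrete, here is a minimal sketch of one plausible way to derive dense labels from a knowledge base: greedy longest-match dictionary lookup over tokens. This is an assumption for illustration only; the summary does not say how the paper builds its DBR training data. `TOY_KB` is a tiny hypothetical stand-in for a resource such as UMLS, and `dense_tags` is an invented helper name.

```python
from typing import Dict, List, Tuple

# Toy stand-in for an external knowledge base such as UMLS.
# Keys are lowercased token tuples; values are illustrative entity types.
TOY_KB: Dict[Tuple[str, ...], str] = {
    ("aspirin",): "chemical",
    ("myocardial", "infarction"): "disease",
    ("fever",): "disease",
}


def dense_tags(tokens: List[str], kb: Dict[Tuple[str, ...], str]) -> List[str]:
    """Tag every knowledge-base entry found in the sentence with typed BIO tags.

    Matching is greedy: at each position the longest matching entry wins.
    """
    max_len = max((len(key) for key in kb), default=0)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in kb:
                tags[i] = f"B-{kb[key]}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{kb[key]}"
                matched = n
                break
        i += matched if matched else 1
    return tags


tokens = "Aspirin lowers risk of myocardial infarction".split()
print(list(zip(tokens, dense_tags(tokens, TOY_KB))))
```

In this toy run every entity type present in the dictionary is tagged in a single pass, which is the sense of "dense" assumed here; a single-dataset NER example would instead label only the one entity type that dataset annotates.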

VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition

Advancing entity recognition in biomedicine via instruction tuning of large language models

Inspire the Large Language Model by External Knowledge on BioMedical Named Entity Recognition

LLMs in Biomedicine: A study on clinical Named Entity Recognition

A Knowledge-Enhanced Medical Named Entity Recognition Method that Integrates Pre-Trained Language Models

Advantage of gH-difference on the second-order fuzzy linear differential equations with constant coefficients

Incorporating Large Language Models into Named Entity Recognition: Opportunities and Challenges

Large Language Models Struggle in Token-Level Clinical Named Entity Recognition

NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data

A novel high voltage gain DC-DC converter with reduced components voltage stress

BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

Online biomedical named entities recognition by data and knowledge-driven model

Comparative Analysis of Large Language Models in Chinese Medical Named Entity Recognition

GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models

Disambiguation Model for Bio-Medical Named Entity Recognition

Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition

BoostER: Leveraging Large Language Models for Enhancing Entity Resolution

Research on Chinese medical named entity recognition based on collaborative cooperation of multiple neural network models

Long short-term memory RNN for biomedical named entity recognition

Evaluating Medical Entity Recognition in Health Care: Entity Model Quantitative Study