CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition

Yafeng Zhang, Zilan Yu, Yuang Huang, Jing Tang
2024-08-23
Abstract: Few-shot Named Entity Recognition (NER), the task of identifying named entities with only a limited amount of labeled data, has gained increasing significance in natural language processing. While existing methodologies have shown some effectiveness, such as enriching label semantics through various prompting modes or employing metric learning techniques, their performance exhibits limited robustness across diverse domains due to the lack of rich knowledge in their pre-trained models. To address this issue, we propose CLLMFS, a Contrastive Learning enhanced Large Language Model (LLM) Framework for Few-Shot Named Entity Recognition, achieving promising results with limited training data. Considering the impact of LLM's internal representations on downstream tasks, CLLMFS integrates Low-Rank Adaptation (LoRA) and contrastive learning mechanisms specifically tailored for few-shot NER. By enhancing the model's internal representations, CLLMFS effectively improves both entity boundary awareness ability and entity recognition accuracy. Our method has achieved state-of-the-art performance improvements on F1-score ranging from 2.58% to 97.74% over existing best-performing methods across several recognized benchmarks. Furthermore, through cross-domain NER experiments conducted on multiple datasets, we have further validated the robust generalization capability of our method. Our code will be released in the near future.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### The Problem Addressed by the Paper

This paper aims to address the issue of **Few-Shot Named Entity Recognition (FS-NER)**. Specifically, the research team proposes a new framework, **CLLMFS** (Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition), which aims to perform efficient and accurate entity recognition using limited annotated data.

#### Main Objectives:

1. **Enhance Model Robustness**: By integrating contrastive learning and Low-Rank Adaptation (LoRA), the model's generalization ability across different domains is improved.
2. **Optimize Boundary Awareness**: Improve internal representations to better identify entity boundaries.
3. **Increase Entity Recognition Accuracy**: Enhance the accuracy of entity extraction through the contrastive learning mechanism.
4. **Reduce Overfitting Risk**: Minimize the number of fine-tuning parameters using LoRA to avoid overfitting due to insufficient training data.
5. **Achieve Cross-Domain Applicability**: Validate the method's effectiveness on multiple datasets from different domains, demonstrating its superior performance in cross-domain tasks.

#### Experimental Results:

- The method achieved significant performance improvements on multiple benchmark datasets, with F1 scores increasing by 2.58% to 97.74%.
- Cross-domain experiments further validated the robustness and generalization ability of the method.

In summary, this paper proposes an innovative solution for the few-shot named entity recognition task, aiming to leverage the knowledge of large pre-trained language models to achieve efficient and accurate entity recognition under limited annotated data conditions.
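
To make the two mechanisms named above more concrete, here is a minimal Python sketch. It is not the authors' implementation (their code had not been released at the time of the abstract) and the summary does not give the exact loss or LoRA settings. The sketch assumes a generic supervised contrastive loss over token embeddings, where tokens sharing an entity label are pulled together, and the standard LoRA parameter count for a low-rank update `B @ A`. The function names, the temperature value, and the example dimensions are all illustrative assumptions.

```python
import math


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a LoRA update W + B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in); the frozen
    weight W of shape (d_out, d_in) is not counted.
    """
    return rank * (d_in + d_out)


def cosine(u, v):
    """Cosine similarity of two vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumed stand-in for the paper's).

    For each anchor token, every other token with the same label is a
    positive and all remaining tokens are negatives. The loss is low when
    same-label tokens are close and different-label tokens are far apart.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without any positive is skipped
        # Temperature-scaled similarities to every other token.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

As a rough illustration of objective 4, `lora_trainable_params(4096, 4096, 8)` gives 65,536 trainable parameters for a hypothetical 4096 x 4096 projection, against 16,777,216 for full fine-tuning of the same matrix (about 0.39%). For objectives 2 and 3, the loss is small when embeddings of tokens with the same entity label already cluster together and large when they do not, which is the signal a contrastive term would add during fine-tuning.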