Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER

Andrew Zamai, Andrea Zugarini, Leonardo Rigutini, Marco Ernandes, Marco Maggini
2024-09-18
Abstract: Recently, several specialized instruction-tuned Large Language Models (LLMs) for Named Entity Recognition (NER) have emerged. Compared to traditional NER approaches, these models have demonstrated strong generalization capabilities. Existing LLMs primarily focus on addressing zero-shot NER on Out-of-Domain inputs, while fine-tuning on an extensive number of entity classes that often highly or completely overlap with test sets. In this work instead, we propose SLIMER, an approach designed to tackle never-seen-before entity tags by instructing the model on fewer examples, and by leveraging a prompt enriched with definition and guidelines. Experiments demonstrate that definition and guidelines yield better performance, faster and more robust learning, particularly when labelling unseen named entities. Furthermore, SLIMER performs comparably to state-of-the-art approaches in out-of-domain zero-shot NER, while being trained in a more fair, though certainly more challenging, setting.
Computation and Language
What problem does this paper attempt to address?
This paper addresses how to handle never-seen-before entity tags in zero-shot Named Entity Recognition (NER). Existing large language models (LLMs) generalize well in zero-shot NER, but they are usually fine-tuned on a large number of entity categories that overlap heavily, or even completely, with those in the test set. As a result, their reported performance says little about how well they handle entity tags they have truly never seen.

To address this, the paper proposes SLIMER (Show Less, Instruct More - Entity Recognition). It trains on fewer samples and uses prompts that contain definitions and guidelines, which improves the model's ability to recognize never-seen-before entity tags. Specifically, the design of SLIMER aims to:

1. **Reduce training data**: Use fewer training samples, which reduces the overlap of entity categories between the training set and the test set and simulates a more realistic zero-shot scenario.
2. **Enhance prompt information**: Add entity definitions and annotation guidelines to the model's input prompt, guiding the model to better understand and annotate never-seen-before entity categories.
3. **Improve generalization ability**: Through the two measures above, achieve stronger generalization and higher accuracy on never-seen-before entity tags.

The paper verifies the effectiveness of SLIMER through experiments on standard zero-shot NER benchmark datasets (such as MIT and CrossNER) and on datasets containing never-seen-before entity tags (such as BUSTER). The experimental results show that SLIMER outperforms many existing zero-shot NER methods on never-seen-before entity tags, especially when the entity categories in the dataset differ significantly from those in the training set.
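
The prompt enrichment described above can be illustrated with a minimal sketch. It shows one way a definition and annotation guidelines might be placed in the instruction for a single entity type. The template wording, the `EntityType` class, the `build_prompt` function, and the example definition are all assumptions made for illustration; they are not the prompt or code used by the authors.

```python
from dataclasses import dataclass


@dataclass
class EntityType:
    """Hypothetical container for one entity tag and its prompt enrichment."""
    name: str
    definition: str
    guidelines: str


def build_prompt(text: str, entity: EntityType) -> str:
    """Assemble an instruction prompt for one entity type.

    The wording below is illustrative only, not the paper's template.
    """
    return (
        f"Text: {text}\n\n"
        f"Extract every span of type '{entity.name}' from the text above.\n"
        f"Definition: {entity.definition}\n"
        f"Guidelines: {entity.guidelines}\n"
        "Return the spans as a JSON list of strings, or [] if there are none."
    )


# Example entity type with an invented definition and guidelines.
advisor = EntityType(
    name="legal advisor",
    definition="A firm or person providing legal counsel in a transaction.",
    guidelines="Tag law firms by name; do not tag generic words such as 'lawyers'.",
)

prompt = build_prompt("Acme Corp was advised by Smith & Partners LLP.", advisor)
print(prompt)
```

Building one prompt per entity type keeps each definition and its guidelines next to the tag they describe, so a tag absent from training still arrives with an explanation the model can act on.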