Few-shot Named Entity Recognition via Superposition Concept Discrimination

Jiawei Chen, Hongyu Lin, Xianpei Han, Yaojie Lu, Shanshan Jiang, Bin Dong, Le Sun
2024-03-25
Abstract: Few-shot NER aims to identify entities of target types with only a limited number of illustrative instances. Unfortunately, few-shot NER is severely challenged by the intrinsic precise generalization problem, i.e., it is hard to accurately determine the desired target type due to the ambiguity stemming from information deficiency. In this paper, we propose Superposition Concept Discriminator (SuperCD), which resolves the above challenge via an active learning paradigm. Specifically, a concept extractor is first introduced to identify superposition concepts from illustrative instances, with each concept corresponding to a possible generalization boundary. Then a superposition instance retriever is applied to retrieve corresponding instances of these superposition concepts from a large-scale text corpus. Finally, annotators are asked to annotate the retrieved instances, and these annotated instances together with the original illustrative instances are used to learn FS-NER models. To this end, we learn a universal concept extractor and superposition instance retriever using large-scale, openly available knowledge bases. Experiments show that SuperCD can effectively identify superposition concepts from illustrative instances, retrieve superposition instances from a large-scale corpus, and significantly improve few-shot NER performance with minimal additional effort.
Computer Science
What problem does this paper attempt to address?
This paper addresses the precise generalization problem in few-shot Named Entity Recognition (NER). In few-shot NER, the goal is to identify entities of specific target types given only a limited number of examples. Because so few examples carry too little information, the intended target type is ambiguous, and a model can easily over-generalize or under-generalize. The paper proposes the Superposition Concept Discriminator (SuperCD), which resolves this ambiguity through an active learning paradigm. First, a concept extractor identifies superposition concepts from the illustrative instances, with each concept corresponding to a possible generalization boundary. Then, a superposition instance retriever retrieves instances of these concepts from a large-scale text corpus. Finally, annotators label the retrieved instances, and the few-shot NER model is trained on the annotated instances together with the original illustrative instances. The universal concept extractor and the superposition instance retriever are learned from large-scale, openly available knowledge bases. The approach is evaluated on five few-shot NER benchmark datasets, where it significantly improves performance with minimal additional annotation effort. In short, the paper identifies precise generalization as a core challenge of few-shot NER and addresses it by identifying and exploiting superposition concepts.
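The three steps described above (extract concepts, retrieve instances, annotate and merge) can be illustrated with a toy sketch. It is not the authors' implementation: the paper learns the extractor and retriever from knowledge bases, whereas here both are replaced by simple lookups. The names `TOY_KB`, `TOY_CORPUS`, `extract_concepts`, `retrieve`, and `supercd_round` are all hypothetical, and the annotator is simulated by a callback.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy knowledge base: entity mention -> concepts it belongs to.
TOY_KB: Dict[str, Set[str]] = {
    "Paris": {"city", "capital", "location"},
    "Lyon": {"city", "location"},
    "France": {"country", "location"},
    "Berlin": {"city", "capital", "location"},
}

# Hypothetical unlabeled corpus to retrieve from.
TOY_CORPUS: List[str] = [
    "Lyon hosts a film festival.",
    "France borders Spain.",
    "Berlin has many museums.",
]


def extract_concepts(mentions: List[str]) -> Set[str]:
    """Return the concepts shared by all illustrative mentions.

    Each shared concept is one candidate generalization boundary.
    (Assumption: a lookup stands in for the learned concept extractor.)
    """
    sets = [TOY_KB.get(m, set()) for m in mentions]
    return set.intersection(*sets) if sets else set()


def retrieve(concepts: Set[str], corpus: List[str],
             seen: Set[str]) -> List[Tuple[str, str]]:
    """Find (sentence, mention) pairs whose mention shares a candidate concept.

    (Assumption: substring matching stands in for the learned retriever.)
    """
    hits = []
    for sentence in corpus:
        for mention, mention_concepts in TOY_KB.items():
            if mention in seen or mention not in sentence:
                continue
            if mention_concepts & concepts:
                hits.append((sentence, mention))
    return hits


def supercd_round(illustrative: List[Tuple[str, str]], corpus: List[str],
                  annotate: Callable[[str, str], bool]
                  ) -> List[Tuple[str, str, bool]]:
    """Run one extract -> retrieve -> annotate round and merge the results."""
    mentions = [m for _, m in illustrative]
    concepts = extract_concepts(mentions)
    candidates = retrieve(concepts, corpus, set(mentions))
    labeled = [(s, m, annotate(s, m)) for s, m in candidates]
    originals = [(s, m, True) for s, m in illustrative]
    return originals + labeled


# Simulated annotator whose intended target type is "city".
examples = [("Paris is lovely in spring.", "Paris")]
training_set = supercd_round(
    examples, TOY_CORPUS, lambda s, m: "city" in TOY_KB[m])
```

With the single example "Paris", the candidate concepts are city, capital, and location, so the target type is ambiguous. The simulated annotator accepts "Lyon" and "Berlin" and rejects "France", which indicates that the intended boundary is "city" rather than "capital" or "location".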