GlobalNER: Incorporating Non-local Information into Named Entity Recognition

Chiao-Wei Hsu, Keh-Yih Su
2023-03-06
Abstract: Many Natural Language Processing (NLP) tasks increasingly call for incorporating knowledge beyond the local context to further improve performance. However, there is little related work on Named Entity Recognition (NER), one of the foundations of NLP. In particular, no prior study has examined query generation and re-ranking for retrieving related information with the aim of improving NER. This work demonstrates the effectiveness of a DNN-based query generation method and a mention-aware re-ranking architecture based on BERTScore, both designed specifically for NER. The resulting system achieves a state-of-the-art micro-F1 score of 61.56 on the WNUT17 dataset.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper aims to address the issue of how to effectively utilize external information to improve performance in the Named Entity Recognition (NER) task. Specifically, the paper focuses on the following two main problems:

1. **Query Generation Problem**:
   - Current methods typically use the entire sentence as a query to retrieve related sentences, which may introduce a lot of irrelevant distracting information. For example, for a sentence containing "black widow" and "london fashion week," using the whole sentence as a query might retrieve content related to "dresses" and "party" rather than information directly related to the named entities.
2. **Re-ranking Problem**:
   - When using BERTScore to re-rank the retrieved sentences, BERTScore may not accurately reflect the actual usefulness of these sentences for the NER task. For instance, the retrieved sentences might contain words like "weather" and "view," which match some words in the original sentence but do not contain key information that helps identify "Empire State Building" as a location entity.

### Solutions

To mitigate the above problems, the paper proposes the following methods:

1. **Using Only Named Entity Mentions for Query**:
   - It proposes using only the named entity mentions (NE-mention) in the sentence as query terms instead of using the entire sentence. This can avoid introducing irrelevant distracting information and retrieve more valuable reference sentences.
2. **Mention-based Re-ranker**:
   - A new re-ranker is proposed, which gives higher weight to named entity mentions during re-ranking. In this way, it can more accurately select reference sentences that are useful for the NER task.
3. **Unified Source-aware Framework**:
   - A unified framework is proposed that can retrieve non-local information from different sources (such as the internet, Wikipedia, and given corpora) and weight these reference sentences based on the reliability of the information.
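The first two ideas above (mention-only queries and mention-weighted re-ranking) can be illustrated with a small toy sketch. This is not the paper's implementation: the function names, the `mention_weight` parameter, and the weighting scheme are all assumptions made for illustration, and a character-trigram overlap stands in for the cosine similarity between contextual BERT embeddings that BERTScore actually uses. The mention flags are assumed to come from some earlier tagging pass.

```python
from typing import List, Set


def char_trigrams(token: str) -> Set[str]:
    """Character trigrams of a padded, lower-cased token."""
    padded = f"#{token.lower()}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def token_sim(a: str, b: str) -> float:
    """Toy stand-in (assumption) for embedding cosine similarity:
    Jaccard overlap of character trigrams."""
    ta, tb = char_trigrams(a), char_trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def mention_query(tokens: List[str], is_mention: List[bool]) -> str:
    """Build a retrieval query from the mention tokens only."""
    return " ".join(t for t, m in zip(tokens, is_mention) if m)


def mention_weighted_score(tokens: List[str], is_mention: List[bool],
                           candidate: List[str],
                           mention_weight: float = 3.0) -> float:
    """Recall-style greedy matching: each input token is matched to its
    most similar candidate token, and mention tokens count
    `mention_weight` times as much as other tokens (hypothetical scheme)."""
    if not tokens or not candidate:
        return 0.0
    total, weight_sum = 0.0, 0.0
    for tok, flag in zip(tokens, is_mention):
        w = mention_weight if flag else 1.0
        total += w * max(token_sim(tok, c) for c in candidate)
        weight_sum += w
    return total / weight_sum


def rerank(tokens: List[str], is_mention: List[bool],
           candidates: List[str], mention_weight: float = 3.0) -> List[str]:
    """Sort candidate sentences by descending mention-weighted score."""
    return sorted(
        candidates,
        key=lambda c: mention_weighted_score(
            tokens, is_mention, c.split(), mention_weight),
        reverse=True,
    )


# Made-up example modeled on the "Empire State Building" case above.
sentence = "nice weather and view from the empire state building".split()
flags = [False] * 6 + [True] * 3  # last three tokens form the mention
candidates = [
    "the weather is nice and the view is great today",
    "the empire state building is a skyscraper in new york city",
]

query = mention_query(sentence, flags)
plain_top = rerank(sentence, flags, candidates, mention_weight=1.0)[0]
weighted_top = rerank(sentence, flags, candidates, mention_weight=3.0)[0]
```

With uniform weights (`mention_weight=1.0`) the "weather" sentence ranks first because it shares more surface words with the input; with mentions up-weighted, the sentence that actually describes the entity moves to the top. The query built from mentions alone is simply `"empire state building"`.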
### Experimental Results

The paper conducted experiments on the WNUT17 dataset and successfully achieved a micro-average F1 score of 61.56, surpassing the current state-of-the-art methods. This indicates that the proposed methods have significant advantages in handling sentences that contain less context and require more external world knowledge.
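For readers unfamiliar with the metric, micro-averaged F1 pools true positives, predictions, and gold entities across all entity types before computing precision and recall. The sketch below assumes exact span-and-type matching, which is the usual entity-level convention; the entity tuples are made-up data and the function name is hypothetical, not the paper's evaluation script.

```python
from typing import List, Set, Tuple

# (start, end, type) for one entity; spans and labels here are illustrative.
Entity = Tuple[int, int, str]


def micro_f1(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Entity-level micro-F1 over a list of sentences.

    An entity counts as correct only if span and type both match exactly.
    Counts are pooled over all sentences and types before averaging.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_gold = sum(len(g) for g in gold)
    n_pred = sum(len(p) for p in pred)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Two sentences: 3 gold entities, 3 predicted, 2 exact matches.
gold = [{(0, 2, "person")}, {(5, 8, "location"), (0, 1, "group")}]
pred = [{(0, 2, "person")}, {(5, 8, "location"), (2, 3, "product")}]
score = micro_f1(gold, pred)  # precision = recall = 2/3
```

Scores such as 61.56 are this quantity multiplied by 100.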