Is a Common Phrase an Entity Mention or Not? Dual Representations for Domain-Specific Named Entity Recognition.

Jiangtao Zhang, Juanzi Li, Li, Yixin Cao, Lei Hou, Shuai Wang
DOI: https://doi.org/10.1007/978-3-319-91452-7_53
2018-01-01
Abstract: Named Entity Recognition (NER) for specific domains is critical for building and managing domain-specific knowledge bases, but conventional NER methods cannot be applied to specific domains effectively. We found that one of the reasons is the problem of common-phrase-like entity mentions, which are prevalent in many domains. That is, many common phrases that occur frequently in general corpora may or may not be treated as named entities in a specific domain. Determining whether a common phrase is an entity mention is therefore a challenge. To address this issue, we present a novel BLSTM-based NER model tailored for specific domains that learns dual representations for each word. It learns not only general domain knowledge, derived from a large-scale external general corpus via a word embedding model, but also domain-specific knowledge, by training a stacked deep neural network (SDNN) that integrates the results of a low-cost pre-entity-linking process. Extensive experiments on a real-world dataset of movie comments demonstrate the superiority of our model over existing state-of-the-art methods.
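The idea of a dual representation can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables, their values, and the names `GENERAL_EMB`, `DOMAIN_EMB`, and `dual_representation` are hypothetical, and simple concatenation is assumed as the way the two views of a word are combined before being passed to a sequence tagger such as a BLSTM.

```python
from typing import Dict, List

# Hypothetical general-domain word vectors (in practice, learned from a
# large general corpus by a word embedding model).
GENERAL_EMB: Dict[str, List[float]] = {
    "the": [0.1, 0.0],
    "avengers": [0.4, 0.3],
    "rocks": [0.2, 0.5],
}

# Hypothetical domain-specific vectors (in the paper these come from an SDNN
# that uses pre-entity-linking results; here they are made-up constants).
DOMAIN_EMB: Dict[str, List[float]] = {
    "the": [0.9],
    "avengers": [0.9],
    "rocks": [0.0],
}

# Fallback vectors for out-of-vocabulary tokens (an assumption of this sketch).
UNK_GENERAL: List[float] = [0.0, 0.0]
UNK_DOMAIN: List[float] = [0.0]


def dual_representation(token: str) -> List[float]:
    """Concatenate the general and the domain-specific vector of a token."""
    key = token.lower()
    general = GENERAL_EMB.get(key, UNK_GENERAL)
    domain = DOMAIN_EMB.get(key, UNK_DOMAIN)
    return general + domain  # list concatenation builds a new list


def encode_sentence(tokens: List[str]) -> List[List[float]]:
    """Build the per-token input sequence a tagger would consume."""
    return [dual_representation(t) for t in tokens]
```

In this toy setup the common phrase "The Avengers" receives a high domain-specific component even though its general-corpus vectors look like those of ordinary words, which is the kind of signal a downstream tagger could use to treat it as an entity mention in movie comments.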