Lexicon Infused Phrase Embeddings for Named Entity Resolution

Alexandre Passos, Vineet Kumar, Andrew McCallum
DOI: https://doi.org/10.48550/arXiv.1404.5367
2014-04-22
Computation and Language
Abstract: Most state-of-the-art approaches for named-entity recognition (NER) use semi-supervised information in the form of word clusters and lexicons. Recently, neural network-based language models have been explored, since as a byproduct they generate highly informative vector representations for words, known as word embeddings. In this paper we present two contributions: a new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both CoNLL and OntoNotes NER. Our system achieves an F1 score of 90.90 on the test set for CoNLL 2003, significantly better than any previous system trained on public data, and matching a system employing massive private industrial query-log data.