Boosting Arabic Named-Entity Recognition With Multi-Attention Layer.

Mohammed Nadher Abdo Ali, Guanzheng Tan, Aamir Hussain
DOI: https://doi.org/10.1109/ACCESS.2019.2909641
IF: 3.9
2019-01-01
IEEE Access
Abstract: Sequence labeling models built on recurrent neural network variants, such as long short-term memory (LSTM) and gated recurrent unit (GRU), show promising performance on several natural language processing (NLP) problems, including named-entity recognition (NER). Most existing models use word embeddings to capture similarities between words. However, they fall short when handling previously unobserved or infrequent words. Moreover, the attention mechanism has been used to improve sequence labeling tasks. In this paper, we propose an efficient multi-attention layer system for the Arabic named-entity recognition (ANER) task. In addition to word-level embeddings, we adopt character-level embeddings and combine the two via an embedding-level attention mechanism. The output is fed into an encoder unit with a bidirectional LSTM, followed by a self-attention layer that is used to boost system performance. Our model achieves an F1 score of approximately 91% on "ANERCorpus". The overall experimental results demonstrate that our method is superior to other systems. Our approach, using a multi-layer attention mechanism, yields a new state-of-the-art result for ANER.
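The embedding-level attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the gating form below (a per-dimension sigmoid gate that interpolates between the word and character vectors) is an assumption, as are the function names, the toy dimension, and the random weights; it is not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attention_combine(word_vec, char_vec, w_word, w_char, w_out):
    """Combine word- and character-level vectors with a learned gate.

    Assumed form (not taken from the paper):
        hidden   = tanh(w_word @ word_vec + w_char @ char_vec)
        gate     = sigmoid(w_out @ hidden)
        combined = gate * word_vec + (1 - gate) * char_vec
    """
    hidden = [
        math.tanh(a + b)
        for a, b in zip(matvec(w_word, word_vec), matvec(w_char, char_vec))
    ]
    gate = [sigmoid(s) for s in matvec(w_out, hidden)]
    combined = [
        g * w + (1.0 - g) * c for g, w, c in zip(gate, word_vec, char_vec)
    ]
    return combined, gate


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4  # toy embedding size; both vectors must share this dimension
w_word, w_char, w_out = (random_matrix(dim, dim, rng) for _ in range(3))

word_vec = [0.2, -0.1, 0.4, 0.0]   # toy word-level embedding
char_vec = [0.5, 0.3, -0.2, 0.1]   # toy character-level representation

combined, gate = attention_combine(word_vec, char_vec, w_word, w_char, w_out)
```

With a gate of this kind, each component of the combined vector lies between the corresponding word and character components, so a model could lean on the character-level representation for rare or unseen words. In the system the abstract describes, the combined vectors would then feed the bidirectional LSTM encoder and the self-attention layer, which this sketch does not cover.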