Classical Arabic Named Entity Recognition Using Variant Deep Neural Network Architectures and BERT

Norah Alsaaran,Maha Alrabiah
DOI: https://doi.org/10.1109/access.2021.3092261
IF: 3.9
2021-01-01
IEEE Access
Abstract:Recurrent Neural Networks (RNNs) and transformers are deep learning models that have achieved remarkable success in several Natural Language Processing (NLP) tasks since they do not rely on handcrafted features nor enormous knowledge resources. Named Entity Recognition (NER) is an essential NLP task that is used in many applications such as information retrieval, question answering, and machine translation. NER aims to locate, extract, and classify named entities into predefined categories such as person, organization and location. Arabic NER is considered a challenging task because of the complexity and the unique characteristics of Arabic. Most of the previous research on deep learning based-Arabic NER focused on Modern Standard Arabic and Dialectal Arabic, which are different variations from Classical Arabic. In this paper, we investigate deep learning-based Classical Arabic NER using different deep neural network architectures and a BERT based contextual language model that is trained on general domain Arabic text. We propose two RNN-based models by fine-tunning the pretrained BERT language model to recognize and classify named entities from Classical Arabic. The pre-trained BERT contextual language model representations were used as input features to a BGRU/BLSTM model and were fine-tuned using a Classical Arabic NER dataset. In addition, we explore variant architectures of the proposed BERT-BGRU/BLSTM-CRF models. Experimentations showed that the BERT-BGRU-CRF model outperformed the other models by achieving an F-measure of 94.76% on the CANERCorpus. To the best of our knowledge, this is the first work that aims to recognize named entities in Classical Arabic using deep learning.
computer science, information systems,telecommunications,engineering, electrical & electronic
What problem does this paper attempt to address?