A Named Entity Recognition Approach for Albanian Using Deep Learning

Evis Trandafili, Elinda Kajo Meçe, Enea Duka
DOI: https://doi.org/10.1007/978-3-030-36617-9_6
2020-01-01
Abstract: Named Entity Recognition (NER) is an information extraction task that deals with the identification and tagging of generic named entities and/or domain-specific named entities. NER is a crucial task in semantic analysis of text data, making it a key component in different Natural Language Processing applications such as Question Answering, Machine Translation, etc. In this paper we propose an approach for Named Entity Recognition based on Deep Learning models using an Albanian corpus. We focused on generic named entities such as person names, geographical locations, names of organizations/institutions and other categories. Given that there is no publicly available annotated Albanian corpus, we have manually created one. Furthermore, we have built a deep neural network using LSTM cells as the hidden layers and a Conditional Random Field as the output, using both word and character tagging. Taking into consideration the complexity of the Albanian language and the little research done in NLP for Albanian, the results achieved are promising. The results obtained from the experiments demonstrate that the NER performance can be further improved by using a larger annotated corpus to train the model.
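To make the role of the Conditional Random Field output concrete, the following is a minimal sketch of Viterbi decoding, the standard way a CRF layer selects the best tag sequence from per-token scores. It is not the authors' implementation: the function name `viterbi_decode`, the tag set `TAGS`, and all score values are hypothetical, and in a real model the emission scores would be produced by the LSTM layers while the transition scores would be learned during training.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j for token t (assumed to come from the LSTM).
    transitions[i][j] : score of moving from tag i to tag j (assumed learned by the CRF).
    """
    if not emissions:
        return []

    n_tags = len(transitions)
    # Best score of any path ending in each tag at the first token.
    score = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_score: List[float] = []
        pointers: List[int] = []
        for j in range(n_tags):
            # Pick the previous tag that gives the best path into tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Hypothetical tag set and scores, for illustration only.
TAGS = ["O", "PER", "LOC"]
emissions = [
    [0.1, 2.0, 0.3],  # token 1: strongest score for PER
    [1.5, 0.2, 0.1],  # token 2: strongest score for O
    [0.2, 0.1, 1.8],  # token 3: strongest score for LOC
]
transitions = [[0.0] * 3 for _ in range(3)]  # no preference between tag pairs

print([TAGS[i] for i in viterbi_decode(emissions, transitions)])
```

With all transition scores set to zero, the decoder reduces to choosing the best tag per token. Non-zero transition scores let the output layer reject tag sequences that are locally plausible but globally unlikely, which is the usual motivation for placing a CRF on top of LSTM layers.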