Improving Feature Extraction Using a Hybrid of CNN and LSTM for Entity Identification
Elham Parsaeimehr, Mehdi Fartash, Javad Akbari Torkestani
DOI: https://doi.org/10.1007/s11063-022-11122-y
IF: 2.565
2023-01-04
Neural Processing Letters
Abstract: In recent years, deep neural networks have been introduced as an effective learning method in many natural language processing (NLP) applications. One of these applications is named entity recognition (NER), which plays a vital role in NLP systems (e.g., question answering and machine translation systems). Because entity extraction traditionally requires substantial effort to engineer features manually (e.g., building specialized dictionaries), deep neural network methods have been introduced to overcome this challenge. In this work, we introduce a novel architecture that combines two different deep learning models, namely a convolutional neural network (CNN) and a long short-term memory (LSTM) network, to extract more effective features from an input sentence. The CNN extracts the local features of the individual words, and the LSTM network models the contextual information of the input sentence. In addition, an attention layer in our architecture further improves performance. We conducted our experiments on two public datasets, CoNLL03 and ACE05. Evaluations demonstrate that employing a word-level CNN to capture local information of the input sentence and an attention mechanism to focus on more relevant words enhances the performance of the NER system and establishes state-of-the-art results. Our architecture achieves F-scores of 92.00 and 86.34 on CoNLL03 and ACE05, respectively. Compared with previous works that rely on manual feature extraction or employ fewer components in their systems, the proposed architecture demonstrates superior accuracy.
computer science, artificial intelligence
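
The abstract reports F-scores but does not say how they are computed. As an illustration only, the sketch below assumes the convention commonly used for CoNLL03: BIO-tagged sequences and entity-level scoring, where a predicted entity counts as correct only if its type and boundaries both match the gold annotation exactly. The function names `extract_spans` and `entity_f1` are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray "I-" tag is treated leniently as the start of a new span.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match, entity-level F1 over a list of sentences."""
    gold_spans, pred_spans = set(), set()
    for sent_id, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with its sentence index so spans stay distinct.
        gold_spans |= {(sent_id,) + s for s in extract_spans(g)}
        pred_spans |= {(sent_id,) + s for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [["B-PER", "I-PER", "O", "B-LOC"]]
    pred = [["B-PER", "I-PER", "O", "O"]]
    # One of two gold entities is found, and the single prediction is correct.
    print(f"F1 = {100 * entity_f1(gold, pred):.2f}")
```

In the toy example, precision is 1.0 and recall is 0.5, so the harmonic mean is 2/3. Scores such as 92.00 are this quantity expressed as a percentage, under the scoring assumptions stated above.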