MIFM: Multi-Granularity Information Fusion Model for Chinese Named Entity Recognition

Naixin Zhang, Guangluan Xu, Zequen Zhang, Feng Li
DOI: https://doi.org/10.1109/access.2019.2958959
IF: 3.9
2019-01-01
IEEE Access
Abstract: Chinese Named Entity Recognition (Chinese NER) is an important task in the field of Chinese natural language processing. Entity boundaries are difficult to identify because Chinese text lacks natural delimiters between words. Existing methods fall into two major groups according to their model input: word-based models and character-based models. However, word-based models depend on the output of Chinese Word Segmentation (CWS), and character-based models cannot make sufficient use of word-level information. In this paper, we propose a multi-granularity information fusion model (MIFM) for the Chinese NER task. We introduce a novel multi-granularity embedding layer that uses an attention mechanism and an information gate to fuse character-level and word-level features. The resulting embeddings are dynamic and data-specific because they are computed from the surrounding context. Moreover, we apply a reverse stacked LSTM layer to capture deep semantic information for a sequence. Experiments on two benchmark datasets, MSRA and ResumeNER, show that our approach can effectively improve the performance of Chinese NER.
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic
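
The abstract describes fusing character-level and word-level features with an attention mechanism and an information gate, but it does not give the equations. The following is only a minimal sketch of that general idea, not the paper's actual layer. It assumes scaled dot-product attention over candidate word vectors and an element-wise sigmoid gate with no learned parameters. The names `fuse`, `char_vec`, and `word_vecs` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse(char_vec, word_vecs):
    """Fuse one character vector with the vectors of candidate words.

    Assumption: all vectors share the same dimension. A trained model
    would use learned projections; this sketch uses none.
    """
    if not word_vecs:
        # No candidate words: fall back to the character vector alone.
        return list(char_vec)

    # Attention: score each candidate word against the character.
    scale = math.sqrt(len(char_vec))
    weights = softmax([dot(char_vec, w) / scale for w in word_vecs])

    # Weighted sum of word vectors gives one word-level summary vector.
    word_summary = [
        sum(a * w[j] for a, w in zip(weights, word_vecs))
        for j in range(len(char_vec))
    ]

    # Information gate: per-dimension mix of character and word features.
    gate = [sigmoid(c + w) for c, w in zip(char_vec, word_summary)]
    return [g * c + (1.0 - g) * w
            for g, c, w in zip(gate, char_vec, word_summary)]
```

Because the attention weights depend on the character vector, the same word can contribute differently in different contexts, which is one way to read the abstract's claim that the embeddings are dynamic and data-specific.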