Incorporating lexicon knowledge into Chinese NER using hierarchical meta-embedding

Shuo Liu, Yinliang Zhao
DOI: https://doi.org/10.1117/12.2626480
2021-01-01
Abstract: Integrating lexicon knowledge into character-based methods can improve the performance of neural network models for Chinese named entity recognition (NER). For example, Lattice LSTM [1] and WC-LSTM [2] perform well on several public Chinese NER datasets. However, the directed acyclic graph (DAG) structure of Lattice LSTM makes it difficult to train in minibatches. In addition, Lattice LSTM and WC-LSTM incorporate word-level semantics only into the representation of the first or last character of each word; the characters inside the word are ignored. They also have difficulty handling conflicts between potential words in the lexicon. To alleviate these limitations, this work proposes an attention-based hierarchical meta-embedding method (AHME) for incorporating lexicon knowledge into Chinese NER. The proposed model can incorporate word boundary information into character representations and handle conflicts between the potential words being incorporated. Experimental results on four datasets show that the method outperforms state-of-the-art baselines.
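The two limitations named in the abstract (ignored inside characters and conflicting potential words) can be made concrete with a small example. The following is a minimal sketch, not the AHME model: the toy lexicon, the example sentence, the B/M/E role labels, and the function names `match_lexicon` and `char_roles` are all assumptions introduced here for illustration.

```python
# Illustrative sketch only: enumerates lexicon matches and shows how
# overlapping "potential words" assign conflicting roles to one character.
# The lexicon, sentence, and B/M/E labels are hypothetical choices.

def match_lexicon(sentence, lexicon, max_len=4):
    """Return (start, end, word) for every multi-character lexicon word found."""
    matches = []
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


def char_roles(sentence, matches):
    """For each character, collect the roles it plays in matched words:
    B = first character, M = inside character, E = last character."""
    roles = [set() for _ in sentence]
    for start, end, _ in matches:
        roles[start].add("B")
        roles[end - 1].add("E")
        for i in range(start + 1, end - 1):
            roles[i].add("M")
    return roles


sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}

matches = match_lexicon(sentence, lexicon)
roles = char_roles(sentence, matches)

for ch, r in zip(sentence, roles):
    print(ch, sorted(r))
```

In this toy sentence, the character 市 ends the candidate word 南京市 and begins the candidate word 市长, so it carries both an E and a B role. Characters such as 江 and 大 sit inside 长江大桥, which is the kind of position a first-or-last-character scheme would leave without word-level information.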