Word-Character Graph Convolution Network for Chinese Named Entity Recognition.

Zhuo Tang, Boyan Wan, Li Yang
DOI: https://doi.org/10.1109/taslp.2020.2994436
2020-01-01
Abstract: Recent studies try to integrate word information into character-based Chinese NER by modifying the structure of the standard BiLSTM-CRF model. They follow the paradigm of explicitly modeling forward and backward sequences, adopting an LSTM variant that takes both characters and words as input for each direction. Although this enriches the representations, these models cannot fully exploit the interaction between future and past contexts. In this paper, we propose a novel word-character graph convolution network (WC-GCN), which uses a cross GCN block to simultaneously process the word-character directed acyclic graphs (DAGs) of the two directions. To better capture long-distance dependencies, a global attention GCN block is introduced to learn node representations conditioned on a global context. In both blocks, unlike previous works where each word is attached to its associated character or taken as a shortcut between LSTM cells, words and characters are treated equally as nodes in the graph and have their own instance-specific representations. Experiments on four widely used datasets show that the proposed model can work standalone or together with the standard BiLSTM. Both forms outperform previous LSTM-based models without training on extra corpora; only an external lexicon and its corresponding pretrained character and word embeddings are needed.
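To make the idea of a word-character DAG concrete, the following is a minimal sketch of how such a graph might be built from a sentence and a lexicon. The abstract does not specify the exact edge construction, so everything here is an assumption for illustration: the function names `build_word_char_dag` and `reverse_edges`, the `max_word_len` parameter, and the choice to link each matched word from the character preceding it to the character following it. Only the general idea (characters and lexicon words are both graph nodes, and there is one DAG per direction) comes from the abstract.

```python
def build_word_char_dag(sentence, lexicon, max_word_len=4):
    """Build a forward word-character DAG (illustrative assumption, not the paper's exact scheme).

    Nodes 0..n-1 are the characters of the sentence; every lexicon word matched
    in the sentence is appended as an additional node of equal standing.
    Returns (nodes, edges), where edges are (source, target) index pairs.
    """
    n = len(sentence)
    nodes = list(sentence)
    # Each character points to the next character.
    edges = [(i, i + 1) for i in range(n - 1)]
    for start in range(n):
        # Only multi-character spans are treated as words.
        for end in range(start + 2, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                word_node = len(nodes)
                nodes.append(word)
                if start > 0:
                    # Character just before the word -> word node.
                    edges.append((start - 1, word_node))
                if end < n:
                    # Word node -> character just after the word.
                    edges.append((word_node, end))
    return nodes, edges


def reverse_edges(edges):
    """Flip every edge to obtain the backward-direction DAG."""
    return [(dst, src) for src, dst in edges]


# Hypothetical example sentence and lexicon.
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
nodes, forward = build_word_char_dag(sentence, lexicon)
backward = reverse_edges(forward)
print(nodes[len(sentence):])  # the matched word nodes
```

In this sketch the forward and backward edge lists describe the two directed graphs that a model like the one in the abstract would process together; how node features are actually propagated and combined is left out, since the abstract gives no formulas for it.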