Local or Global? A Novel Transformer for Chinese Named Entity Recognition Based on Multi-View and Sliding Attention

Yuke Wang, Ling Lu, Wu Yang, Yinong Chen
DOI: https://doi.org/10.1007/s13042-023-02023-0
2024-01-01
International Journal of Machine Learning and Cybernetics
Abstract: The Transformer is widely used in natural language processing (NLP) tasks because of its parallelism and its ability to model long texts. However, it does not perform well on Chinese named entity recognition (NER). Although distance, direction, and both global and local views of the sequence are all important for NER, the traditional Transformer structure captures only distance and partial global information through its fully connected self-attention mechanism. In this paper, we propose a multi-view and sliding attention (MVSA) model to enhance the Transformer's ability to model Chinese character-word features in the NER task. MVSA combines directional information to extract character-word features from multiple views, proposes a weighted ternary fusion method for feature fusion, and uses a sliding attention mechanism to enhance the local representation ability of the model. Experiments on five Chinese NER datasets show that MVSA outperforms CNN-based, LSTM-based, and traditional Transformer-based models.
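The abstract contrasts fully connected self-attention with a sliding mechanism that strengthens local representation. The sketch below illustrates only the general idea of window-restricted attention in plain Python; it is not the MVSA implementation. The function names, the symmetric window, and the reuse of one matrix for queries, keys, and values are all assumptions made for illustration, and the abstract does not describe how the paper defines its window or combines it with the multi-view features.

```python
import math


def sliding_window_mask(seq_len, window):
    """Return a seq_len x seq_len boolean mask (hypothetical helper).

    Position i may attend to position j only when |i - j| <= window.
    """
    return [[abs(i - j) <= window for j in range(seq_len)]
            for i in range(seq_len)]


def sliding_attention(q, k, v, window):
    """Scaled dot-product attention restricted to a local window.

    q, k, v are lists of equal-length float vectors, one per position.
    Returns (outputs, weights); weights[i][j] is 0.0 outside the window.
    """
    n, d = len(q), len(q[0])
    mask = sliding_window_mask(n, window)
    outputs, all_weights = [], []
    for i in range(n):
        # Score only the positions that fall inside the window of i.
        scores = {
            j: sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(n) if mask[i][j]
        }
        # Numerically stable softmax over the in-window scores.
        top = max(scores.values())
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        weights = [exps.get(j, 0.0) / total for j in range(n)]
        # Weighted sum of value vectors.
        outputs.append([
            sum(weights[j] * v[j][c] for j in range(n))
            for c in range(len(v[0]))
        ])
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: 5 positions with 2-dimensional features (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]
out, w = sliding_attention(x, x, x, window=1)
```

With `window=1`, each position attends only to itself and its immediate neighbours, so distant positions receive zero weight. Setting `window` to at least `len(x) - 1` recovers ordinary fully connected attention.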