Abstract: Recently, lexicon-based LSTMs and pre-trained language models have been combined for Chinese Named Entity Recognition (NER), achieving state-of-the-art (SOTA) performance on several Chinese benchmark datasets. However, existing lexicon-based models incorporate lexicon features only through shallow, randomly initialized encoding layers and do not integrate them into the bottom layer of the pre-trained language model to mine deep lexicon knowledge. To address this issue, we propose a novel BERT-based Enhanced Lexicon Adapter (BLA) model that fuses external lexicon features deeply into the pre-trained language model BERT. Specifically, the external lexicon knowledge is integrated into the deep Transformer layers of BERT through a lexicon adapter mechanism. Compared with existing methods, our model achieves a genuinely deep fusion of lexicon knowledge and BERT representations, effectively capturing entity boundaries and word information. In addition, given the value of high-level global semantic features in alleviating word ambiguity and precisely segmenting entity boundaries in Chinese NER, recasting the sequence labeling task as a sequence generation task offers a new way to extract global semantic features. We therefore explore strategies for fusing local lexicon information and extracting global semantic features for entity category labeling. Specifically, we adopt a sequence-to-sequence (Seq2Seq) framework with a pointer network as the main model architecture, in which the pointing function implements a custom attention mechanism and models the interactions between the source text and the semantic embedding through the generated probability. Furthermore, the decoder with the pointer mechanism generates the target sequence autoregressively. Experiments on several Chinese benchmark datasets indicate that the proposed model achieves remarkable improvements over current lexicon-based methods and significantly outperforms the current SOTA models.
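The pointer mechanism mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model's custom pointing function, so this example assumes plain dot-product attention as a stand-in; the names `pointer_distribution` and `greedy_pointer_decode` and the toy vectors are hypothetical and are not taken from the paper. It only shows the general idea: at each step the decoder scores every source position, normalizes the scores into a probability distribution, and points at a position, feeding the result back in autoregressively.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_distribution(decoder_state, encoder_states):
    """Probability of pointing at each source position.

    Assumption: dot-product scoring stands in for the paper's
    custom attention, which the abstract does not describe.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    return softmax(scores)


def greedy_pointer_decode(encoder_states, start_state, max_steps):
    """Autoregressive decoding: each step points at one source position.

    Assumption: the next decoder state is simply the encoder state
    that was pointed at; a real decoder would use a learned update.
    """
    state = start_state
    pointed = []
    for _ in range(max_steps):
        probs = pointer_distribution(state, encoder_states)
        idx = max(range(len(probs)), key=probs.__getitem__)
        pointed.append(idx)
        state = encoder_states[idx]
    return pointed


# Toy source representations for three characters (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
probs = pointer_distribution([0.0, 3.0], encoder_states)
indices = greedy_pointer_decode(encoder_states, [0.0, 3.0], max_steps=2)
```

In this sketch `probs` is a distribution over the three source positions and `indices` lists the positions selected at each decoding step. A trained model would replace the dot product with learned projections and would map pointed positions to entity spans and category labels.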

Lexicon enhanced Chinese named entity recognition with pointer network
