Abstract: Integrating lexical information into Chinese character embeddings is an effective way to address Chinese named entity recognition (NER). However, most existing methods focus only on discovering named entity boundaries and consider only the words matched by each Chinese character. They ignore the association between a character and its left and right matching words, as well as the local semantic information of the character’s neighborhood, which is crucial for Chinese NER. Chinese contains a large number of polysemous words, that is, single words with multiple meanings. Consequently, without sufficient contextual information, the intended meaning of a text can be hard to determine, which gives rise to ambiguity. We consider how to handle, more simply and effectively, the entity ambiguity that polysemous words cause in Chinese texts across different contexts. In this paper, we propose using graph attention networks to model the relations between matching words and neighboring characters, as well as among the matching words themselves, and to add left- and right-matching words directly, using the semantic information provided by the local lexicon. Moreover, we propose a short-sequence convolutional neural network (SSCNN), which uses the shorter subsequences generated and encoded by a sliding window module to enhance the perception of local information about each character. Compared with widely used Chinese NER models, our approach achieves improvements of 1.18%, 0.29%, 0.18%, and 1.1% on the four benchmark datasets Weibo, Resume, OntoNotes, and E-commerce, respectively, which demonstrates the effectiveness of the model.
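
The sliding window idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the window size, the padding scheme, or the encoder, so the function name `sliding_windows`, the default `size=3`, and the `<pad>` token below are all assumptions made for illustration. The sketch covers only the generation of short subsequences around each character, not the convolutional encoding that the SSCNN would apply to them.

```python
def sliding_windows(chars, size=3, pad="<pad>"):
    """Return one window of `size` tokens centred on each character.

    Hypothetical helper: the window size and padding token are
    illustrative assumptions, not values taken from the paper.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so each window has a centre")
    half = size // 2
    # Pad both ends so the first and last characters also get full windows.
    padded = [pad] * half + list(chars) + [pad] * half
    return [padded[i:i + size] for i in range(len(chars))]


# A classic ambiguous sentence: "Nanjing Yangtze River Bridge".
sentence = "南京市长江大桥"
windows = sliding_windows(sentence, size=3)
for ch, win in zip(sentence, windows):
    print(ch, win)
```

Each character is paired with a short subsequence of its immediate neighbors, which is the kind of local context a convolution over short sequences could then encode.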

Error feedback based lexical entity extraction for Chinese language modeling

A CRF-based Method for Automatic Construction of Chinese Symptom Lexicon

A Local Information Perception Enhancement–Based Method for Chinese NER

Pronounce Differently, Mean Differently: A Multi-Tagging-scheme Learning Method for Chinese NER Integrated with Lexicon and Phonetic Features

Chinese Medical Entity Recognition Model Based on Character and Word Vector Fusion

Dependency syntax guided BERT-BiLSTM-GAM-CRF for Chinese NER

Enhanced Chinese Domain Named Entity Recognition: An Approach with Lexicon Boundary and Frequency Weight Features

Simplify the Usage of Lexicon in Chinese NER

Chinese Named Entity Recognition Augmented with Lexicon Memory

SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER

A Lexicon-Based Graph Neural Network for Chinese NER

Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

Chinese named entity recognition via word boundary based character embedding

Improving Microblog Retrieval with Feedback Entity Model

Using Large Language Model for End-to-End Chinese ASR and NER

A Chinese OCR Spelling Check Approach Based on Statistical Language Models

Improving Chinese Medical Named Entity Recognition Using Glyph and Lexicon

KCB-FLAT: Enhancing Chinese Named Entity Recognition with Syntactic Information and Boundary Smoothing Techniques

Chinese Name Entity Extraction System Based on a Hybrid Model

Joint n-gram Chinese language modeling with an application to Chinese word segmentation

A Chinese named entity recognition model incorporating recurrent cell and information state recursion