Abstract: Integrating lexical information into Chinese character embeddings is an effective way to address Chinese named entity recognition (NER). However, most existing methods focus only on discovering named entity boundaries and consider only the words matched by each Chinese character. They ignore both the association between a character and its left- and right-matching words and the local semantic information in the character’s neighborhood, which is crucial for Chinese NER. Chinese contains a large number of polysemous words, that is, single words with multiple meanings. Without sufficient contextual information, the intended meaning of a text can therefore be hard to determine, which gives rise to ambiguity. We consider how to handle, more simply and effectively, the entity ambiguity that polysemous words cause in Chinese texts across different contexts. We propose using graph attention networks to construct relations between matching words and neighboring characters as well as among matching words, and adding left- and right-matching words directly so as to use the semantic information provided by the local lexicon. Moreover, we propose a short-sequence convolutional neural network (SSCNN). It uses the shorter subsequences produced and encoded by the sliding-window module to enhance the perception of local information around each character. Compared with widely used Chinese NER models, our approach achieves improvements of 1.18%, 0.29%, 0.18%, and 1.1% on the four benchmark datasets Weibo, Resume, OntoNotes, and E-commerce, respectively, demonstrating the effectiveness of the model.
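To make the sliding-window idea concrete, the following is a minimal sketch and not the SSCNN architecture itself, whose details the abstract does not give. It assumes that each character receives one short subsequence centered on it, and that a small convolution kernel is slid over that subsequence and max-pooled. The function names `sliding_windows` and `encode_window`, the padding token, and the window width are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def sliding_windows(chars: Sequence[str], window: int = 3,
                    pad: str = "<pad>") -> List[List[str]]:
    """Return one short subsequence per character, centered on that character.

    Assumption: the window width is odd and the sequence is padded
    symmetrically, so every character has a full-width neighborhood.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = [pad] * half + list(chars) + [pad] * half
    return [padded[i:i + window] for i in range(len(chars))]


def encode_window(vectors: List[List[float]],
                  kernel: List[List[float]]) -> float:
    """Slide a kernel over one window of character vectors and max-pool.

    `vectors` holds one embedding per position in the window; `kernel`
    holds one weight vector per kernel position (a single output channel).
    """
    width = len(kernel)
    if width > len(vectors):
        raise ValueError("kernel is wider than the window")
    responses = []
    for start in range(len(vectors) - width + 1):
        total = 0.0
        for offset in range(width):
            total += sum(v * w for v, w in
                         zip(vectors[start + offset], kernel[offset]))
        responses.append(total)
    return max(responses)


if __name__ == "__main__":
    # Illustrative sentence only; it is not taken from the benchmark datasets.
    for chars in sliding_windows(list("南京市长江大桥"), window=3):
        print(chars)
```

Keeping the windows short restricts each character's encoding to its immediate neighbors, which is the local perception the abstract describes. A practical model would use learned embeddings and multiple kernels rather than the toy scalar output shown here.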

VCWE: Visual Character-Enhanced Word Embeddings

Enhanced Double-Carrier Word Embedding Via Phonetics and Writing

Joint Learning of Character and Word Embeddings

Glyph-aware Embedding of Chinese Characters

Visualizing and Understanding Neural Models in NLP

Component-Enhanced Chinese Character Embeddings

cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information

Learning Context-Specific Word/Character Embeddings

Improved Learning of Chinese Word Embeddings with Semantic Knowledge

Combination Methods of Chinese Character and Word Embeddings in Deep Learning

Hierarchical Joint Learning for Chinese Word Embeddings

Learning Chinese Word Embeddings from Stroke, Structure and Pinyin of Characters

Learning Chinese Word Representations From Glyphs Of Characters

Multiple Character Embeddings for Chinese Word Segmentation

A Deep Convolutional Neural Model for Character-Based Chinese Word Segmentation

A Local Information Perception Enhancement–Based Method for Chinese NER

Word-Context Character Embeddings for Chinese Word Segmentation

Learning Chinese-Japanese Bilingual Word Embedding by Using Common Characters

Context-Specific and Multi-Prototype Character Representations

Enhancing Chinese Intent Classification by Dynamically Integrating Character Features into Word Embeddings with Ensemble Techniques

Radical and Stroke-Enhanced Chinese Word Embeddings Based on Neural Networks