Abstract: Integrating lexical information into Chinese character embeddings is an effective way to address Chinese named entity recognition (NER). However, most existing methods focus only on discovering named entity boundaries and consider only the words matched by each Chinese character. They ignore both the association between a character and its left- and right-matching words and the local semantic information in the character's neighborhood, which is crucial for Chinese NER. Chinese contains a large number of polysemous words, that is, single words with multiple meanings. Without sufficient contextual information, the intended meaning of a text can therefore be hard to determine, which gives rise to ambiguity. We consider how to handle, more simply and effectively, the entity ambiguity that polysemous words cause in Chinese texts across different contexts. We propose using graph attention networks to model the relations between matching words and neighboring characters, as well as among the matching words themselves, and we add left- and right-matching words so that the semantic information provided by the local lexicon is used directly. Moreover, this paper proposes a short-sequence convolutional neural network (SSCNN), which encodes the shorter subsequences generated by a sliding window module to enhance the perception of local information around each character. Compared with widely used Chinese NER models, our approach achieves improvements of 1.18%, 0.29%, 0.18%, and 1.1% on the four benchmark datasets Weibo, Resume, OntoNotes, and E-commerce, respectively, demonstrating the effectiveness of the model.
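The sliding window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_subsequences`, the window size of 3, the centered window, and the `<pad>` symbol are all assumptions made for illustration, and the abstract does not specify any of them. The sketch shows only how one short subsequence per character might be produced before being passed to a convolutional encoder.

```python
from typing import List


def sliding_subsequences(
    chars: List[str], window: int = 3, pad: str = "<pad>"
) -> List[List[str]]:
    """Return one short subsequence of length `window` centered on each character.

    Hypothetical helper: window size, centering, and padding symbol are
    illustrative assumptions, not details taken from the paper.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Pad both ends so that the first and last characters also get full windows.
    padded = [pad] * half + list(chars) + [pad] * half
    # The window starting at padded index i is centered on chars[i].
    return [padded[i:i + window] for i in range(len(chars))]


sentence = list("南京市长江大桥")
subsequences = sliding_subsequences(sentence, window=3)
for char, sub in zip(sentence, subsequences):
    print(char, sub)
```

Each character is paired with its immediate left and right neighbors, which is one simple way to expose local context to a downstream encoder; a larger odd `window` would widen that context.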

Research on new word identification

New Word Identification in Social Network Text Based on Time Series Information

Chinese Word Segmentation Method Based on Dictionary and Frequency of the Words

Research on algorithm for networks new words identification

New Words Recognition Algorithm and Application Based on Micro-Blog Hot

SVM-based Hybrid Pattern for New Word Discovery

Research on Intelligent Construction of China English Network New Words Database Based on Adjacent Entropy Recognition Algorithm

A study on the classification of stylistic and formal features in English based on corpus data testing

New Word Detection Using BiLSTM+CRF Model with Features

Internet-oriented Chinese New Words Detection

Research on Automatic Recognition of Separable Words in Modern Chinese

New Cyber Word Discovery Using Chinese Word Segmentation

New Word Detection For Sentiment Analysis

Survey on Chinese Word Segmentation

A Local Information Perception Enhancement–Based Method for Chinese NER

Implementing Chinese new word discovery and POS tagging based on support vector machine

Researches on Word Sense Discrimination of Chinese Adjective

Incorporating user behaviors in new word detection

Detecting new Chinese words from massive domain texts with word embedding

A realistic and robust model for Chinese word segmentation

Chinese Word Segmentation Probability Dictionary Training and Enrich Solution