Research of Chinese Entity Recognition Model Based on Multi-Feature Semantic Enhancement
Ling Yuan, Chenglong Zeng, Peng Pan
DOI: https://doi.org/10.3390/electronics13244895
IF: 2.9
2024-12-13
Electronics
Abstract: Chinese Entity Recognition (CER) aims to extract key information entities from Chinese text data, supporting downstream natural language processing tasks such as relation extraction, knowledge graph construction, and intelligent question answering. However, CER faces several challenges, including limited training corpora, unclear entity boundaries, and complex entity structures, which keep recognition accuracy low and leave room for improvement. To address high annotation costs and ambiguous entity boundaries, this paper proposes SEMFF-CER, a CER model based on semantic enhancement and multi-feature fusion. The model combines three kinds of features: character features produced by character feature extraction algorithms, vocabulary features produced by SoftLexicon semantic enhancement, and deep semantic features extracted from pre-trained models. These features are integrated into the entity recognition process via gating mechanisms, so that the model can draw on diverse features to enrich contextual semantics and improve recognition accuracy. The model also incorporates several optimization strategies: an adaptive loss function that balances negative samples and improves the F1 score, data augmentation that improves robustness, and dropout regularization together with the Adamax optimizer to refine training. The SEMFF-CER model is characterized by low dependence on training corpora, fast computation, and strong scalability. Experiments on four Chinese benchmark entity recognition datasets validate the proposed model, which achieves the highest F1 score among the compared models.
engineering, electrical & electronic; computer science, information systems; physics, applied
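
The abstract states that character, vocabulary, and pre-trained features are combined through gating mechanisms but does not give the gate equations. The following is only a minimal sketch of one common form of gated fusion, assuming a softmax gate computed per character over the three feature streams; the function name `gated_fusion`, the vector size, and the random weights are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(streams, gate_weights, gate_biases):
    """Fuse K feature vectors (each of length d) for one character.

    Assumed gate form (not from the paper): each stream k gets a scalar
    score w_k . concat(streams) + b_k, the scores are normalized with a
    softmax, and the fused vector is the gate-weighted sum of the streams.
    """
    concat = [x for vec in streams for x in vec]
    scores = [
        sum(w * x for w, x in zip(w_k, concat)) + b_k
        for w_k, b_k in zip(gate_weights, gate_biases)
    ]
    gates = softmax(scores)
    d = len(streams[0])
    fused = [sum(g * vec[i] for g, vec in zip(gates, streams)) for i in range(d)]
    return fused, gates


# Hypothetical toy inputs: three 4-dimensional feature vectors for one character.
random.seed(0)
d, K = 4, 3
char_vec = [random.gauss(0, 1) for _ in range(d)]      # character features
lexicon_vec = [random.gauss(0, 1) for _ in range(d)]   # lexicon (vocabulary) features
plm_vec = [random.gauss(0, 1) for _ in range(d)]       # pre-trained model features
streams = [char_vec, lexicon_vec, plm_vec]

# Untrained placeholder gate parameters; in a real model these would be learned.
gate_weights = [[random.gauss(0, 0.1) for _ in range(K * d)] for _ in range(K)]
gate_biases = [0.0] * K

fused, gates = gated_fusion(streams, gate_weights, gate_biases)
```

Because the gates are non-negative and sum to one, each fused component is a convex combination of the corresponding components of the three streams, so a stream that is uninformative for a given character can be down-weighted without being removed from the model.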