Abstract: Text classification assigns predefined labels to text and is a core task in natural language processing. Chinese text classification is particularly challenging because of the language's complex semantics, the difficulty of extracting semantic features, and the interleaved, irregular nature of its lexical features. Traditional methods often fail to model the relationships between words and sentences in Chinese, which limits their ability to capture deep semantic information and leads to poor classification performance. To address these issues, a Chinese text classification method based on utterance (sentence-level) information enhancement and feature fusion is proposed. The method first embeds the text into a unified space, using the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) to obtain word-vector and sentence-vector feature representations. An utterance information enhancement module then performs syntactic enhancement and feature extraction on the sentence-level information in the text. Finally, a feature fusion strategy combines the enhanced sentence-level features with the word-level features extracted by a Bi-GRU (Bidirectional Gated Recurrent Unit) network to produce the classification output. This approach strengthens the feature representation of Chinese text and filters out irrelevant and noisy information. Evaluations on several Chinese datasets show that the proposed method outperforms existing mainstream classification models in classification accuracy and F1 score, supporting its effectiveness and feasibility.
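
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the internals of the enhancement module or the fusion strategy, so the sketch assumes a single tanh projection as a placeholder for the enhancement module and plain concatenation as the fusion step. Random vectors stand in for the BERT word and sentence vectors, the weights are untrained, biases are omitted, and all names (`GRUCell`, `bi_gru`, `classify`, `W_enh`, `W_out`) are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Bias-free GRU cell; each gate acts on the concatenation [x; h]."""

    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        xh = x + h  # list concatenation, not element-wise addition
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]
        x_rh = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, x_rh)]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def bi_gru(word_vecs, fwd, bwd):
    """Run one GRU in each direction and concatenate the final states."""
    h_f = [0.0] * fwd.hid_dim
    for x in word_vecs:
        h_f = fwd.step(x, h_f)
    h_b = [0.0] * bwd.hid_dim
    for x in reversed(word_vecs):
        h_b = bwd.step(x, h_b)
    return h_f + h_b


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(word_vecs, sent_vec, fwd, bwd, W_enh, W_out):
    word_feat = bi_gru(word_vecs, fwd, bwd)
    # ASSUMPTION: placeholder for the utterance information enhancement module.
    sent_feat = [math.tanh(v) for v in matvec(W_enh, sent_vec)]
    # ASSUMPTION: feature fusion by simple concatenation.
    fused = sent_feat + word_feat
    return softmax(matvec(W_out, fused))


# Toy sizes; random vectors stand in for BERT outputs.
rng = random.Random(0)
emb_dim, hid_dim, n_classes, seq_len = 8, 4, 3, 5
word_vecs = [[rng.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(seq_len)]
sent_vec = [rng.uniform(-1, 1) for _ in range(emb_dim)]
fwd = GRUCell(emb_dim, hid_dim, rng)
bwd = GRUCell(emb_dim, hid_dim, rng)
W_enh = rand_matrix(hid_dim, emb_dim, rng)
W_out = rand_matrix(n_classes, hid_dim + 2 * hid_dim, rng)

probs = classify(word_vecs, sent_vec, fwd, bwd, W_enh, W_out)
print(probs)  # one probability per class
```

In practice the stand-in vectors would come from a pre-trained BERT checkpoint and the whole pipeline would be trained end to end in a deep-learning framework; the sketch only shows how the word-level and sentence-level branches meet before the classifier.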
