Abstract: Text classification assigns predefined labels to text and is a core research task in natural language processing. Chinese text classification is particularly challenging because of the language's complex semantics, the difficulty of extracting semantic features, and the interleaved, irregular nature of its lexical features. Traditional methods often fail to model the relationships between words and sentences in Chinese, which limits the model's ability to capture deep semantic information and leads to poor classification performance. To address these issues, a Chinese text classification method based on sentence information enhancement and feature fusion is proposed. The method first embeds the text into a unified space and uses the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) to obtain word-vector and sentence-vector feature representations. A sentence information enhancement module then performs syntactic enhancement and feature extraction on the sentence-level information in the text. Finally, a feature fusion strategy combines the enhanced sentence-level features with the word-level features extracted by a Bi-GRU (bidirectional gated recurrent unit) network to produce the classification output. This approach strengthens the feature representation of Chinese text and filters out irrelevant and noisy information. Evaluations on several Chinese datasets show that the proposed method surpasses existing mainstream classification models in classification accuracy and F1 score, validating its effectiveness and feasibility.
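
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation: the abstract does not specify the internals of the sentence (utterance) information enhancement module or the fusion strategy, so the gated projection and the concatenation used here are assumptions, as are the class name `FusionClassifier` and all layer sizes. Random tensors stand in for the BERT word-level and sentence-level outputs so that the example runs without downloading a model.

```python
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    """Illustrative sketch: sentence-level enhancement + Bi-GRU word features + fusion."""

    def __init__(self, hidden_size=768, gru_hidden=128, num_classes=10):
        super().__init__()
        # Hypothetical stand-in for the sentence information enhancement module:
        # a gated projection of the sentence vector (assumption, not from the paper).
        self.sent_proj = nn.Linear(hidden_size, 2 * gru_hidden)
        self.sent_gate = nn.Linear(hidden_size, 2 * gru_hidden)
        # Bi-GRU over the word-level (token) vectors.
        self.bigru = nn.GRU(hidden_size, gru_hidden,
                            batch_first=True, bidirectional=True)
        # Fusion by concatenation (assumption), followed by a linear classifier.
        self.classifier = nn.Linear(4 * gru_hidden, num_classes)

    def forward(self, word_vecs, sent_vec):
        # word_vecs: (batch, seq_len, hidden_size); sent_vec: (batch, hidden_size)
        gate = torch.sigmoid(self.sent_gate(sent_vec))
        sent_feat = gate * torch.tanh(self.sent_proj(sent_vec))
        # h_n has shape (2, batch, gru_hidden): final forward and backward states.
        _, h_n = self.bigru(word_vecs)
        word_feat = torch.cat([h_n[0], h_n[1]], dim=-1)
        # Concatenate sentence-level and word-level features, then classify.
        fused = torch.cat([sent_feat, word_feat], dim=-1)
        return self.classifier(fused)


torch.manual_seed(0)
batch, seq_len, hidden = 4, 16, 768
word_vecs = torch.randn(batch, seq_len, hidden)  # stand-in for BERT token outputs
sent_vec = torch.randn(batch, hidden)            # stand-in for BERT sentence vector
model = FusionClassifier(hidden_size=hidden, num_classes=10)
logits = model(word_vecs, sent_vec)
print(logits.shape)
```

In a real setting, `word_vecs` and `sent_vec` would come from a pre-trained Chinese BERT encoder rather than random tensors, and the enhancement and fusion steps would follow the design given in the full paper.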

Chinese text classification method based on sentence information enhancement and feature fusion
