Chinese Text Classification Method Based on BERT Word Embedding

Ziniu Wang, Zhilin Huang, Jianling Gao
DOI: https://doi.org/10.1145/3395260.3395273
2020-04-10
Abstract: In this paper, we enhance the semantic representation of words with the BERT pre-trained language model, which dynamically generates a semantic vector for each character according to its context. The resulting character vectors are then fed into a CapsNet as a character-level word vector sequence. We build a BiGRU module into the capsule network for text feature extraction and introduce an attention mechanism to focus on key information. We use the corpus of Baidu's Chinese question-and-answer data set, taking only the question types as classification samples for our experiments. For comparison, we also evaluate the BERT network and the CapsNet separately. The experimental results show that the combined model performs better than either model used alone.
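
The abstract names an attention step over the BiGRU outputs and a capsule network, but does not give their exact form. The following is a minimal, dependency-free sketch of what those two steps might look like, assuming dot-product attention against a learned context vector and the standard capsule "squash" non-linearity. The function names, the toy hidden states, and the context vector are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Score each time step by its dot product with `context` (an assumed
    attention form), then return the weights and the weighted sum."""
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


def squash(vector, eps=1e-9):
    """Capsule squash: keeps the direction and maps the length into [0, 1)."""
    sq_norm = sum(v * v for v in vector)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * v for v in vector]


# Toy stand-ins for BiGRU outputs over three characters (hypothetical values).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context = [1.0, 1.0]  # hypothetical learned context vector

weights, pooled = attention_pool(hidden_states, context)
capsule_input = squash(pooled)
```

In this toy run the second character receives the largest attention weight because its hidden state aligns most closely with the context vector. In a real model the hidden states would come from a BiGRU run over BERT character embeddings, and the context vector would be learned during training.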