Bidirectional Gated Temporal Convolution with Attention for text classification

Jiansi Ren, Wei Wu, Gang Liu, Zhe Chen, Ruoxiang Wang
DOI: https://doi.org/10.1016/j.neucom.2021.05.072
IF: 6
2021-09-01
Neurocomputing
Abstract:<p>In text classification models based on deep learning, feature extraction and feature aggregation are two key steps. As one of the basic feature extraction methods, CNN has certain limitations due to its inability to effectively extract temporal features from text data. Using max-pooling can significantly reduce the amount of calculation while performing feature aggregation, but it will have an adverse effect on the classification results due to the loss of some text features. In this paper, in response to the above two issues, a Bidirectional Gated Temporal Convolutional Attention (BG-TCA) model is proposed. In the feature extraction stage, the BG-TCA model uses the bidirectional TCN to extract the bidirectional temporal features in text data, and a gating mechanism similar to the LSTM is added between the convolution layers. In the feature aggregation stage, the BG-TCA model uses the attention mechanism to replace the max-pooling method, which makes it possible to distinguish the importance of different features while retaining the text features to the maximum. Finally, experimental results on five benchmark datasets show that the classification accuracy of the BG-TCA model has been greatly improved compared to basic models, and is better than several other state-of-the-art models.</p>
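The contrast the abstract draws between max-pooling and attention-based aggregation can be illustrated with a minimal sketch. This is not the BG-TCA implementation: the function names, the dot-product scoring, and the `query` vector (standing in for a learned parameter) are all assumptions made for illustration, and the abstract does not specify the attention form the authors use.

```python
import math

def max_pool(features):
    """Element-wise max over time steps: one value per channel survives."""
    return [max(channel) for channel in zip(*features)]

def attention_pool(features, query):
    """Softmax-weighted sum over time steps.

    `query` is a hypothetical stand-in for a learned attention vector;
    dot-product scoring is an assumption, not the paper's stated method.
    """
    scores = [sum(q * x for q, x in zip(query, step)) for step in features]
    peak = max(scores)  # subtract the max before exp for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_channels = len(features[0])
    pooled = [
        sum(w * step[c] for w, step in zip(weights, features))
        for c in range(n_channels)
    ]
    return pooled, weights

# Toy input: 3 time steps, 2 feature channels (illustrative values only).
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
query = [0.5, 0.5]

pooled_max = max_pool(features)
pooled_att, weights = attention_pool(features, query)
```

Max-pooling keeps a single value per channel and discards the rest, whereas the attention variant lets every time step contribute in proportion to its weight, which is the sense in which features are retained while their importance is distinguished.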
computer science, artificial intelligence
What problem does this paper attempt to address?