Dual-axial Self-Attention Network for Text Classification

Zhang Xiaochuan, Qiu Xipeng, Pang Jianmin, Liu Fudong, Li Xingwei
DOI: https://doi.org/10.1007/s11432-019-2744-2
2021-01-01
Science China Information Sciences
Abstract: Text classification is an important task in natural language processing, and numerous studies aim to improve the accuracy and efficiency of text classification models. In this study, we propose an effective and efficient text classification model based solely on self-attention. The recently proposed multi-dimensional self-attention significantly improves the performance of self-attention. However, existing models suffer from two major limitations: (1) previous multi-dimensional self-attention models are quite time-consuming; (2) the dependencies of elements along the feature axis are not taken into account. To overcome these problems, we propose a much more computationally efficient multi-dimensional self-attention model and apply two parallel self-attention modules, called dual-axial self-attention, to capture rich dependencies along the feature axis as well as the text axis. A text classification model is then derived. Experimental results on eight representative datasets show that the proposed text classification model obtains state-of-the-art results and that the proposed self-attention outperforms conventional self-attention models.
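The idea of attending along both axes can be illustrated with a minimal sketch. This is not the paper's model: it assumes plain, parameter-free scaled dot-product attention in place of the proposed multi-dimensional self-attention, and it assumes the two parallel branches are combined by element-wise summation, which the abstract does not specify. The function names `self_attention` and `dual_axial_attention` are hypothetical.

```python
import math


def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(x):
    return [list(col) for col in zip(*x)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def self_attention(x):
    # Assumption: parameter-free scaled dot-product attention over the rows
    # of x, standing in for the paper's multi-dimensional self-attention.
    scale = math.sqrt(len(x[0]))
    scores = matmul(x, transpose(x))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, x)


def dual_axial_attention(x):
    # x is an n-by-d matrix: n tokens (text axis), d features (feature axis).
    text_out = self_attention(x)                        # rows are tokens
    feat_out = transpose(self_attention(transpose(x)))  # rows are features
    # Assumption: the two branches are merged by element-wise summation.
    return [[a + b for a, b in zip(r1, r2)]
            for r1, r2 in zip(text_out, feat_out)]


if __name__ == "__main__":
    tokens = [[0.1, 0.2, 0.3, 0.4],
              [0.5, 0.1, 0.0, 0.2],
              [0.3, 0.3, 0.9, 0.1]]
    out = dual_axial_attention(tokens)
    print(len(out), len(out[0]))  # same n-by-d shape as the input
```

Transposing the input before the second branch makes each feature dimension play the role of a sequence element, so the same attention routine captures dependencies along the feature axis without any extra code.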