Global-Local Mutual Attention Model for Text Classification

Qianli Ma, Liuhong Yu, Shuai Tian, Enhuan Chen, Wing W. Y. Ng
DOI: https://doi.org/10.1109/taslp.2019.2942160
2019-01-01
IEEE/ACM Transactions on Audio Speech and Language Processing
Abstract: Text classification is a central task in natural language processing (NLP). Although some models learn local semantic features and global long-term dependencies simultaneously, they simply concatenate the two, either in cascade or in parallel, and ignore the mutual effects between them. In this paper, we propose the Global-Local Mutual Attention (GLMA) model for text classification, which introduces a mutual attention mechanism so that local semantic features and global long-term dependencies can learn from each other. The mutual attention mechanism consists of a Local-Guided Global-Attention (LGGA) and a Global-Guided Local-Attention (GGLA). The LGGA assigns weights to, and combines, the global long-term dependencies of word positions that are semantically related. It thereby captures combined semantics and alleviates the vanishing gradient problem. The GGLA automatically assigns larger weights to relevant local semantic features, which captures key local semantic information and filters out both noise and irrelevant words/phrases. Furthermore, a weighted-over-time pooling operation is developed to aggregate the most informative and discriminative features for classification. Extensive experiments demonstrate that our model achieves state-of-the-art performance on seven benchmark datasets and sixteen Amazon product review datasets. Both the result analysis and the visualization of the mutual attention weights further demonstrate the effectiveness of the proposed model.
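The general idea of two feature streams attending to each other can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not specify the scoring function, so dot-product scoring with a softmax is assumed here, and the names `attend` and `mutual_attention` are hypothetical. The local features stand in for, say, convolutional outputs and the global features for recurrent hidden states, one vector per word position.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys):
    """Return (weights, weighted sum of keys).

    Assumption: dot-product scores; the abstract does not name a scorer.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context


def mutual_attention(local_feats, global_feats):
    """Hypothetical mutual attention between two per-position feature lists."""
    # Local-guided global attention: each local feature queries all global states.
    lgga = [attend(l, global_feats)[1] for l in local_feats]
    # Global-guided local attention: each global state queries all local features.
    ggla = [attend(g, local_feats)[1] for g in global_feats]
    return lgga, ggla


if __name__ == "__main__":
    local_feats = [[1.0, 0.0], [0.0, 1.0]]
    global_feats = [[2.0, 0.0], [0.0, 2.0]]
    lgga, ggla = mutual_attention(local_feats, global_feats)
    print(lgga)
    print(ggla)
```

A weighted pooling over time could reuse `attend` with a learned query vector in place of a per-position query, again as an assumption rather than the operation defined in the paper.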