HiCAN: Hierarchical Convolutional Attention Network for Sequence Modeling.

Yi Cao,Weifeng Zhang,Bo Song,Congfu Xu
DOI: https://doi.org/10.1145/3357384.3357996
2019-01-01
Abstract:Convolutional neural networks (CNN) are widely used on sequential data since it can capture local context dependencies and temporal order information inside sequences. Attention (ATT) mechanisms have also attracted enormous interests due to its capability of capturing the important parts of a sequence. These two neural networks can extract different features from sequences. In order to combine the advantages of CNN and ATT, we propose a convolutional attention network (CAN), which merges the structure of CNN and ATT into a single neural network and can serve as a new basic module in complex neural networks. Based on CAN, we then build a sequence encoding model with hierarchical structure, "hierarchical convolutional attention network (HiCAN)", to tackle sequence modeling problems. It can explicitly capture both the local and global context dependencies and temporal order information in sequences. Extensive experiments conducted on session-based recommendation (Recommender Systems) demonstrate that HiCAN is able to outperform state-of-the-art methods and show higher computational efficiency. Furthermore, we conduct extended experiments on text classification (Natural Language Processing). The results show that our model can also achieve competitive performance on NLP tasks.
What problem does this paper attempt to address?