STRec: Sparse Transformer for Sequential Recommendations.

Chengxi Li, Yejing Wang, Qidong Liu, Xiangyu Zhao, Wanyu Wang, Yiqi Wang, Lixin Zou, Wenqi Fan, Qing Li
DOI: https://doi.org/10.1145/3604915.3608779
2023-01-01
Abstract: With the rapid evolution of transformer architectures, researchers are exploring their application in sequential recommender systems (SRSs) and reporting promising performance on SRS tasks compared with earlier SRS models. However, most existing transformer-based SRS frameworks retain the vanilla attention mechanism, which calculates attention scores between all item-item pairs. In this setting, redundant item interactions can harm model performance and consume considerable computation time and memory. In this paper, we identify the sparse attention phenomenon in transformer-based SRS models and propose Sparse Transformer for sequential Recommendation tasks (STRec) to achieve efficient computation and improved performance. Specifically, we replace self-attention with cross-attention, making the model concentrate on the most relevant item interactions. To determine these necessary interactions, we design a novel sampling strategy that detects relevant items based on temporal information. Extensive experimental results validate the effectiveness of STRec, which achieves state-of-the-art accuracy while reducing inference time by 54% and memory cost by 70%. We also provide extensive additional experiments to further investigate the properties of our framework.
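To make the core idea concrete, the following is a minimal sketch of replacing full self-attention with cross-attention from a sampled subset of positions. It is not the authors' implementation: the abstract does not specify the sampling strategy beyond its use of temporal information, so `sample_recent` is a hypothetical stand-in that simply keeps the most recent positions, and learned query/key/value projections are omitted. All function names here are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends to all keys."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


def sample_recent(seq_len, num_queries):
    """Hypothetical sampler (assumption): keep the most recent positions.

    The paper's sampler is based on temporal information; this recency
    rule is only a placeholder for it.
    """
    return list(range(max(0, seq_len - num_queries), seq_len))


def sparse_layer(items, num_queries):
    """Attend from the sampled positions only, instead of from all positions."""
    idx = sample_recent(len(items), num_queries)
    queries = [items[i] for i in idx]
    # Computes num_queries * len(items) scores rather than len(items) ** 2.
    return idx, cross_attention(queries, items, items)
```

With a sequence of n items and k sampled queries, this sketch computes k * n attention scores per layer rather than n * n, which is the kind of saving the abstract attributes to dropping redundant item interactions.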