TriMLP: A Foundational MLP-like Architecture for Sequential Recommendation

Yiheng Jiang, Yuanbo Xu, Yongjian Yang, Funing Yang, Pengyang Wang, Chaozhuo Li, Fuzhen Zhuang, Hui Xiong
DOI: https://doi.org/10.1145/3670995
IF: 4.657
2023-01-01
ACM Transactions on Information Systems
Abstract: In this work, we present TriMLP, a foundational MLP-like architecture for sequential recommendation that simultaneously achieves computational efficiency and promising performance. First, we empirically study the incompatibility between existing purely MLP-based models and sequential recommendation: the inherent fully-connected structure endows historical user-item interactions (referred to as tokens) with unrestricted communication and overlooks the essential chronological order in sequences. Then, we propose the MLP-based Triangular Mixer to establish ordered contact among tokens and to draw out the primary sequential modeling capability under the standard auto-regressive training fashion. It contains (i) a global mixing layer that drops the lower-triangle neurons in the MLP to block the anti-chronological connections from future tokens and (ii) a local mixing layer that further disables specific upper-triangle neurons to split the sequence into multiple independent sessions. The mixer serially alternates these two layers to support fine-grained preference modeling, where the global layer focuses on long-range dependencies across the whole sequence and the local layer captures short-term patterns within sessions. Experimental results on 12 datasets of different scales from 4 benchmarks show that TriMLP consistently attains a favorable accuracy/efficiency trade-off on all validated datasets, where the average performance boost against several state-of-the-art baselines reaches up to 14.88%, and the maximum reduction of inference time reaches 23.73%. These properties make TriMLP a strong contender to the well-established RNN-, CNN- and Transformer-based sequential recommenders. Code is available at https://github.com/jiangyiheng1/TriMLP .
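The two mixing layers described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository. It assumes a token-mixing weight of shape n x n, where entry (i, j) connects input token i to output token j, so keeping only the upper triangle (i <= j) blocks connections from future tokens. The function names, the fixed `session_len` parameter, and the omission of any normalization, activation, or residual connection are all assumptions made for illustration.

```python
# Illustrative sketch only: names and conventions are hypothetical,
# not the authors' implementation.

def global_mask(n):
    """Upper-triangular mask: input token i may reach output token j only if i <= j."""
    return [[1.0 if i <= j else 0.0 for j in range(n)] for i in range(n)]


def local_mask(n, session_len):
    """Upper-triangular mask further restricted to tokens in the same session.

    Sessions are assumed to be consecutive, non-overlapping chunks of
    `session_len` tokens (an assumption for this sketch).
    """
    return [
        [1.0 if i <= j and i // session_len == j // session_len else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def mix_tokens(x, weight, mask):
    """Mix tokens along the sequence axis with a masked weight matrix.

    x      : list of n tokens, each a list of d floats
    weight : n x n token-mixing weights
    mask   : n x n matrix of 0.0 / 1.0 entries
    Output token j is sum_i mask[i][j] * weight[i][j] * x[i].
    """
    n, d = len(x), len(x[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(n):
        for i in range(n):
            w = weight[i][j] * mask[i][j]
            if w != 0.0:
                for k in range(d):
                    out[j][k] += w * x[i][k]
    return out


def triangular_mixer(x, w_global, w_local, session_len):
    """Apply the global mixing layer, then the local mixing layer."""
    n = len(x)
    h = mix_tokens(x, w_global, global_mask(n))
    return mix_tokens(h, w_local, local_mask(n, session_len))
```

As a quick check of the masking logic, with all-ones weights and four one-dimensional tokens 1, 2, 3, 4, the global layer alone produces the prefix sums 1, 3, 6, 10, so no output position depends on a later token. The local layer with `session_len=2` then mixes only within the pairs (0, 1) and (2, 3).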