MaTrRec: Uniting Mamba and Transformer for Sequential Recommendation

Shun Zhang, Runsen Zhang, Zhirong Yang
2024-07-27
Abstract: Sequential recommendation systems aim to provide personalized recommendations by analyzing dynamic preferences and dependencies within user behavior sequences. Transformer models have recently proven effective at capturing user preferences. However, their quadratic computational complexity limits recommendation performance on long interaction sequences. Mamba, a representative State Space Model (SSM), efficiently captures user preferences in long interaction sequences with linear complexity. We find, however, that Mamba's recommendation effectiveness is limited on short interaction sequences: it fails to recall items of actual interest to users and exacerbates the data sparsity and cold start problems. To address this issue, we propose a new model, MaTrRec, which combines the strengths of Mamba and Transformer. The model leverages Mamba's advantages in handling long-term dependencies and the Transformer's global attention advantages for short-term dependencies, thereby enhancing predictive capability on both long and short interaction sequence datasets while balancing model efficiency. Notably, our model significantly alleviates the data sparsity and cold start problems, with an improvement of up to 33% on the highly sparse Amazon Musical Instruments dataset. We conducted extensive experimental evaluations on five widely used public datasets. The experimental results show that our model outperforms current state-of-the-art sequential recommendation models on all five datasets. The code is available at https://github.com/Unintelligentmumu/MaTrRec.
Information Retrieval
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several key issues in sequential recommendation systems:

1. **Handling Long and Short Interaction Sequences**:
   - **Long Interaction Sequences**: Long interaction sequences provide rich information about a user's historical preferences and help reveal long-term interest patterns and behavior habits. However, processing them requires more computational resources and more complex models to capture the dependencies within the sequence.
   - **Short Interaction Sequences**: Short interaction sequences provide limited information about user interests and preferences, so they cannot depict those interests comprehensively. The model therefore needs to capture a user's current interests efficiently from limited data.
2. **Data Sparsity and Cold Start Problem**:
   - **Data Sparsity**: Short interaction sequences lack sufficient historical data, which makes it difficult for the model to capture user interests accurately and lowers recommendation quality.
   - **Cold Start Problem**: Insufficient historical data for new users or new items leads to poor recommendation performance.
3. **Limitations of Existing Models**:
   - **Transformer Model**: Although the Transformer can effectively capture user preferences, its quadratic computational complexity limits its recommendation performance on long interaction sequences.
   - **Mamba Model**: Mamba performs well on long interaction sequences but has a low recall rate on short ones, which exacerbates the data sparsity and cold start problems.

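
The quadratic-versus-linear contrast above can be made concrete with a rough operation count. The sketch below is only an illustration: the function names, the embedding size `dim=64`, and the state size `state_size=16` are assumptions, not values from the paper, and the counts ignore projections, constant factors, and hardware effects.

```python
def attention_ops(seq_len: int, dim: int) -> int:
    """Approximate multiply-adds for one self-attention layer.

    Counts the score matrix (n * n * d) and the weighted sum over
    values (n * n * d); input/output projections are ignored.
    """
    return 2 * seq_len * seq_len * dim


def linear_scan_ops(seq_len: int, dim: int, state_size: int) -> int:
    """Approximate multiply-adds for a diagonal state-space recurrence.

    Each step touches every channel and every state entry roughly twice
    (state update and readout), so the total grows linearly in seq_len.
    """
    return 2 * seq_len * dim * state_size


if __name__ == "__main__":
    dim, state_size = 64, 16  # hypothetical sizes, chosen for illustration
    for seq_len in (50, 200, 2000):
        attn = attention_ops(seq_len, dim)
        scan = linear_scan_ops(seq_len, dim, state_size)
        print(f"n={seq_len}: attention={attn}, scan={scan}, ratio={attn / scan}")
```

Under these simplified counts the ratio of attention cost to scan cost is `seq_len / state_size`: doubling the sequence length quadruples the attention count but only doubles the scan count. This is consistent with the summary's point that quadratic cost becomes the limiting factor on long sequences and matters much less on short ones.
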
### Solution

To address these issues, the authors propose a new hybrid model, MaTrRec, which combines the advantages of Mamba and Transformer:

- **Mamba Module**: Uses Mamba's strength in handling long interaction sequences to capture long-term dependencies.
- **Transformer Module**: Uses the Transformer's global attention mechanism to handle short-term dependencies in short interaction sequences.

With this combination, MaTrRec performs well on both long and short interaction sequence datasets and significantly alleviates the data sparsity and cold start problems. Experimental results show that MaTrRec outperforms existing state-of-the-art sequential recommendation models on multiple public datasets.
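
To illustrate what combining a linear-time sequence module with global attention can look like, here is a minimal pure-Python sketch. It is not the authors' architecture: this summary does not say how MaTrRec arranges its modules, so the ordering (scan first, attention second, residual connection) is an assumption. The fixed-decay `linear_scan` is only a toy stand-in for Mamba, whose state-space parameters depend on the input, and the attention uses identity projections instead of learned ones. All function names are hypothetical.

```python
import math


def linear_scan(xs, decay=0.9):
    """Toy stand-in for a state-space layer (NOT Mamba itself).

    Runs h_t = decay * h_{t-1} + (1 - decay) * x_t per channel in a
    single pass, so the cost is linear in the sequence length.
    """
    state = [0.0] * len(xs[0])
    outputs = []
    for x in xs:
        state = [decay * h + (1.0 - decay) * v for h, v in zip(state, x)]
        outputs.append(state)
    return outputs


def causal_self_attention(xs):
    """Toy single-head attention with identity projections.

    Position t attends to every position s <= t, so the cost is
    quadratic in the sequence length.
    """
    dim = len(xs[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for t, query in enumerate(xs):
        scores = [scale * sum(q * k for q, k in zip(query, xs[s]))
                  for s in range(t + 1)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(score - top) for score in scores]
        total = sum(weights)
        outputs.append([sum(w * xs[s][j] for s, w in enumerate(weights)) / total
                        for j in range(dim)])
    return outputs


def hybrid_block(xs):
    """Scan, then attention, then a residual add (ordering is an assumption)."""
    attended = causal_self_attention(linear_scan(xs))
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, attended)]


# Three hypothetical 2-dimensional item embeddings for one user.
item_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# The last position's output serves as the user representation for next-item scoring.
user_state = hybrid_block(item_embeddings)[-1]
```

In a real model the scan and attention would be learned layers, and `user_state` would be scored against candidate item embeddings to rank the next item. The sketch only shows how the two kinds of sequence mixing can be chained.
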