FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction

Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong
2023-11-30
Abstract: Click-through rate (CTR) prediction is one of the fundamental tasks for online advertising and recommendation. While multi-layer perceptron (MLP) serves as a core component in many deep CTR prediction models, it has been widely recognized that applying a vanilla MLP network alone is inefficient in learning multiplicative feature interactions. As such, many two-stream interaction models (e.g., DeepFM and DCN) have been proposed by integrating an MLP network with another dedicated network for enhanced CTR prediction. As the MLP stream learns feature interactions implicitly, existing research focuses mainly on enhancing explicit feature interactions in the complementary stream. In contrast, our empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs can even achieve surprisingly good performance, which has never been reported before by existing work. Based on this observation, we further propose feature gating and interaction aggregation layers that can be easily plugged to make an enhanced two-stream MLP model, FinalMLP. In this way, it not only enables differentiated feature inputs but also effectively fuses stream-level interactions across two streams. Our evaluation results on four open benchmark datasets as well as an online A/B test in our industrial system show that FinalMLP achieves better performance than many sophisticated two-stream CTR models. Our source code will be available at MindSpore/models.
Information Retrieval
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses click-through rate (CTR) prediction, focusing on the following aspects:

1. **Validation of Model Effectiveness**:
   - An empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs (DualMLP) can achieve surprisingly good performance, a result the authors say has not been reported by existing work. This runs against the common view that a vanilla MLP is inefficient at learning multiplicative feature interactions and must be paired with a dedicated explicit-interaction network.
2. **Proposing an Enhanced Model FinalMLP**:
   - An enhanced two-stream MLP model, FinalMLP, is proposed. It adds pluggable feature gating and interaction aggregation layers: the gating gives each stream a differentiated feature input, and the aggregation layer fuses stream-level interactions across the two streams.
3. **Experimental Validation**:
   - Offline experiments were conducted on four open benchmark datasets, and an online A/B test was run in the authors' industrial system, to validate the effectiveness of FinalMLP.

### Summary

The paper focuses on improving MLP-based models for CTR prediction in online advertising and recommendation. The authors find that a simple two-stream MLP model (DualMLP) is already a strong baseline, and they enhance it with feature gating and interaction aggregation layers to obtain FinalMLP. Results on the four benchmark datasets and in the online A/B test show that FinalMLP achieves better performance than many sophisticated two-stream CTR models.
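
The two components named above, feature gating and interaction aggregation, can be illustrated with a small forward-pass sketch in NumPy. This is not the authors' implementation: the gate form (a sigmoid scaled by 2), the bilinear form of the aggregation layer, the use of embedding slices as gate inputs, and every name and size below are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (weight, bias) pairs for an MLP; illustrative, untrained."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers):
    """ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feature_gate(e, context, gate_layers):
    """Assumed gate: element-wise weights in (0, 2) computed from `context`
    re-weight the shared embedding vector e, so each stream sees a
    differently weighted input."""
    g = 2.0 * sigmoid(mlp(context, gate_layers))
    return g * e

def bilinear_fusion(o1, o2, w1, w2, W3, b):
    """Assumed aggregation: logit = b + w1.o1 + w2.o2 + o1^T W3 o2.
    The last term models interactions between the two stream outputs."""
    return b + o1 @ w1 + o2 @ w2 + np.einsum("bi,ij,bj->b", o1, W3, o2)

# Hypothetical sizes: 4 samples, 8-dim flattened embeddings, 6-dim stream outputs.
batch, d_emb, d_out = 4, 8, 6
e = rng.normal(size=(batch, d_emb))

# Assumption: each gate is conditioned on a different slice of the embeddings.
ctx1, ctx2 = e[:, :4], e[:, 4:]
gate1, gate2 = init_mlp([4, 16, d_emb]), init_mlp([4, 16, d_emb])
stream1, stream2 = init_mlp([d_emb, 32, d_out]), init_mlp([d_emb, 16, d_out])

h1 = feature_gate(e, ctx1, gate1)   # input to stream 1
h2 = feature_gate(e, ctx2, gate2)   # input to stream 2
o1, o2 = mlp(h1, stream1), mlp(h2, stream2)

w1, w2 = rng.normal(0.0, 0.1, d_out), rng.normal(0.0, 0.1, d_out)
W3 = rng.normal(0.0, 0.1, (d_out, d_out))
p = sigmoid(bilinear_fusion(o1, o2, w1, w2, W3, b=0.0))  # predicted CTR per sample
print(p.shape)
```

Setting `W3` to zero reduces the fusion to a sum of two linear heads, one per stream, which is one way to see what the bilinear term adds over simply summing the stream outputs.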