Real-time bidding with multi-agent reinforcement learning in multi-channel display advertising
Chen Chen, Gao Wang, Baoyu Liu, Siyao Song, Keming Mao, Shiyu Yu, Jingyu Liu
DOI: https://doi.org/10.1007/s00521-024-10649-6
2024-11-25
Neural Computing and Applications
Abstract: Real-time bidding is the main way of displaying advertisements in the current e-commerce market. To maximize the revenue and return on investment (ROI) that advertising brings, a platform must not only meet each brand's need for a precisely targeted audience but also adapt to a complex and volatile e-commerce environment. However, existing research still focuses on single-sequence real-time auctions and pays little attention to the new business setting of simultaneous multi-channel auctions. In this paper, we use multi-agent reinforcement learning to build an intelligent bidding optimization model, and we design and implement the multi-channel multi-agent bidding (MCMAB) algorithm, which is based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm. First, to fit this business setting, we design a specific loss function and use a reward model pretrained with preference model pretraining (PMP) to capture user preferences. Second, to resolve the confusion of PV data dimensions in a multi-channel environment, we propose a new data preprocessing scheme that not only solves this problem but also improves the convergence rate of the algorithm. Third, because our offline bidding model requires a large amount of offline data for training and testing, we propose a new simulation data generation algorithm and revise the structure of MCMAB, drawing on ideas from the TD3 algorithm, so that it can adapt to the offline environment. Finally, experiments on different data sets verify the effectiveness of the method.
computer science, artificial intelligence
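The abstract says MCMAB borrows ideas from TD3 (twin delayed deep deterministic policy gradient) but does not say which ones. As a hedged illustration only, the sketch below shows two ideas from standard TD3, target policy smoothing and the clipped double-Q target, written for scalar actions and values. The function names, default hyperparameters, and action bounds are assumptions for this example and are not taken from the paper; TD3's third idea, delayed actor updates, is not shown.

```python
import random


def smoothed_target_action(mu_next, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: perturb the target actor's action with
    clipped Gaussian noise, then clip to the valid action range.
    All defaults here are illustrative assumptions."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, mu_next + noise))


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates to limit value overestimation."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)


if __name__ == "__main__":
    # Hypothetical numbers: a target-actor output of 0.3, and two
    # target-critic estimates for the smoothed next action.
    a_next = smoothed_target_action(0.3)
    y = td3_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0, gamma=0.5)
    print(a_next, y)
```

In a multi-agent setting such as MADDPG, each agent's critics would typically take the joint observations and actions of all agents as input; how MCMAB actually combines the two approaches is described in the paper itself, not here.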