Optimizing Enhanced Cost Per Click Via Reinforcement Learning Without Exploration

Sinan Li, Chun Yuan, Xin Zhu
DOI: https://doi.org/10.1109/ijcnn52387.2021.9534126
2021-01-01
Abstract: Real-Time Bidding (RTB) is an important mechanism in online search advertising that involves interaction among the platform, advertisers, and customers, where a proper bid for each page view plays a crucial role in achieving good marketing results. Enhanced Cost Per Click (ECPC) bidding is one of the typical bidding strategies in the RTB mechanism, in which advertisers raise bid prices for traffic with high conversion rates under the premise of controlling their total cost across different time periods of the day. The bid ratio in ECPC serves as a correction coefficient that controls the advertising cost of advertisers. However, the optimal bid ratio is hard to derive due to the complexity and volatility of the auction environment. To address these challenges, we formulate hourly-aggregated ECPC bidding as a Markov Decision Process (MDP) with novel state and reward functions designed according to business needs. Moreover, unlike prior work, which has to construct an auction-model-based environment to solve the MDP, we propose a novel framework based on model-free reinforcement learning that does not rely on the agent's exploration of the environment. We further design an adaptive action-selection strategy that adjusts the bidding behavior dynamically and improves the consumption qualified rate of categories/groups. Experimental results on a real dataset demonstrate the effectiveness of our framework in terms of consistency, stability, and online effect.
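To make the idea of "reinforcement learning without exploration" concrete, the following is a minimal sketch, not the authors' method: the abstract does not name the algorithm, so a tabular Q-learning stand-in is used here that learns only from a fixed log of hourly transitions and backs up values only through actions that actually appear in that log. The state encoding (hour, spend bucket), the discrete bid ratios, the reward values, and the function names `batch_q_learning` and `greedy_ratio` are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def batch_q_learning(transitions, gamma=0.9, lr=0.5, sweeps=200):
    """Tabular Q-learning over a fixed log; the environment is never queried.

    transitions: list of (state, action, reward, next_state, done) tuples.
    States and actions are hypothetical: e.g. state = (hour, spend_bucket),
    action = a discrete bid ratio.
    """
    # Record which actions the log actually contains for each state.
    seen = defaultdict(set)
    for s, a, _, _, _ in transitions:
        seen[s].add(a)

    q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s2, done in transitions:
            if done or not seen[s2]:
                # Terminal step, or a next state the log never acted in.
                target = r
            else:
                # Back up only through logged actions, so the value estimate
                # never relies on state-action pairs without data.
                target = r + gamma * max(q[(s2, a2)] for a2 in seen[s2])
            q[(s, a)] += lr * (target - q[(s, a)])
    return q, seen


def greedy_ratio(q, seen, state, default=1.0):
    """Pick the best logged bid ratio for a state; fall back to a default."""
    candidates = seen.get(state)
    if not candidates:
        return default
    return max(candidates, key=lambda a: q[(state, a)])


# Hypothetical two-hour log: reward is positive when spend stays on target.
log = [
    ((0, "under"), 1.2, 1.0, (1, "ok"), False),
    ((0, "under"), 0.8, -1.0, (1, "under"), False),
    ((1, "ok"), 1.0, 1.0, None, True),
    ((1, "under"), 1.2, 0.0, None, True),
]
q, seen = batch_q_learning(log)
best = greedy_ratio(q, seen, (0, "under"))
```

Restricting the backup to logged actions is one simple way to avoid overestimating the value of bid ratios that were never tried; a production system would replace the table with a function approximator and a reward that reflects the actual business constraints.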