Practical Lessons from Predicting Clicks on Ads at Facebook

Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, Joaquin Quiñonero Candela
DOI: https://doi.org/10.1145/2648584.2648589
2014-01-01
Abstract: Online advertising allows advertisers to only bid and pay for measurable user responses, such as clicks on ads. As a consequence, click prediction systems are central to most online advertising systems. With over 750 million daily active users and over 1 million active advertisers, predicting clicks on Facebook ads is a challenging machine learning task. In this paper we introduce a model which combines decision trees with logistic regression, outperforming either of these methods on its own by over 3%, an improvement with significant impact on overall system performance. We then explore how a number of fundamental parameters impact the final prediction performance of our system. Not surprisingly, the most important thing is to have the right features: those capturing historical information about the user or ad dominate other types of features. Once we have the right features and the right model (decision trees plus logistic regression), other factors play small roles (though even small improvements are important at scale). Picking the optimal handling for data freshness, learning rate schema and data sampling improves the model slightly, though much less than adding a high-value feature, or picking the right model to begin with.
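The abstract does not spell out how the trees and the logistic regression are combined. One common reading, assumed here, is that each tree acts as a feature transform: an input is routed to one leaf per tree, the leaf indices are one-hot encoded, and a logistic regression is trained on the concatenated encoding. The sketch below illustrates that idea on synthetic data. The two hand-written trees (`tree_a`, `tree_b`), the toy click rule, and all hyperparameters are hypothetical stand-ins; in a real system the trees would be learned from data rather than written by hand.

```python
import math
import random

# Hypothetical hand-written trees. Each maps a feature vector to a leaf index.
# In a real system these would be learned, not hard-coded.
def tree_a(x):
    if x[0] < 0.5:
        return 0
    return 1 if x[1] < 0.5 else 2

def tree_b(x):
    return 0 if x[1] < 0.3 else 1

# (tree function, number of leaves) pairs.
TREES = [(tree_a, 3), (tree_b, 2)]

def leaf_features(x):
    """One-hot encode the leaf each tree routes x to, concatenated across trees."""
    feats = []
    for tree, n_leaves in TREES:
        one_hot = [0.0] * n_leaves
        one_hot[tree(x)] = 1.0
        feats.extend(one_hot)
    return feats

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(rows, labels, lr=0.5, epochs=200):
    """Plain SGD on log loss over the transformed (leaf-indicator) features."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(rows, labels):
            p = sigmoid(b + sum(wi * fi for wi, fi in zip(w, f)))
            g = p - y  # gradient of log loss with respect to the logit
            b -= lr * g
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
    return w, b

def predict(w, b, x):
    """Predicted click probability for a raw feature vector x."""
    f = leaf_features(x)
    return sigmoid(b + sum(wi * fi for wi, fi in zip(w, f)))

# Synthetic data (an assumption for illustration): a "click" happens
# only when both raw features are at least 0.5.
random.seed(0)
xs = [[random.random(), random.random()] for _ in range(200)]
ys = [1 if x[0] >= 0.5 and x[1] >= 0.5 else 0 for x in xs]

w, b = train_logreg([leaf_features(x) for x in xs], ys)
```

Calling `predict(w, b, [0.9, 0.9])` returns the estimated click probability for a new input. The click rule here is an interaction between two features, which a logistic regression on the raw inputs cannot represent exactly, but which becomes linearly separable once the trees have carved the input space into leaves.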