Causality-based CTR Prediction using Graph Neural Networks

Panyu Zhai, Yanwu Yang, Chunjie Zhang
DOI: https://doi.org/10.1016/j.ipm.2022.103137
2023-01-30
Abstract: As a prevalent problem in online advertising, CTR prediction has attracted considerable attention from both academia and industry. Recent studies have established CTR prediction models in the graph neural networks (GNNs) framework. However, most GNN-based models handle feature interactions in a complete graph while ignoring causal relationships among features, which results in a large drop in performance on out-of-distribution data. This paper develops a causality-based CTR prediction model in the GNNs framework (Causal-GNN) that integrates representations of the feature graph, user graph and ad graph in the context of online advertising. In the model, a structured representation learning method (GraphFwFM) is designed to capture high-order representations on the feature graph based on causal discovery among field features in gated graph neural networks (GGNNs), and GraphSAGE is employed to obtain graph representations of users and ads. Experiments conducted on three public datasets demonstrate the superiority of Causal-GNN in AUC and Logloss and the effectiveness of GraphFwFM in capturing high-order representations on the causal feature graph.
Information Retrieval, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses a weakness of existing graph neural network (GNN) models for click-through rate (CTR) prediction in online advertising: they ignore causal relationships among features when modeling feature interactions, which leads to a significant decline in performance on out-of-distribution data. Specifically, most existing GNN-based models aggregate information from all neighbors in the same way over a complete graph. This tends to cause over-smoothing and fails to account for the internal relationships among features, which hurts the model's generalization ability and robustness.

To solve these problems, the paper proposes a causality-based CTR prediction model (Causal-GNN) that integrates representations of the feature graph, user graph, and ad graph within the GNNs framework. A structured representation learning method (GraphFwFM) captures high-order representations based on causal discovery among field features in gated graph neural networks (GGNNs), and GraphSAGE is used to obtain graph representations of users and ads. A multi-head attention mechanism then fuses the graph representations of features, users, and ads, and a neural attention-aware predictor estimates the click probability from the attention-weighted representation.

The main contributions of this research are:
1. Proposing a causality-based CTR prediction model that combines multiple graph-enabled feature representations within the GNNs framework.
2. Designing a structured representation learning method (GraphFwFM), based on causal discovery, to capture causal feature representations in GNNs.
3. Demonstrating, through experiments on three public datasets, the superior performance of Causal-GNN in the CTR prediction task and the effectiveness of GraphFwFM in modeling high-order representations on causal feature graphs.
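
The contrast between propagating over a complete feature graph and over a causal feature graph can be illustrated with a small sketch. The code below is not the paper's GraphFwFM. It is a generic GGNN-style gated update written in plain Python, and everything specific in it is an assumption made for illustration: the field names (`hour`, `device`, `ad_category`), the directed edges between them, the embedding size, and the randomly initialized weights. It shows only that, with a directed causal graph, a field's state is updated from its causal parents rather than from every other field.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rng, d):
    """Random d x d weight matrix; stands in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(d)]


def ggnn_step(h, parents, params):
    """One GGNN-style propagation step.

    h:       dict mapping node -> state vector (list of floats)
    parents: dict mapping node -> nodes with a directed edge into it
    params:  dict of weight matrices (hypothetical names)
    """
    d = len(next(iter(h.values())))
    new_h = {}
    for v, h_v in h.items():
        # Message: sum of linearly transformed parent states.
        m = [0.0] * d
        for u in parents.get(v, []):
            m = [a + b for a, b in zip(m, matvec(params["W_msg"], h[u]))]
        # GRU-style update and reset gates.
        z = [sigmoid(a + b) for a, b in
             zip(matvec(params["W_z"], m), matvec(params["U_z"], h_v))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(params["W_r"], m), matvec(params["U_r"], h_v))]
        gated = [ri * hi for ri, hi in zip(r, h_v)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(params["W_h"], m), matvec(params["U_h"], gated))]
        # Interpolate between the old state and the candidate state.
        new_h[v] = [(1 - zi) * hi + zi * ci
                    for zi, hi, ci in zip(z, h_v, cand)]
    return new_h


rng = random.Random(0)
DIM = 4  # assumed embedding size
params = {name: rand_matrix(rng, DIM)
          for name in ("W_msg", "W_z", "U_z", "W_r", "U_r", "W_h", "U_h")}

# Hypothetical field features and initial embeddings.
fields = ["hour", "device", "ad_category"]
h0 = {f: [rng.uniform(-1, 1) for _ in range(DIM)] for f in fields}

# Hypothetical causal graph: hour -> device -> ad_category.
causal_parents = {"hour": [], "device": ["hour"], "ad_category": ["device"]}
# Complete graph: every field receives messages from every other field.
complete_parents = {f: [g for g in fields if g != f] for f in fields}

h_causal = ggnn_step(h0, causal_parents, params)
h_complete = ggnn_step(h0, complete_parents, params)
```

In this sketch, changing the embedding of `hour` leaves the one-step update of `ad_category` untouched on the causal graph, because `hour` is not one of its parents, whereas on the complete graph every field's update shifts. In a real model the edges would come from a causal discovery procedure and the weights would be learned end to end.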