ADA: Adaptive Depth Attention Model for Click-Through Rate Prediction

Shujin Liu, Derong Chen, Jie Shao
DOI: https://doi.org/10.1109/IJCNN52387.2021.9533867
2021-01-01
Abstract: Click-through rate (CTR) prediction plays a critical role in recommender systems, where the task is to forecast the probability that a user clicks on a recommended item. Many models have been proposed in this field, including logistic regression, factorization-machine-based models, and deep-learning-based models. However, many current works compute feature interactions in a simple way, such as an inner product, and pay little attention to the differing importance and computational requirements of different feature interactions. The general idea is that some complex feature interactions might require more computation to produce a final result, while some simple or unimportant feature interactions might require less. In this paper, we propose the adaptive depth attention (ADA) model, a new model that automatically learns high-order feature interactions from raw data. The core of ADA is a multi-head self-attention neural network that learns feature interactions, together with a network depth control module that controls the network depth required for the interaction of different feature fields. We conduct extensive experiments on two real-world datasets, and the results demonstrate the superior predictive performance of ADA over state-of-the-art methods.
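The abstract does not specify how the depth control module works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hard, per-field halting rule loosely in the style of adaptive computation time: each feature field accumulates a halting score after every attention layer and stops being updated once that score crosses a threshold. The names `adaptive_depth_attention`, `halt_w`, `halt_b`, `threshold`, and `max_depth` are hypothetical, and the attention layer omits learned query/key/value projections for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def self_attention_layer(fields, num_heads):
    """One multi-head self-attention pass over the field embeddings.

    Assumption: no learned projections; each head attends over its own
    slice of the embedding, and a residual connection is added.
    """
    dim = len(fields[0])
    assert dim % num_heads == 0, "embedding size must divide evenly into heads"
    head_dim = dim // num_heads
    out = []
    for query in fields:
        attended = []
        for h in range(num_heads):
            sl = slice(h * head_dim, (h + 1) * head_dim)
            scores = [dot(query[sl], key[sl]) / math.sqrt(head_dim) for key in fields]
            weights = softmax(scores)
            for i in range(head_dim):
                attended.append(sum(w * key[sl][i] for w, key in zip(weights, fields)))
        out.append([q + a for q, a in zip(query, attended)])  # residual
    return out


def adaptive_depth_attention(fields, halt_w, halt_b=0.0, num_heads=2,
                             max_depth=4, threshold=0.99):
    """Apply attention layers, letting each field stop at its own depth.

    Hypothetical halting rule: a field stays active until its accumulated
    halting score reaches `threshold` or `max_depth` layers have run.
    Halted fields keep their last state but remain visible as keys/values.
    Returns (final field states, number of layers applied to each field).
    """
    state = [list(f) for f in fields]
    cumulative = [0.0] * len(fields)
    depth = [0] * len(fields)
    for _ in range(max_depth):
        active = [c < threshold for c in cumulative]
        if not any(active):
            break
        updated = self_attention_layer(state, num_heads)
        for i, is_active in enumerate(active):
            if is_active:
                state[i] = updated[i]
                depth[i] += 1
                cumulative[i] += sigmoid(dot(halt_w, state[i]) + halt_b)
    return state, depth


if __name__ == "__main__":
    random.seed(0)
    # Three feature fields with 4-dimensional embeddings (illustrative values).
    fields = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    halt_w = [random.gauss(0, 0.1) for _ in range(4)]
    _, depth = adaptive_depth_attention(fields, halt_w)
    print("layers used per field:", depth)
```

In this sketch, fields whose halting score rises quickly receive fewer attention layers, which mirrors the abstract's intuition that simple or unimportant interactions should need less computation than complex ones. In a trained model the halting parameters would be learned jointly with the rest of the network.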