Per-Round Knapsack-Constrained Linear Submodular Bandits

Baosheng Yu,Meng Fang,Dacheng Tao
DOI: https://doi.org/10.1162/neco_a_00887
IF: 3.278
2016-01-01
Neural Computation
Abstract:Linear submodular bandits has been proven to be effective in solving the diversification and feature-based exploration problem in information retrieval systems. Considering there is inevitably a budget constraint in many web-based applications, such as news article recommendations and online advertising, we study the problem of diversification under a budget constraint in a bandit setting. We first introduce a budget constraint to each exploration step of linear submodular bandits as a new problem, which we call per-round knapsack-constrained linear submodular bandits. We then define an approximation unit-cost regret considering that the submodular function maximization is NP-hard. To solve this new problem, we propose two greedy algorithms based on a modified UCB rule. We prove these two algorithms with different regret bounds and computational complexities. Inspired by the lazy evaluation process in submodular function maximization, we also prove that a modified lazy evaluation process can be used to accelerate our algorithms without losing their theoretical guarantee. We conduct a number of experiments, and the experimental results confirm our theoretical analyses.
What problem does this paper attempt to address?