Stochastic Graphical Bandits with Heavy-Tailed Rewards.

Yutian Gou,Jinfeng Yi,Lijun Zhang
2023-01-01
Abstract:We consider stochastic graphical bandits, where after pulling an arm, the decision maker observes re-wards of not only the chosen arm but also its neighbors in a feedback graph. Most of existing work assumes that the rewards are drawn from bounded or at least sub-Gaussian distributions, which however may be violated in many practical scenarios such as social advertising and financial markets. To settle this issue, we investigate stochastic graphical bandits with heavy-tailed rewards, where the distributions have finite moments of order 1 + (cid:15) , for some (cid:15) ∈ (0 , 1] . Firstly, we develop one UCB-type algorithm, whose expected regret is upper bounded by a sum of gap-based quantities over the clique covering of the feedback graph. The key idea is to estimate the reward means of the selected arm’s neighbors by more refined robust estimators, and to construct a graph-based upper confidence bound for selecting candidates. Secondly, we de-sign another elimination-based strategy and improve the regret bound to a gap-based sum with size controlled by the independence number of the feedback graph. For benign graphs, the independence number could be smaller than the size of the clique covering , resulting in tighter regret bounds. Finally, we conduct experiments on synthetic data to demonstrate the effectiveness of our methods.
What problem does this paper attempt to address?