Optimal Thresholding Linear Bandit

Eduardo Ochoa Rivera,Ambuj Tewari
2024-02-12
Abstract:We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound for the sample complexity and extend an algorithm designed for Best Arm Identification in the linear case to TBP that is asymptotically optimal.
Machine Learning
What problem does this paper attempt to address?