Abstract: Thompson sampling (TS) has optimal regret and excellent empirical performance in multi-armed bandit problems. Yet, in Bayesian optimization, TS underperforms popular acquisition functions (e.g., EI, UCB). TS samples arms according to the probability that they are optimal. A recent algorithm, P-Star Sampler (PSS), performs such a sampling via Hit-and-Run. We present an improved version, Stagger Thompson Sampler (STS). STS more precisely locates the maximizer than does TS using less computation time. We demonstrate that STS outperforms TS, PSS, and other acquisition methods in numerical experiments of optimizations of several test functions across a broad range of dimension. Additionally, since PSS was originally presented not as a standalone acquisition method but as an input to a batching algorithm called Minimal Terminal Variance (MTV), we also demonstrate that STS matches PSS performance when used as the input to MTV.
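The statement that TS "samples arms according to the probability that they are optimal" can be illustrated in the bandit setting the abstract mentions. The following is a minimal sketch, assuming a Bernoulli bandit with Beta(1, 1) priors; the function name `thompson_bernoulli` and the arm probabilities are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np

def thompson_bernoulli(true_probs, n_rounds=2000, seed=0):
    """Thompson sampling on a Bernoulli bandit with Beta(1, 1) priors (illustrative)."""
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    wins = np.ones(k)     # Beta alpha parameters: prior 1 plus observed successes
    losses = np.ones(k)   # Beta beta parameters: prior 1 plus observed failures
    pulls = np.zeros(k, dtype=int)
    for _ in range(n_rounds):
        # One posterior draw per arm; playing the argmax of the draws selects
        # each arm with the posterior probability that it is the best arm.
        theta = rng.beta(wins, losses)
        arm = int(np.argmax(theta))
        reward = float(rng.random() < true_probs[arm])
        wins[arm] += reward
        losses[arm] += 1.0 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical arm success probabilities; the last arm is the best one.
pulls = thompson_bernoulli([0.2, 0.5, 0.8])
print(pulls)
```

With a finite number of arms this sampling step is exact. In Bayesian optimization the "arms" form a continuum, which is where approximate samplers such as PSS and STS come in.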

### What problem does this paper attempt to solve?

This paper addresses the poor performance of Thompson Sampling (TS) in Bayesian Optimization (BO). TS performs well in multi-armed bandit problems, but in BO it lags behind other popular acquisition functions, such as Expected Improvement (EI) and Upper Confidence Bound (UCB).

#### Main problems

1. **Inefficiency of TS in high-dimensional space**:
   - TS struggles to sample effectively in a high-dimensional continuous space. Because of the "curse of dimensionality", the probability that a randomly selected candidate point falls near the optimal solution decreases exponentially as the dimension increases.
2. **Mismatch between TS and BO requirements**:
   - In Bayesian Optimization, a Gaussian Process (GP) is usually used to model the objective function. TS samples directly from the probability distribution of the optimal solution, but this approach is not effective in practice, especially in high-dimensional cases.
3. **Limitations of existing methods**:
   - Existing improvements such as P-Star Sampler (PSS) perform better than TS, but they have high computational complexity and require parameter tuning.

#### Solutions

To solve the above problems, the paper proposes Stagger Thompson Sampler (STS), an improved Thompson Sampling algorithm, which improves performance in the following ways:

1. **Improved initialization**:
   - STS does not initialize its candidate points randomly. It initializes them at the maximizer of the mean function of the current GP model, that is, \( \tilde{x}^* = \arg\max_x \mu(x) \).
2. **Improved perturbation strategy**:
   - Perturbation distances are drawn from a log-uniform distribution instead of the traditional uniform distribution. This allows the perturbation length to adapt to the characteristics of the objective function.
3. **Efficient sampling strategy**:
   - STS uses a Metropolis-Hastings acceptance/rejection mechanism to generate samples closer to the optimal solution, improving both sampling accuracy and efficiency.
4. **Applicable to high-dimensional problems**:
   - STS performs well on high-dimensional problems without special adjustment of the algorithm.
5. **Combined with MTV for batch optimization**:
   - STS is combined with Minimal Terminal Variance (MTV) to design effective batch experiments, further improving the optimization results.

Through these improvements, STS not only outperforms TS and other traditional methods in single-point sampling, but also performs well in batch optimization.
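The initialization, log-uniform perturbation, and accept/reject steps described above can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: `draw` is a hypothetical stand-in for a posterior sample of the objective (a GP would be used in practice), the acceptance rule is a simple greedy comparison rather than a full Metropolis-Hastings step, and the step-length bounds `1e-6` and `1.0` are assumptions.

```python
import numpy as np

def log_uniform(rng, low=1e-6, high=1.0):
    """Draw a step length whose logarithm is uniform on [log(low), log(high)]."""
    return float(np.exp(rng.uniform(np.log(low), np.log(high))))

def refine(draw, x_start, n_steps=200, seed=0):
    """Greedy accept/reject refinement of one candidate inside the unit cube.

    `draw` is a hypothetical stand-in for a sampled objective: it maps a
    point to a function value. A real sampler would draw from a GP posterior.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x_start, dtype=float)
    for _ in range(n_steps):
        # Random unit direction.
        direction = rng.normal(size=x.shape)
        direction /= np.linalg.norm(direction)
        # Log-uniform lengths mix coarse and very fine moves, so the candidate
        # can keep improving once it is already close to the maximizer.
        step = log_uniform(rng)
        proposal = np.clip(x + step * direction, 0.0, 1.0)
        # Simplified acceptance: keep the proposal only if it scores higher.
        if draw(proposal) > draw(x):
            x = proposal
    return x

# Toy usage: `target` plays the role of the unknown maximizer, and `x0` plays
# the role of the posterior-mean maximizer used for initialization.
target = np.full(5, 0.3)
draw = lambda p: -float(np.sum((p - target) ** 2))
x0 = np.full(5, 0.5)
x = refine(draw, x0)
print(np.linalg.norm(x0 - target), np.linalg.norm(x - target))
```

Because a proposal is accepted only when it scores higher under `draw`, the candidate never moves away from the maximizer of that draw. The log-uniform lengths supply moves at every scale between the two assumed bounds.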

Fast, Precise Thompson Sampling for Bayesian Optimization
