Abstract: Active learning parallelization is widely used, but typically relies on fixing the batch size throughout experimentation. This fixed approach is inefficient because of a dynamic trade-off between cost and speed -- larger batches are more costly, smaller batches lead to slower wall-clock run-times -- and the trade-off may change over the run (larger batches are often preferable earlier). To address this trade-off, we propose a novel Probabilistic Numerics framework that adaptively changes batch sizes. By framing batch selection as a quadrature task, our integration-error-aware algorithm facilitates the automatic tuning of batch sizes to meet predefined quadrature precision objectives, akin to how typical optimizers terminate based on convergence thresholds. This approach obviates the necessity for exhaustive searches across all potential batch sizes. We also extend this to scenarios with constrained active learning and constrained optimization, interpreting constraint violations as reductions in the precision requirement, to subsequently adapt batch construction. Through extensive experiments, we demonstrate that our approach significantly enhances learning efficiency and flexibility in diverse Bayesian batch active learning and Bayesian optimization applications.

What problem does this paper attempt to address?

This paper addresses batch selection in Active Learning (AL): specifically, how to adjust the batch size dynamically during an experiment to balance the trade-off between cost and speed. Traditional methods fix the batch size throughout the experiment, which is inefficient because a larger batch is more expensive while a smaller batch leads to a slower wall-clock run time. Moreover, this trade-off may change as the experiment progresses (for example, larger batches are often preferable in the early stages).

To address this, the paper proposes a new framework based on Probabilistic Numerics (PN) that adaptively changes the batch size. By treating batch selection as a quadrature task, the algorithm automatically tunes the batch size to meet a predefined quadrature precision target, much as a typical optimizer terminates on a convergence threshold. This avoids an exhaustive search over all potential batch sizes. The paper also extends the method to constrained active learning and constrained optimization, interpreting constraint violations as a reduction in the precision requirement and adapting batch construction accordingly. Through extensive experiments, the paper reports improved learning efficiency and flexibility across a range of Bayesian batch active learning and Bayesian optimization applications.

In summary, the main contributions of the paper are as follows:

1. **Adaptive batch size**: By recasting batch construction as Kernel Quadrature (KQ) and fixing the quadrature precision, the batch size adapts to changes in the acquisition function.
2. **Handling unknown constraints**: Batch active learning under unknown constraints is reinterpreted as a change in the precision requirement, so that batch size and locations adapt to the risk of constraint violation.
3. **Generality**: The adaptive batch construction scheme applies to Active Learning (AL), Bayesian Optimization (BO), and Bayesian Quadrature (BQ), including non-continuous domains such as combinatorial domains and mixed feature spaces.
4. **Significant improvement**: The method performs well on batch AL and batch BO tasks, outperforming 17 baseline methods across 6 synthetic tasks and 7 real-world tasks.
5. **Open source**: The accompanying software has been open-sourced on GitHub.

Together, these contributions improve the efficiency and flexibility of active learning and optimization tasks.
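To make the idea of a precision-driven batch size concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the target measure is represented by a finite set of candidate points `X` with weights `w` (which could, for instance, be proportional to an acquisition function), uses an RBF kernel, and swaps in a simple greedy selection with Bayesian quadrature weights in place of whatever kernel quadrature rule the authors use. The names `rbf_kernel`, `adaptive_batch`, `tol`, and `max_batch` are hypothetical. Points are added until the worst-case kernel quadrature error falls below `tol`, so the batch size is an output of the procedure rather than an input.

```python
import numpy as np


def rbf_kernel(X, Y, lengthscale=0.5):
    """Squared-exponential kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


def adaptive_batch(X, w, tol, max_batch=50, lengthscale=0.5, jitter=1e-8):
    """Greedily grow a batch until the kernel quadrature error is <= tol.

    Illustrative assumption: with optimal (Bayesian quadrature) weights on a
    subset S, the squared worst-case error is  w^T K w - z_S^T K_SS^{-1} z_S,
    where z = K w is the kernel mean evaluated at the candidates.
    """
    K = rbf_kernel(X, X, lengthscale)
    z = K @ w                      # kernel mean at every candidate
    total = float(w @ z)           # w^T K w: squared error of the empty batch
    selected = []
    err = np.sqrt(max(total, 0.0))
    while err > tol and len(selected) < max_batch:
        best_idx, best_err2 = None, np.inf
        for i in range(len(X)):
            if i in selected:
                continue
            S = selected + [i]
            K_SS = K[np.ix_(S, S)] + jitter * np.eye(len(S))
            v = np.linalg.solve(K_SS, z[S])   # quadrature weights on S
            err2 = total - float(z[S] @ v)
            if err2 < best_err2:
                best_idx, best_err2 = i, err2
        selected.append(best_idx)
        err = np.sqrt(max(best_err2, 0.0))
    return selected, err


# Usage: the same candidates, two precision targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 2))          # hypothetical candidate pool
w = np.full(len(X), 1.0 / len(X))      # uniform weights for simplicity

loose_batch, loose_err = adaptive_batch(X, w, tol=0.1)
tight_batch, tight_err = adaptive_batch(X, w, tol=0.01)
print(len(loose_batch), len(tight_batch))
```

Because the greedy order does not depend on `tol`, a tighter tolerance only extends the batch chosen under a looser one. In this sketch, the constrained setting described above would amount to loosening `tol` when the risk of constraint violation is high; how the paper actually maps that risk to a precision requirement is not specified here.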

Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach
