Optimization-Driven Adaptive Experimentation

Ethan Che, Daniel R. Jiang, Hongseok Namkoong, Jimmy Wang
2024-11-08
Abstract: Real-world experiments involve batched & delayed feedback, non-stationarity, multiple objectives & constraints, and (often some) personalization. Tailoring adaptive methods to address these challenges on a per-problem basis is infeasible, and static designs remain the de facto standard. Focusing on short-horizon ($\le 10$) adaptive experiments, we move away from bespoke algorithms and present a mathematical programming formulation that can flexibly incorporate a wide range of objectives, constraints, and statistical procedures. We formulate a dynamic program based on central limit approximations, which enables the use of scalable optimization methods based on auto-differentiation and GPU parallelization. To evaluate our framework, we implement a simple heuristic planning method ("solver") and benchmark it across hundreds of problem instances involving non-stationarity, personalization, and multiple objectives & constraints. Unlike bespoke methods (e.g., Thompson sampling variants), our mathematical programming framework provides consistent gains over static randomized control trials and exhibits robust performance across problem instances.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the limitations of existing adaptive experimentation methods in practical settings. Specifically, it targets four challenges:

1. **Batched and Delayed Feedback**: Real-world experiments usually deliver feedback in batches and with delay rather than after each observation. Because of infrastructure limitations, an experiment typically consists of a small number of large batches, leaving few opportunities to update the sampling policy.
2. **Non-stationarity**: The experimental environment changes over time. For example, customer behavior may differ between weekdays and weekends, so the objective function and constraints can shift during the experiment.
3. **Multiple Objectives and Constraints**: Experiments often need to optimize several objectives at once while satisfying constraints such as budget limits or ethical requirements. These objectives and constraints may compete with one another, which complicates the optimization.
4. **Personalization**: Some experiments must tailor treatments to individual characteristics, which further complicates the design.

To address these challenges, the paper proposes a framework based on mathematical programming. Using central limit approximations and automatic differentiation, it turns the adaptive experimentation problem into a tractable optimization problem. The framework flexibly accommodates multiple objectives, constraints, and statistical procedures for short-horizon experiments (T ≤ 10), and it designs adaptive experiments efficiently through scalable optimization methods.
Specifically, the main contributions of the paper include:

- Proposing a mathematical programming view of adaptive experimentation that can flexibly incorporate a wide range of objectives and constraints.
- Simplifying the dynamic program by using central limit approximations to model sufficient statistics as Gaussian.
- Designing Residual Horizon Optimization (RHO), an algorithm based on model predictive control (MPC), for short-horizon adaptive experimental design.
- Verifying the effectiveness and robustness of the proposed method across a large number of problem instances, with particularly good performance under non-stationarity and with multiple objectives.

Through these contributions, the paper provides a general and efficient approach to the challenges of real-world adaptive experiments.
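To make the Gaussian approximation of sufficient statistics more concrete, here is a minimal Python sketch of a single batched posterior update. It is an illustration under stated assumptions, not the paper's implementation: the function name `gaussian_batch_update`, the independent Gaussian prior on each arm's mean reward, and the known per-arm noise variance are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def gaussian_batch_update(
    prior_mean: List[float],
    prior_var: List[float],
    batch_mean: List[float],
    noise_var: List[float],
    batch_size: int,
    allocation: List[float],
) -> Tuple[List[float], List[float]]:
    """Conjugate Gaussian update for one batch (hypothetical helper).

    Assumption: under a central limit approximation, the sample mean of
    arm k over batch_size * allocation[k] observations is treated as
    Normal(theta_k, noise_var[k] / (batch_size * allocation[k])).
    """
    post_mean, post_var = [], []
    for m, v, y, s2, p in zip(prior_mean, prior_var, batch_mean, noise_var, allocation):
        n_k = batch_size * p               # expected number of samples for this arm
        data_precision = n_k / s2          # precision of the batch sample mean
        precision = 1.0 / v + data_precision
        post_mean.append((m / v + y * data_precision) / precision)
        post_var.append(1.0 / precision)
    return post_mean, post_var


# Usage: two arms, one batch of 100 units, 80% of which go to arm 0.
mean, var = gaussian_batch_update(
    prior_mean=[0.0, 0.0],
    prior_var=[1.0, 1.0],
    batch_mean=[0.2, 0.5],
    noise_var=[1.0, 1.0],
    batch_size=100,
    allocation=[0.8, 0.2],
)
```

In this simplified model the posterior variance depends on the allocation but not on the observed outcomes, and an arm that receives no samples keeps its prior unchanged. Because the update is a smooth function of the allocation, a model of this kind lends itself to gradient-based planning over allocations.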