Tobias Sutter, Bart P.G. Van Parys, Daniel Kuhn
Abstract: We propose a statistically optimal approach to construct data-driven decisions for stochastic optimization problems. Fundamentally, a data-driven decision is simply a function that maps the available training data to a feasible action. It can always be expressed as the minimizer of a surrogate optimization model constructed from the data. The quality of a data-driven decision is measured by its out-of-sample risk. An additional quality measure is its out-of-sample disappointment, which we define as the probability that the out-of-sample risk exceeds the optimal value of the surrogate optimization model. An ideal data-driven decision should minimize the out-of-sample risk simultaneously with respect to every conceivable probability measure as the true measure is unknown. Unfortunately, such ideal data-driven decisions are generally unavailable. This prompts us to seek data-driven decisions that minimize the in-sample risk subject to an upper bound on the out-of-sample disappointment. We prove that such Pareto-dominant data-driven decisions exist under conditions that allow for interesting applications: the unknown data-generating probability measure must belong to a parametric ambiguity set, and the corresponding parameters must admit a sufficient statistic that satisfies a large deviation principle. We can further prove that the surrogate optimization model must be a distributionally robust optimization problem constructed from the sufficient statistic and the rate function of its large deviation principle. Hence the optimal method for mapping data to decisions is to solve a distributionally robust optimization model. Maybe surprisingly, this result holds even when the training data is non-i.i.d. Our analysis reveals how the structural properties of the data-generating stochastic process impact the shape of the ambiguity set underlying the optimal distributionally robust model.
What problem does this paper attempt to address?
The paper primarily aims to address a key challenge in data-driven optimization: how to construct an estimator of the optimal solution based on limited training data. Specifically, the authors propose a statistically optimal method to construct data-driven decisions to solve stochastic optimization problems. The core contributions of the paper can be summarized as follows:
1. **Theoretical Framework and Objectives**:
- The paper defines a sufficiently general framework for constructing estimators of the optimal solution to stochastic optimization problems based on limited training data.
- This framework includes a stochastic optimization problem representing the "true situation," a family of probability measures (capturing prior structural knowledge), and a stochastic process generating training samples.
2. **Quality Assessment of Data-Driven Decisions**:
- The quality of data-driven decisions is measured by their "out-of-sample risk," i.e., their performance under the unknown true probability measure.
- The ideal data-driven decision should minimize the out-of-sample risk under all possible probability measures, but this is usually unattainable.
3. **Alternative Quality Metrics**:
- "In-sample risk" and "out-of-sample disappointment" are defined as alternative quality metrics.
- Out-of-sample disappointment is the probability that the out-of-sample risk exceeds the in-sample risk, that is, the optimal value of the surrogate optimization model built from the data. It quantifies how likely the surrogate model is to underestimate the true risk.
4. **Optimization Problem Construction**:
- A multi-objective optimization problem is constructed to find data-driven predictors and prescriptors with optimal in-sample risk while constraining the decay rate of out-of-sample disappointment.
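- It is proven that Pareto-dominant solutions, which minimize all objective functions simultaneously, exist when the unknown data-generating probability measure belongs to a parametric ambiguity set and its parameters admit a sufficient statistic that satisfies a large deviation principle.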
5. **Application of Statistical Principles**:
- The large deviation principle is used to characterize the optimal decisions: under the conditions above, they are obtained by solving a distributionally robust optimization (DRO) problem constructed from the sufficient statistic and the rate function of its large deviation principle.
- It is demonstrated that this conclusion holds even if the original stochastic optimization problem is non-convex or the training data is non-i.i.d.
6. **Theoretical Generalization**:
- Examples are provided to illustrate how these results can be applied to different data-generating stochastic processes, including i.i.d. processes, Markov chains, autoregressive processes, etc.
- Practical guidelines for selecting the best decision model are proposed, applicable to various data-driven decision scenarios.
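To make the DRO construction concrete, the following minimal sketch covers only the simplest setting mentioned above: i.i.d. samples on a finite state space. It assumes that the empirical distribution plays the role of the sufficient statistic and that the rate function is the relative entropy (as given by Sanov's theorem), so the worst-case risk is a supremum over all distributions `p` with `KL(p_hat || p) <= r`. The function names, the radius `r`, and the newsvendor-style loss are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def worst_case_risk(losses, p_hat, r):
    """Return sup of sum_i p[i]*losses[i] over distributions p with KL(p_hat || p) <= r.

    Assumes r > 0 and every p_hat[i] > 0 (each state was observed at least once).
    """
    l_min, l_max = min(losses), max(losses)
    if l_max == l_min:
        return float(l_max)  # constant loss: every distribution gives the same risk

    def kl_at(eta):
        # KL(p_hat || p) for the candidate p[i] proportional to p_hat[i] / (eta - losses[i]),
        # which is the form of the maximizer given by the optimality conditions.
        inv = sum(q / (eta - l) for q, l in zip(p_hat, losses))
        return sum(q * math.log(eta - l) for q, l in zip(p_hat, losses)) + math.log(inv)

    # kl_at decreases from +inf (eta -> l_max) to 0 (eta -> inf): bracket, then bisect.
    lo, hi = l_max, 2 * l_max - l_min
    while kl_at(hi) > r:
        hi = l_max + 2 * (hi - l_max)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval can no longer be split in floating point
        if kl_at(mid) > r:
            lo = mid
        else:
            hi = mid

    weights = [q / (hi - l) for q, l in zip(p_hat, losses)]
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total


def dro_decision(decisions, loss, states, p_hat, r):
    """Pick the decision with the smallest worst-case risk; returns (decision, value)."""
    best = None
    for x in decisions:
        value = worst_case_risk([loss(x, s) for s in states], p_hat, r)
        if best is None or value < best[1]:
            best = (x, value)
    return best


# Hypothetical example: order quantity x, demand d, unit cost 1, unit price 2.
states = [1, 2, 3]
p_hat = [0.2, 0.5, 0.3]  # empirical demand frequencies (made-up numbers)
decisions = [1, 2, 3]
loss = lambda x, d: 1.0 * x - 2.0 * min(x, d)
x_star, c_star = dro_decision(decisions, loss, states, p_hat, r=0.05)
```

In this sketch `c_star` plays the role of the in-sample risk and `x_star` the role of the data-driven decision. A larger `r` corresponds to demanding a faster decay of the out-of-sample disappointment and yields a more conservative risk estimate. For non-i.i.d. data such as Markov chains, the sufficient statistic and rate function would differ, and this code would not apply as written.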
In summary, the paper addresses how to construct high-quality data-driven decisions when the true probability measure is unknown. It introduces in-sample risk and out-of-sample disappointment as quality metrics and shows that, under the stated conditions, solving a distributionally robust optimization model is the optimal way to map data to decisions.