A Pareto Dominance Principle for Data-Driven Optimization

Tobias Sutter, Bart P.G. Van Parys, Daniel Kuhn
2023-12-15
Abstract: We propose a statistically optimal approach to construct data-driven decisions for stochastic optimization problems. Fundamentally, a data-driven decision is simply a function that maps the available training data to a feasible action. It can always be expressed as the minimizer of a surrogate optimization model constructed from the data. The quality of a data-driven decision is measured by its out-of-sample risk. An additional quality measure is its out-of-sample disappointment, which we define as the probability that the out-of-sample risk exceeds the optimal value of the surrogate optimization model. An ideal data-driven decision should minimize the out-of-sample risk simultaneously with respect to every conceivable probability measure, as the true measure is unknown. Unfortunately, such ideal data-driven decisions are generally unavailable. This prompts us to seek data-driven decisions that minimize the in-sample risk subject to an upper bound on the out-of-sample disappointment. We prove that such Pareto-dominant data-driven decisions exist under conditions that allow for interesting applications: the unknown data-generating probability measure must belong to a parametric ambiguity set, and the corresponding parameters must admit a sufficient statistic that satisfies a large deviation principle. We can further prove that the surrogate optimization model must be a distributionally robust optimization problem constructed from the sufficient statistic and the rate function of its large deviation principle. Hence the optimal method for mapping data to decisions is to solve a distributionally robust optimization model. Maybe surprisingly, this result holds even when the training data is non-i.i.d. Our analysis reveals how the structural properties of the data-generating stochastic process impact the shape of the ambiguity set underlying the optimal distributionally robust model.
Optimization and Control
What problem does this paper attempt to address?
The paper primarily aims to address a key challenge in data-driven optimization: how to construct an estimator of the optimal solution based on limited training data. Specifically, the authors propose a statistically optimal method to construct data-driven decisions to solve stochastic optimization problems. The core contributions of the paper can be summarized as follows:

1. **Theoretical Framework and Objectives**:
   - The paper defines a sufficiently general framework for constructing estimators of the optimal solution to stochastic optimization problems based on limited training data.
   - This framework includes a stochastic optimization problem representing the "true situation," a family of probability measures (capturing prior structural knowledge), and a stochastic process generating training samples.
2. **Quality Assessment of Data-Driven Decisions**:
   - The quality of data-driven decisions is measured by their "out-of-sample risk," i.e., their performance under the unknown true probability measure.
   - The ideal data-driven decision should minimize the out-of-sample risk under all possible probability measures, but this is usually unattainable.
3. **Alternative Quality Metrics**:
   - "In-sample risk" and "out-of-sample disappointment" are defined as alternative quality metrics.
   - Out-of-sample disappointment refers to the probability that the out-of-sample risk exceeds the in-sample risk, reflecting the likelihood of underestimating the true risk.
4. **Optimization Problem Construction**:
   - A multi-objective optimization problem is constructed to find data-driven predictors and prescriptors with optimal in-sample risk while constraining the decay rate of out-of-sample disappointment.
   - It is proven that under certain conditions, there exist Pareto dominant solutions that can simultaneously minimize all objective functions.
5. **Application of Statistical Principles**:
   - The large deviation principle is utilized to simplify the optimization problem, and it is shown that under specific conditions, the optimal data-driven decisions can be obtained by solving a distributionally robust optimization (DRO) problem.
   - It is demonstrated that this conclusion holds even if the original stochastic optimization problem is non-convex or the training data is non-i.i.d.
6. **Theoretical Generalization**:
   - Examples are provided to illustrate how these results can be applied to different data-generating stochastic processes, including i.i.d. processes, Markov chains, autoregressive processes, etc.
   - Practical guidelines for selecting the best decision model are proposed, applicable to various data-driven decision scenarios.

In summary, the main objective of the paper is to overcome the challenges faced in data-driven decision-making, particularly how to construct high-quality decisions when the true probability measure is unknown. By introducing new quality metrics and optimization problems, the paper provides theoretical foundations and practical methods to address these issues.
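
To make the DRO construction concrete, here is a minimal sketch for the simplest setting only: i.i.d. samples on a finite state space. In that case the empirical distribution is a sufficient statistic, and by Sanov's theorem its large deviation rate function is the relative entropy, so the ambiguity set is assumed to be {p : KL(p_hat || p) <= r}. The worst-case risk over this set is computed through its one-dimensional convex dual, min over alpha >= max loss of alpha - exp(-r) * prod_i (alpha - loss_i)^p_hat_i. The function names (`worst_case_risk`, `dro_decision`), the newsvendor loss, the sample data, and the radius `r = 0.1` are all hypothetical choices for illustration and are not taken from the paper.

```python
import math


def worst_case_risk(p_hat, losses, r):
    """Sup of E_p[loss] over all p with KL(p_hat || p) <= r (finite states).

    Assumes r >= 0 and that p_hat is a probability vector.
    """
    empirical = sum(w * l for w, l in zip(p_hat, losses))
    l_max, l_min = max(losses), min(losses)
    if r <= 0.0 or l_max == l_min:
        # Radius zero (or constant loss): only the empirical risk remains.
        return empirical

    def dual(alpha):
        # alpha - exp(-r) * weighted geometric mean of (alpha - loss_i).
        geo = 1.0
        for w, l in zip(p_hat, losses):
            geo *= (alpha - l) ** w
        return alpha - math.exp(-r) * geo

    # The dual is convex in alpha and nondecreasing beyond this upper bound.
    lo = l_max
    hi = l_max + (l_max - l_min) / math.expm1(r)
    for _ in range(200):  # ternary search on a convex function
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(m1) <= dual(m2):
            hi = m2
        else:
            lo = m1
    return dual(0.5 * (lo + hi))


def dro_decision(actions, states, loss, p_hat, r):
    """Pick the action with the smallest worst-case risk (finite action set)."""
    return min(
        actions,
        key=lambda x: worst_case_risk(p_hat, [loss(x, s) for s in states], r),
    )


# Hypothetical newsvendor example: demand takes values in {0, 1, 2, 3}.
samples = [1, 2, 2, 3, 1, 2, 0, 2]
states = [0, 1, 2, 3]
p_hat = [samples.count(s) / len(samples) for s in states]


def newsvendor_loss(order, demand):
    # Illustrative unit cost 1.0 and unit price 2.5 (assumed values).
    return 1.0 * order - 2.5 * min(order, demand)


x_dro = dro_decision(states, states, newsvendor_loss, p_hat, r=0.1)
print("DRO order quantity:", x_dro)
```

Increasing `r` enlarges the ambiguity set, which raises the worst-case (in-sample) risk estimate in exchange for a faster decay of the out-of-sample disappointment; setting `r = 0` recovers the plain empirical risk. For non-i.i.d. data such as Markov chains, the sufficient statistic and rate function differ, and this sketch does not cover those cases.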