Stochastic Dynamic Linear Programming: A Sequential Sampling Algorithm for Multistage Stochastic Linear Programming
Harsha Gangammanavar, Suvrajeet Sen
DOI: https://doi.org/10.1137/19M1290735
IF: 2.763
2021-08-25
SIAM Journal on Optimization
Abstract: SIAM Journal on Optimization, Volume 31, Issue 3, Pages 2111-2140, January 2021. Multistage stochastic programming deals with operational and planning problems that involve a sequence of decisions over time while responding to an uncertain future. Algorithms designed to address multistage stochastic linear programming (MSLP) problems often rely upon scenario trees to represent the underlying stochastic process. When this process exhibits stagewise independence, sampling-based techniques, particularly the stochastic dual dynamic programming algorithm, have received wide acceptance. However, these sampling-based methods still operate with a deterministic representation of the problem, which uses the so-called sample average approximation. In this work, we present a sequential sampling approach for MSLP problems that allows the decision process to assimilate newly sampled data recursively. We refer to this method as the stochastic dynamic linear programming (SDLP) algorithm. Since we use sequential sampling, the algorithm does not require an a priori representation of uncertainty, through either a scenario tree or a sample average approximation, both of which require knowledge or estimation of the underlying distribution. This method constitutes a generalization of the stochastic decomposition algorithm for two-stage stochastic linear programming models. The approximations used within SDLP may be viewed either through the lens of proximal methods or via regularization. Furthermore, we introduce the notion of basic feasible policies, which provide a piecewise affine solution discovery scheme that is embedded within the optimization algorithm to identify the incumbent solutions used in the proximal iterations. Finally, we show that the SDLP algorithm provides a sequence of decisions and corresponding value function estimates along a sequence of state trajectories that asymptotically converge to their optimal counterparts, with probability one.
mathematics, applied