Gradient Estimation for Smooth Stopping Criteria

Bernd Heidergott, Yijie Peng
DOI: https://doi.org/10.1017/apr.2022.7
2022-01-01
Abstract: We establish sufficient conditions for differentiability of the expected cost collected over a discrete-time Markov chain until it enters a given set. The parameter with respect to which differentiability is analysed may simultaneously affect the Markov chain and the set defining the stopping criterion. The general statements on differentiability lead to unbiased gradient estimators.
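
For concreteness, the quantity described in the abstract can be written out as below. The notation is an assumption made here for illustration and is not taken from the paper: $X_n^\theta$ denotes the Markov chain, $B_\theta$ the set defining the stopping criterion, $c$ the per-step cost, and $\tau_\theta$ the first entrance time. Whether the cost at the stopping time itself is included is also an assumption of this sketch.

```latex
% Hypothetical notation, not the paper's own:
%   X_n^\theta : discrete-time Markov chain depending on the parameter \theta
%   B_\theta   : stopping set, which may also depend on \theta
%   c          : per-step cost function
\[
  \tau_\theta = \inf\{\, n \ge 0 : X_n^\theta \in B_\theta \,\},
  \qquad
  J(\theta) = \mathbb{E}\!\left[ \sum_{n=0}^{\tau_\theta - 1} c\bigl(X_n^\theta\bigr) \right].
\]
% An estimator D(\theta) is unbiased for the gradient if
\[
  \mathbb{E}\bigl[ D(\theta) \bigr] = \frac{\mathrm{d}}{\mathrm{d}\theta} J(\theta).
\]
```

In this notation, $\theta$ enters $J(\theta)$ both through the law of the chain and through the set $B_\theta$, which is the simultaneous dependence the abstract refers to.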