Abstract: As is well known, average-cost optimality inequalities imply the existence of stationary optimal policies for Markov decision processes with average costs per unit time, and these inequalities hold under broad natural conditions. This paper provides sufficient conditions for the validity of the average-cost optimality equation for an infinite-state problem with weakly continuous transition probabilities and with possibly unbounded one-step costs and noncompact action sets. These conditions also imply the convergence of sequences of discounted relative value functions to average-cost relative value functions and the continuity of average-cost relative value functions. As shown in this paper, the classic periodic-review setup-cost inventory control problem with backorders and convex holding/backlog costs satisfies these conditions. Therefore, for this problem the optimality inequality holds as an equality with a continuous average-cost relative value function. In addition, the K-convexity of discounted relative value functions, together with their convergence to average-cost relative value functions as the discount factor increases to 1, implies the K-convexity of average-cost relative value functions. This implies that average-cost optimal (s, S) policies for the inventory control problem can be derived from the average-cost optimality equation.
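
For orientation, the objects named in the abstract can be written out in their standard textbook forms. This is a sketch in assumed notation, not necessarily the notation of the paper: the state space X, action sets A(x), one-step cost c(x,a), transition probability q(dy|x,a), constant w, relative value function u, and discounted value function v_α are all symbols introduced here for illustration, and the normalization of u_α shown is one common choice.

```latex
% Assumed notation (not taken from the abstract): X, A(x), c, q, w, u, v_\alpha.

% Average-cost optimality inequality (ACOI):
\[
  w + u(x) \;\ge\; \min_{a \in A(x)} \Big[ c(x,a) + \int_{X} u(y)\, q(dy \mid x,a) \Big],
  \qquad x \in X .
\]

% Average-cost optimality equation (ACOE): the same relation with equality.
\[
  w + u(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_{X} u(y)\, q(dy \mid x,a) \Big],
  \qquad x \in X .
\]

% Discounted relative value function for a discount factor \alpha \in (0,1),
% under one common normalization (an assumption here):
\[
  u_\alpha(x) \;=\; v_\alpha(x) - \inf_{z \in X} v_\alpha(z).
\]

% K-convexity of a function f on the real line, for K \ge 0:
\[
  f\big((1-\lambda)x + \lambda y\big) \;\le\; (1-\lambda) f(x) + \lambda f(y) + \lambda K
  \qquad \text{for all } x \le y,\ \lambda \in [0,1].
\]

% (s, S) policy with inventory level x and thresholds s \le S:
% order up to S when x < s, and do not order otherwise
% (some authors use x \le s at the boundary).
\[
  \text{order quantity}(x) \;=\;
  \begin{cases}
    S - x, & x < s,\\
    0,     & x \ge s.
  \end{cases}
\]
```

In these terms, the abstract's claim is that for the inventory model the first relation holds as the second with a continuous u, and that K-convexity of u_α carries over to u in the limit as α increases to 1.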

Another Set of Verifiable Conditions for Average Markov Decision Processes with Borel Spaces

On Average Optimality for Non-Stationary Markov Decision Processes in Borel Spaces

Average Optimality in Markov Decision Processes with Unbounded Rewards

Risk-Sensitive Average Markov Decision Processes in General Spaces

On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs

Risk-sensitive discounted Markov decision processes with unbounded reward functions and Borel spaces

Average-Cost MDPs with Infinite State and Action Sets: New Sufficient Conditions for Optimality Inequalities and Equations

Maximizing the probability of visiting a set infinitely often for a Markov decision process with Borel state and action spaces

Average cost optimal control under weak ergodicity hypotheses: Relative value iterations

Continuous Time Markov Decision Processes with Expected Discounted Total Rewards

Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality

A survey of recent results on continuous-time Markov decision processes

On the optimality equation for average cost Markov decision processes and its validity for inventory control

Continuous Time Markov Decision Processes with Nonuniformly Bounded Transition Rate: Expected Total Rewards

Stationary Almost Markov ε-Equilibria for Discounted Stochastic Games with Borel Spaces and Unbounded Payoffs

The Finiteness of the Reward Function and the Optimal Value Function in Markov Decision Processes

Relative Q-Learning for Average-Reward Markov Decision Processes with Continuous States

Zero-Sum Non-stationary Stochastic Games with the Long-Run Average Criterion

Analysis for Some Properties of Discrete Time Markov Decision Processes

Optimal Stationary Policies for a Class of Countable Markov Control Processes

Finding Optimal Memoryless Policies of POMDPs under the Expected Average Reward Criterion