Abstract: This paper studies constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the rewards and costs may be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted reward, and the constraints are imposed on the expected discounted costs. First, we give conditions for the nonexplosion of the underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality problem for continuous-time MDPs can be transformed into an equivalent (optimality) problem over a class of probability measures. Based on the equivalent problem and a so-called $\bar{w}$-weak convergence of probability measures developed in this paper, we show the existence of a constrained optimal policy. Third, by providing a linear programming formulation of the equivalent problem, we show the solvability of constrained optimal policies. Finally, we use two computable examples to illustrate our main results.
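
To make the occupation-measure and linear-programming steps concrete, the following is a minimal sketch for the special case of finite state and action sets; it is not the paper's general Polish-space formulation. All symbols are assumptions introduced here for illustration and do not appear in the abstract: a discount rate $\alpha > 0$, an initial distribution $\gamma$, conservative transition rates $q(y \mid x,a)$, a reward rate $r$, cost rates $c_n$ with constraint levels $d_n$, and the normalized occupation measure $\eta$ of a policy $\pi$. Under these assumptions the constrained problem can be written as a linear program in $\eta$:

$$
\begin{aligned}
\eta^{\pi}(x,a) &:= \alpha \int_0^{\infty} e^{-\alpha t}\, P^{\pi}_{\gamma}(x_t = x,\ a_t = a)\, dt, \\
\text{maximize}\quad & \frac{1}{\alpha} \sum_{x,a} r(x,a)\, \eta(x,a) \\
\text{subject to}\quad & \frac{1}{\alpha} \sum_{x,a} c_n(x,a)\, \eta(x,a) \le d_n, \qquad n = 1, \dots, N, \\
& \alpha \sum_{a} \eta(y,a) = \alpha\, \gamma(y) + \sum_{x,a} q(y \mid x,a)\, \eta(x,a) \quad \text{for every state } y, \\
& \eta(x,a) \ge 0 .
\end{aligned}
$$

In this sketch the balance equation comes from multiplying the Kolmogorov forward equation by $\alpha e^{-\alpha t}$ and integrating over $t$. Summing it over $y$ and using $\sum_y q(y \mid x,a) = 0$ gives $\sum_{x,a} \eta(x,a) = 1$, so every feasible $\eta$ is a probability measure, which matches the "class of probability measures" mentioned in the abstract.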

Convergence of Markov Decision Processes with Constraints and State-Action Dependent Discount Factors

Convergence of Controlled Models and Finite-State Approximation for Discounted Continuous-Time Markov Decision Processes with Constraints

Constrained Markov Decision Processes with First Passage Criteria

Constrained Total Undiscounted Continuous-Time Markov Decision Processes

Discounted Continuous-Time Markov Decision Processes with Constraints: Unbounded Transition and Loss Rates

Constrained Continuous-Time Markov Control Processes with Discounted Criteria

Denumerable Continuous-Time Markov Decision Processes with Multiconstraints on Average Costs

Markov Decision Processes with State-Dependent Discount Factors and Unbounded Rewards/Costs

Constrained Continuous-Time Markov Decision Processes with Average Criteria

Multiconstrained Finite-Horizon Piecewise Deterministic Markov Decision Processes with Unbounded Transition Rates

Discounted Continuous-Time Constrained Markov Decision Processes in Polish Spaces

Denumerable-state Continuous-Time Markov Decision Processes with Unbounded Transition and Reward Rates under the Discounted Criterion

Constrained Markov Decision Processes with Non-constant Discount Factor

Risk-Sensitive Average Continuous-Time Markov Decision Processes with Unbounded Transition and Cost Rates

Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

The Risk Probability Criterion for Discounted Continuous-Time Markov Decision Processes

Risk-Sensitive Discounted Continuous-Time Markov Decision Processes with Unbounded Rates

First Passage Markov Decision Processes with Constraints and Varying Discount Factors

Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality

Unbounded Cost Markov Decision Processes with Limsup and Liminf Average Criteria: New Conditions

Markov Decision Problems with Unbounded Transition Rates under Discounted-Cost Performance Criteria