Abstract: The paper studies information-theoretic opacity, an information-flow privacy property, in a setting involving two agents: a planning agent who controls a stochastic system and an observer who partially observes the system states. The goal of the observer is to infer some secret, represented by a random variable, from its partial observations, while the goal of the planning agent is to make the secret maximally opaque to the observer while achieving a satisfactory total return. We model the stochastic system as a Markov decision process and consider two classes of opacity properties: last-state opacity ensures that the observer is uncertain whether the last state is in a specific set, and initial-state opacity ensures that the observer is unsure of the realization of the initial state. As the measure of opacity, we employ the Shannon conditional entropy, which captures the information about the secret revealed by the observations. We then develop primal-dual policy gradient methods for opacity-enforcement planning subject to constraints on total returns. We propose novel algorithms to compute the policy gradient of entropy for each observation, leveraging message passing within hidden Markov models. This gradient computation enables stable and fast convergence. We demonstrate our solution to opacity-enforcement control through a grid world example.
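
To make the opacity measure concrete, the following is a minimal sketch of the Shannon conditional entropy H(Z | Y) of a secret Z given an observation Y, computed from a joint distribution. The function name `conditional_entropy`, the dictionary representation, and the toy numbers are assumptions made for illustration; they are not taken from the paper, and this is not the paper's gradient algorithm.

```python
import math

def conditional_entropy(joint):
    """Return H(Z | Y) in bits for a joint pmf given as {(z, y): probability}."""
    # Marginal P(y), obtained by summing the joint over the secret z.
    p_y = {}
    for (z, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(Z | Y) = -sum_{z,y} P(z, y) * log2 P(z | y), with P(z | y) = P(z, y) / P(y).
    h = 0.0
    for (z, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])
    return h

# Hypothetical toy distributions (illustrative only).
# Here the observation determines the secret, so nothing stays hidden.
revealing = {(0, "a"): 0.5, (1, "b"): 0.5}
# Here the observation is independent of a uniform binary secret.
opaque = {(0, "a"): 0.25, (1, "a"): 0.25, (0, "b"): 0.25, (1, "b"): 0.25}

print(conditional_entropy(revealing))
print(conditional_entropy(opaque))
```

In this toy setting, higher conditional entropy means the observer learns less about the secret, which is the quantity the planning agent tries to increase subject to the return constraint.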

Information Relaxation and Dual Formulation of Controlled Markov Diffusions

On the Primal and Dual Formulations of Traffic Assignment Problems with Perception Stochasticity and Demand Elasticity

Solving the Dual Problems of Dynamic Programs via Regression

Duality Between Large Deviation Control and Risk-Sensitive Control for Markov Decision Processes

Information Relaxation and A Duality-Driven Algorithm for Stochastic Dynamic Programs

Convex duality for stochastic singular control problems

Information-Theoretic Opacity-Enforcement in Markov Decision Processes

Dual Solutions in Convex Stochastic Optimization

Acceptable risks and related decision problems with multiple risk-averse agents

Controlled Diffusions under Full, Partial and Decentralized Information: Existence of Optimal Policies and Discrete-Time Approximations

Convex Q Learning in a Stochastic Environment: Extended Version

On Maximizing Probabilities for Over-Performing a Target for Markov Decision Processes

Relaxed formulation for Controlled Branching Diffusions, Existence of an Optimal Control and HJB Equation

Rectangularity and duality of distributionally robust Markov Decision Processes

On the hierarchical risk-averse control problems for diffusion processes

Strong Duality in Risk-Constrained Nonconvex Functional Programming

Weak Equilibriums for Time-Inconsistent Stopping Control Problems, with Applications to Investment-Withdrawal Decision Model

Relaxed Equilibria for Time-Inconsistent Markov Decision Processes

Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality

Markov Decision Processes with Incomplete Information and Semi-Uniform Feller Transition Probabilities

Dual Representation of Unbounded Dynamic Concave Utilities