Abstract: We consider the problem of privacy protection in Reinforcement Learning (RL) algorithms that operate over population processes, a practical but understudied setting that includes, for example, the control of epidemics in large populations of dynamically interacting individuals. In this setting, the RL algorithm interacts with the population over $T$ time steps, receiving population-level statistics as its state and performing actions that can affect the entire population at each time step. An individual's data can be collected across multiple interactions, and their privacy must be protected at all times. We clarify the Bayesian semantics of Differential Privacy (DP) in the presence of correlated data in population processes through a Pufferfish Privacy analysis. We then give a meta-algorithm that takes any RL algorithm as input and makes it differentially private. It does so by using DP mechanisms to privatize the state and reward signal at each time step before the RL algorithm receives them as input. Our main theoretical result shows that the value-function approximation error incurred when applying standard RL algorithms directly to the privatized states shrinks quickly as the population size and privacy budget increase. This shows that reasonable privacy-utility trade-offs are possible for differentially private RL algorithms in population processes. Our theoretical findings are validated by experiments on a simulated epidemic control problem with large population sizes.
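To make the privatization step concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the state is a vector of compartment counts (for example susceptible, infected, recovered), that the population size is public, and that the Laplace mechanism is used with an L1 sensitivity of 2 (one individual moving between compartments changes two counts by one each). The function names, the choice of mechanism, the sensitivity values, and the per-step budgets are all illustrative assumptions; how the per-step budgets compose over the $T$ time steps is not addressed here.

```python
import numpy as np


def privatize_counts(counts, epsilon, sensitivity=2.0, rng=None):
    """Add Laplace noise to a vector of population counts.

    Assumption: L1 sensitivity 2, i.e. changing one individual's
    compartment alters two counts by one each.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=counts.shape)
    return counts + noise


def privatize_step(counts, reward, population_size,
                   eps_state, eps_reward, reward_sensitivity, rng=None):
    """Privatize one (state, reward) pair before the RL agent sees it.

    Hypothetical helper: the state is returned as noisy population
    fractions, and the reward gets its own Laplace noise. The value of
    reward_sensitivity depends on the reward definition and is assumed
    to be supplied by the caller.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy_state = privatize_counts(counts, eps_state, rng=rng) / population_size
    noisy_reward = reward + rng.laplace(0.0, reward_sensitivity / eps_reward)
    return noisy_state, float(noisy_reward)


# Example call with made-up numbers for a population of 10,000.
rng = np.random.default_rng(0)
state, reward = privatize_step(
    counts=[9_000, 900, 100],
    reward=-900.0,
    population_size=10_000,
    eps_state=0.5,
    eps_reward=0.5,
    reward_sensitivity=1.0,
    rng=rng,
)
```

In this sketch the noise added to each population fraction has scale sensitivity / (epsilon * population_size), so it shrinks as either the population size or the privacy budget grows. That is only an intuition for the trade-off described in the abstract, not the paper's value-function error bound.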

Differentially Private Deep Model-Based Reinforcement Learning

Differentially Private Reinforcement Learning with Self-Play

Offline Reinforcement Learning with Differential Privacy

Differentially Private Reinforcement Learning with Linear Function Approximation

Preserving Expert-Level Privacy in Offline Reinforcement Learning

Adaptive Control of Differentially Private Linear Quadratic Systems

How Private Is Your RL Policy? An Inverse RL Based Analysis Framework

Locally Private Distributed Reinforcement Learning

Privacy Preserving Reinforcement Learning for Population Processes

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

MOReL: Model-Based Offline Reinforcement Learning

Offline Primal-Dual Reinforcement Learning for Linear MDPs

Preserving the Privacy of Reward Functions in MDPs through Deception

Federated Multi-Objective Reinforcement Learning

Policy-regularized Offline Multi-objective Reinforcement Learning

Scaling Pareto-Efficient Decision Making Via Offline Multi-Objective RL

Deceptive Reinforcement Learning for Privacy-Preserving Planning

No-regret Exploration in Shuffle Private Reinforcement Learning

A Two-Stage Multi-Objective Deep Reinforcement Learning Framework

Differentially Private Reward Functions in Policy Synthesis for Markov Decision Processes

Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures